Tutorial 4: TLS Heterogeneity in Clear-Cell Renal Cell Carcinoma
Tertiary lymphoid structures (TLS) are tumor-infiltrating lymphoid aggregates whose composition varies between patients and is associated with prognosis. This tutorial uses QueST to (1) train a niche embedding on 15 ccRCC Visium slices spanning 28 annotated TLS, (2) extract per-TLS embeddings, and (3) reveal three coherent TLS subtypes via UMAP.
[ ]:
import os
os.environ['CUDA_VISIBLE_DEVICES'] = '0'
import sys
import logging
import warnings
import numpy as np
import scanpy as sc
import matplotlib.pyplot as plt
[ ]:
import quest.utils as utils
from quest.trainer import QueSTTrainer
[ ]:
warnings.filterwarnings('ignore')
logger = logging.getLogger(__name__)
logging.basicConfig(level=logging.WARNING)
1. Load data
We use 15 of the 18 ccRCC slices; three frozen slices (frozen_c_5 / c_2 / c_23) are excluded because of their lower quality. The 15 retained slices contain 28 annotated TLS, labeled in obs['tls_group']; spots outside any TLS are labeled 'NO_TLS'.
[ ]:
dataset = 'ccRCC'
data_path = '../data/ccRCC'
model_path = '../results/ccRCC/model/quest_model.pth'
embedding_folder = '../results/ccRCC/embedding'
sample_ids = [
    'ffpe_c_2', 'ffpe_c_3', 'ffpe_c_4', 'ffpe_c_7', 'ffpe_c_20', 'ffpe_c_34',
    'ffpe_c_36', 'ffpe_c_39', 'ffpe_c_45', 'ffpe_c_51', 'frozen_a_3', 'frozen_a_15',
    'frozen_b_1', 'frozen_b_18', 'frozen_c_57',
]
adata_list = [sc.read_h5ad(f'{data_path}/{sid}.h5ad') for sid in sample_ids]
print(f'Loaded {len(adata_list)} slices, {sum(a.n_obs for a in adata_list)} spots total')
Loaded 15 slices, 50462 spots total
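Before training, it can help to check how the annotated TLS are distributed across slices. The sketch below uses toy label lists in place of the real obs['tls_group'] columns; the slice names, labels, and the helper tls_spot_counts are made up for illustration. On the real data, calling value_counts() on adata.obs['tls_group'] should give comparable per-label counts.

```python
from collections import Counter

# Toy stand-in for adata.obs['tls_group'] in each slice (hypothetical labels).
toy_labels = {
    'slice_1': ['NO_TLS', 'TLS_1', 'TLS_1', 'NO_TLS', 'TLS_2'],
    'slice_2': ['NO_TLS', 'NO_TLS', 'TLS_3', 'TLS_3', 'TLS_3'],
}

def tls_spot_counts(labels, background='NO_TLS'):
    """Count spots per TLS label, ignoring the background label."""
    return Counter(lab for lab in labels if lab != background)

per_slice = {sid: tls_spot_counts(labs) for sid, labs in toy_labels.items()}
n_tls_total = sum(len(counts) for counts in per_slice.values())

for sid, counts in per_slice.items():
    print(sid, dict(counts))
print('distinct TLS across slices:', n_tls_total)
```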
[ ]:
color_list = ['#210c52', '#ff7f0e', '#00bfc4', '#ebec23', '#d62728']
fig, axes = plt.subplots(nrows=3, ncols=5, figsize=(18, 11))
axes = axes.flatten()
for i, (sid, adata) in enumerate(zip(sample_ids, adata_list)):
    n_grp = len(np.unique(adata.obs['tls_group']))
    sc.pl.spatial(adata, color='tls_group', palette=color_list[:n_grp],
                  ax=axes[i], show=False, title=sid)
    axes[i].set_xlabel('')
    axes[i].set_ylabel('')
fig.tight_layout()
plt.show()
2. Train QueST
[ ]:
trainer = QueSTTrainer(
    dataset=dataset, data_path=data_path,
    sample_ids=sample_ids, adata_list=adata_list,
    query_niches=None, query_sample_id=None,
    model_path=model_path,
    embedding_folder=embedding_folder,
    epochs=15, save_model=True,
    hvg=4000, min_count=None, normalize=True,
    seed=2024,
)
We also provide pretrained QueST checkpoint weights at https://cloud.tsinghua.edu.cn/d/d649dc24501b4958905e/. To skip training and use the pretrained checkpoint, simply place it at the corresponding model path (model_path above) and comment out the trainer.train() line.
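As an alternative to editing the cell by hand, training can be guarded by a check for an existing checkpoint file. This is a minimal sketch, not part of QueST: the helper name train_if_missing is hypothetical, and it only assumes the trainer exposes a train() method as used in this tutorial.

```python
import os

def train_if_missing(trainer, ckpt_path):
    """Call trainer.train() only if no checkpoint file exists at ckpt_path.

    Returns True when training ran, False when an existing file was found.
    """
    if os.path.isfile(ckpt_path):
        return False
    trainer.train()
    return True
```

With this helper, the next cell could call train_if_missing(trainer, model_path) before trainer.inference(ckpt_path=model_path), so a downloaded checkpoint is used when present.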
[ ]:
trainer.train()
trainer.inference(ckpt_path=model_path)
computing 3-hop subgraph (ffpe_c_2): 100%|██████████| 4510/4510 [00:07<00:00, 593.38it/s]
computing 3-hop subgraph (ffpe_c_3): 100%|██████████| 4755/4755 [00:08<00:00, 590.19it/s]
computing 3-hop subgraph (ffpe_c_4): 100%|██████████| 3829/3829 [00:05<00:00, 653.81it/s]
computing 3-hop subgraph (ffpe_c_7): 100%|██████████| 4975/4975 [00:08<00:00, 608.32it/s]
computing 3-hop subgraph (ffpe_c_20): 100%|██████████| 4948/4948 [00:07<00:00, 674.69it/s]
computing 3-hop subgraph (ffpe_c_34): 100%|██████████| 3585/3585 [00:05<00:00, 671.18it/s]
computing 3-hop subgraph (ffpe_c_36): 100%|██████████| 3206/3206 [00:04<00:00, 672.89it/s]
computing 3-hop subgraph (ffpe_c_39): 100%|██████████| 4940/4940 [00:07<00:00, 626.97it/s]
computing 3-hop subgraph (ffpe_c_45): 100%|██████████| 4562/4562 [00:07<00:00, 627.85it/s]
computing 3-hop subgraph (ffpe_c_51): 100%|██████████| 4359/4359 [00:06<00:00, 665.88it/s]
computing 3-hop subgraph (frozen_a_3): 100%|██████████| 1310/1310 [00:01<00:00, 669.54it/s]
computing 3-hop subgraph (frozen_a_15): 100%|██████████| 1264/1264 [00:01<00:00, 675.65it/s]
computing 3-hop subgraph (frozen_b_1): 100%|██████████| 1949/1949 [00:02<00:00, 673.54it/s]
computing 3-hop subgraph (frozen_b_18): 100%|██████████| 1186/1186 [00:02<00:00, 429.84it/s]
computing 3-hop subgraph (frozen_c_57): 100%|██████████| 1084/1084 [00:01<00:00, 659.40it/s]
training: 100%|██████████| 15/15 [12:14<00:00, 48.96s/epoch]
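The progress bars above report a "3-hop subgraph" computed around each spot. QueST's own implementation is not shown in this tutorial. As a rough illustration of the concept only, a k-hop neighborhood on a spot adjacency graph can be collected with a breadth-first search; the toy graph and the function name k_hop_nodes are hypothetical.

```python
from collections import deque

def k_hop_nodes(adjacency, center, k):
    """Return the set of nodes within k edges of `center` (center included)."""
    seen = {center}
    frontier = deque([(center, 0)])
    while frontier:
        node, dist = frontier.popleft()
        if dist == k:
            continue  # do not expand beyond k hops
        for nbr in adjacency[node]:
            if nbr not in seen:
                seen.add(nbr)
                frontier.append((nbr, dist + 1))
    return seen

# Toy path graph 0-1-2-3-4-5 standing in for a spot neighbor graph.
toy_adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
print(sorted(k_hop_nodes(toy_adj, center=0, k=3)))  # [0, 1, 2, 3]
```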
3. Extract per-TLS embeddings
[ ]:
tls_emb_dict = utils.get_subgraph_embedding(
    trainer.model, trainer.adata_list, trainer.feature_list, trainer.edge_ind_list,
    group_key='tls_group', exclude_values=['NO_TLS'],
)
TLS_GROUPS = {
    'Group A': [f'TLS_{i}' for i in [11, 12, 13, 17, 18, 21, 22]],
    'Group B1': [f'TLS_{i}' for i in [1, 2, 4, 10, 14, 15, 16, 23, 34]],
    'Group B2': [f'TLS_{i}' for i in [3, 5, 6, 7, 8, 9, 19, 20, 24, 25, 26, 27]],
}
tls_adata = utils.wrap_subgraph_dict(tls_emb_dict, TLS_GROUPS)
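The internals of utils.get_subgraph_embedding are not shown here. One plausible reading, stated purely as an assumption, is that spot-level embeddings are pooled over all spots sharing a tls_group label, skipping the excluded values. The sketch below does this with a plain mean on toy vectors; pool_by_group is a hypothetical helper, not part of QueST, and the actual pooling may differ.

```python
from collections import defaultdict

def pool_by_group(embeddings, labels, exclude_values=('NO_TLS',)):
    """Average embedding vectors per label, skipping excluded labels."""
    sums, counts = {}, defaultdict(int)
    for vec, lab in zip(embeddings, labels):
        if lab in exclude_values:
            continue
        if lab not in sums:
            sums[lab] = [0.0] * len(vec)
        sums[lab] = [s + v for s, v in zip(sums[lab], vec)]
        counts[lab] += 1
    return {lab: [s / counts[lab] for s in total] for lab, total in sums.items()}

# Toy spot embeddings (2-D) and their group labels.
toy_emb = [[1.0, 0.0], [3.0, 2.0], [0.0, 4.0], [9.0, 9.0]]
toy_lab = ['TLS_1', 'TLS_1', 'TLS_2', 'NO_TLS']
pooled = pool_by_group(toy_emb, toy_lab)
print(pooled)  # {'TLS_1': [2.0, 1.0], 'TLS_2': [0.0, 4.0]}
```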
4. Visualize the TLS embedding space
Hierarchical clustering of the 28 TLS embeddings yields three coherent subtypes:
Group A (7 TLS) — stromal / fibrotic, associated with poor prognosis;
Group B1 (9 TLS) — B-cell mature, GC-like;
Group B2 (12 TLS) — inflamed / IFN-γ active.
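The group sizes quoted above can be checked against the TLS_GROUPS mapping defined in section 3. The snippet below copies that mapping and confirms that the three groups are disjoint and together cover 28 TLS.

```python
TLS_GROUPS = {
    'Group A': [f'TLS_{i}' for i in [11, 12, 13, 17, 18, 21, 22]],
    'Group B1': [f'TLS_{i}' for i in [1, 2, 4, 10, 14, 15, 16, 23, 34]],
    'Group B2': [f'TLS_{i}' for i in [3, 5, 6, 7, 8, 9, 19, 20, 24, 25, 26, 27]],
}

sizes = {name: len(ids) for name, ids in TLS_GROUPS.items()}
all_ids = [tid for ids in TLS_GROUPS.values() for tid in ids]

print(sizes)                            # {'Group A': 7, 'Group B1': 9, 'Group B2': 12}
print(len(all_ids), len(set(all_ids)))  # 28 28
```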
[ ]:
utils.plot_niche_umap(
    tls_adata, color='group', figsize=(6, 5), size=100,
    palette={'Group A': '#d7263d', 'Group B1': '#39c6d6', 'Group B2': '#23ce6b'},
    frameon=True, edge_color='white', linewidths=0.5,
    hide_ticks=True, hide_spines=True,
    invert_x=True, invert_y=True,
    title='TLS subtype',
)
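The hierarchical clustering mentioned at the start of this section is not shown in the tutorial, and its linkage method and distance metric are not stated. As a toy illustration only, the sketch below runs a small average-linkage agglomerative clustering with Euclidean distance on made-up 2-D points; both choices, and the function name average_linkage, are assumptions. In practice a library routine such as scipy.cluster.hierarchy.linkage would be the usual tool.

```python
import math

def average_linkage(points, n_clusters):
    """Greedy agglomerative clustering with average linkage.

    Returns a sorted list of clusters, each a sorted list of point indices.
    """
    clusters = [[i] for i in range(len(points))]

    def avg_dist(a, b):
        # Mean pairwise Euclidean distance between two clusters.
        total = sum(math.dist(points[i], points[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_clusters:
        # Find the closest pair of clusters and merge them.
        _, i, j = min(
            (avg_dist(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        merged = sorted(clusters[i] + clusters[j])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return sorted(clusters)

# Three well-separated toy pairs standing in for TLS embeddings.
toy_points = [(0.0, 0.0), (0.1, 0.2), (5.0, 5.0), (5.2, 4.9), (10.0, 0.0), (9.8, 0.3)]
print(average_linkage(toy_points, n_clusters=3))  # [[0, 1], [2, 3], [4, 5]]
```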