Tutorial 3: Design a spatial omics experiment for consecutive breast cancer sections
Please download the corresponding demo data from the following link and place it under the demo folder:
Google Drive: https://drive.google.com/drive/folders/1z1nk0sF_e25LKMyHxJVMtROFjuWet2G_?usp=drive_link
Please also download the checkpoint file for the pathology foundation model and place it under the checkpoints folder.
Step 1: Preprocess the H&E image
Make sure the physical size of each pixel is 0.5 micron
[1]:
import sys
sys.path.append('..')
from s2omics.p1_histology_preprocess import histology_preprocess
prefix_list = ['../demo/Tutorial_3_Consecutive_ROI_selection_breast/breast_cancer_g1/',
               '../demo/Tutorial_3_Consecutive_ROI_selection_breast/breast_cancer_g2/',
               '../demo/Tutorial_3_Consecutive_ROI_selection_breast/breast_cancer_g3/']
for prefix in prefix_list:
    histology_preprocess(prefix, show_image=True)
Image loaded from ../demo/Tutorial_3_Consecutive_ROI_selection_breast/breast_cancer_g1/he-raw.jpg
Rescaling image (scale: 0.571)...
282 sec
../demo/Tutorial_3_Consecutive_ROI_selection_breast/breast_cancer_g1/he-scaled.jpg
../demo/Tutorial_3_Consecutive_ROI_selection_breast/breast_cancer_g1/he.jpg
Preprocessed H&E image saved!
Image loaded from ../demo/Tutorial_3_Consecutive_ROI_selection_breast/breast_cancer_g2/he-raw.jpg
Rescaling image (scale: 0.571)...
277 sec
../demo/Tutorial_3_Consecutive_ROI_selection_breast/breast_cancer_g2/he-scaled.jpg
../demo/Tutorial_3_Consecutive_ROI_selection_breast/breast_cancer_g2/he.jpg
Preprocessed H&E image saved!
Image loaded from ../demo/Tutorial_3_Consecutive_ROI_selection_breast/breast_cancer_g3/he-raw.jpg
Rescaling image (scale: 0.571)...
272 sec
../demo/Tutorial_3_Consecutive_ROI_selection_breast/breast_cancer_g3/he-scaled.jpg
../demo/Tutorial_3_Consecutive_ROI_selection_breast/breast_cancer_g3/he.jpg
Preprocessed H&E image saved!
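The scale of 0.571 in the log above is what you would get if the raw image had a pixel size of about 0.2855 micron and was resampled to 0.5 micron per pixel. The sketch below shows only that arithmetic. It assumes the scale is the ratio of raw to target pixel size; the function names, the 0.2855 value, and the image dimensions are hypothetical and are not part of the S2Omics API.

```python
def rescale_factor(source_um_per_px, target_um_per_px=0.5):
    """Factor by which image dimensions are multiplied to reach the target pixel size."""
    if source_um_per_px <= 0 or target_um_per_px <= 0:
        raise ValueError("pixel sizes must be positive")
    return source_um_per_px / target_um_per_px


def rescaled_shape(height_px, width_px, scale):
    """Image shape after resampling, rounded to whole pixels."""
    return round(height_px * scale), round(width_px * scale)


# Hypothetical raw pixel size (micron per pixel), chosen to reproduce the logged scale.
scale = rescale_factor(0.2855)
print(round(scale, 3))
# Hypothetical raw image dimensions, for illustration only.
print(rescaled_shape(10000, 20000, scale))
```

A scale below 1 means the raw image has finer pixels than 0.5 micron and is shrunk; a scale above 1 would mean it is enlarged.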
Step 2: Quality control for all superpixels
Superpixels are square pseudo-cells of 8 microns × 8 microns
We use our new QC package, HistoSweep, for this step
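At 0.5 micron per pixel, an 8 micron superpixel spans 16 × 16 pixels. A minimal sketch of the resulting grid size follows. The image dimensions are hypothetical and the helper is not an S2Omics function; it only illustrates the unit conversion.

```python
def superpixel_grid(height_px, width_px, um_per_px=0.5, superpixel_um=8.0):
    """Return (rows, cols) of the superpixel grid covering an image.

    Assumes the image height and width are whole multiples of the
    superpixel side; any remainder is ignored by the floor division.
    """
    side_px = round(superpixel_um / um_per_px)  # 8 / 0.5 = 16 pixels
    return height_px // side_px, width_px // side_px


# Hypothetical image size in pixels (height, width).
rows, cols = superpixel_grid(16384, 18944)
print(rows, cols, rows * cols)
```

With these hypothetical dimensions the grid holds 1,212,416 superpixels, the same as the array length (1212416,) printed for the first section in the QC log below. That suggests, but does not confirm, that the first he.jpg has this size.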
[2]:
from s2omics.p2_superpixel_quality_control import superpixel_quality_control
save_folder_list = ['../demo/Tutorial_3_Consecutive_ROI_selection_breast/breast_cancer_g1/S2Omics_output',
                    '../demo/Tutorial_3_Consecutive_ROI_selection_breast/breast_cancer_g2/S2Omics_output',
                    '../demo/Tutorial_3_Consecutive_ROI_selection_breast/breast_cancer_g3/S2Omics_output']
for (prefix, save_folder) in zip(prefix_list, save_folder_list):
    superpixel_quality_control(prefix, save_folder, show_image=True)
Image loaded from ../demo/Tutorial_3_Consecutive_ROI_selection_breast/breast_cancer_g1/he.jpg
0 0
../demo/Tutorial_3_Consecutive_ROI_selection_breast/breast_cancer_g1/S2Omics_output/pickle_files/shapes.pickle
[compute_metrics_memory_optimized] Current memory: 0.0452 GB; Peak memory: 1.0486 GB
[compute_low_density_mask] Current memory: 0.0012 GB; Peak memory: 0.0714 GB
Total selected for density filtering: 116905
✅ Entropy map saved as 'glcm_entropy_map_colored.png'
✅ Energy map saved as 'glcm_energy_map_colored.png'
✅ Homogeneity map saved as 'glcm_homogeneity_map_colored.png'
=== GLCM Metric Means ===
homogeneity energy entropy
0 0.811977 0.364315 0.393492
1 0.433210 0.087103 0.793905
2 0.585925 0.166482 0.641603
3 0.325897 0.040935 0.879235
=== Cluster Scores ===
Cluster 0: Score = 0.7828
Cluster 1: Score = -0.2736
Cluster 2: Score = 0.1108
Cluster 3: Score = -0.5124
=== Number of Observations per Cluster ===
Cluster 0: 1073
Cluster 1: 4918
Cluster 2: 4614
Cluster 3: 8222
Total: 18827
✅ Clustered texture map saved as 'cluster_labels_colored.png'
[run_texture_analysis] Current memory: 0.0014 GB; Peak memory: 2.8918 GB
[run_ratio_filtering] Current memory: 0.0011 GB; Peak memory: 0.0274 GB
(1212416,)
✅ Final masks saved in: HistoSweep_output
[generate_final_mask] Current memory: 0.0000 GB; Peak memory: 0.2926 GB
Running successfully!
../demo/Tutorial_3_Consecutive_ROI_selection_breast/breast_cancer_g1/S2Omics_output/pickle_files/qc_preserve_indicator.pickle
Image loaded from ../demo/Tutorial_3_Consecutive_ROI_selection_breast/breast_cancer_g2/he.jpg
0 0
../demo/Tutorial_3_Consecutive_ROI_selection_breast/breast_cancer_g2/S2Omics_output/pickle_files/shapes.pickle
[compute_metrics_memory_optimized] Current memory: 0.0466 GB; Peak memory: 1.0760 GB
[compute_low_density_mask] Current memory: 0.0012 GB; Peak memory: 0.0729 GB
Total selected for density filtering: 153677
✅ Entropy map saved as 'glcm_entropy_map_colored.png'
✅ Energy map saved as 'glcm_energy_map_colored.png'
✅ Homogeneity map saved as 'glcm_homogeneity_map_colored.png'
=== GLCM Metric Means ===
homogeneity energy entropy
0 0.461211 0.098785 0.757332
1 0.327859 0.042905 0.871905
2 0.632621 0.188482 0.604731
3 0.838028 0.349599 0.385862
=== Cluster Scores ===
Cluster 0: Score = -0.1973
Cluster 1: Score = -0.5011
Cluster 2: Score = 0.2164
Cluster 3: Score = 0.8018
=== Number of Observations per Cluster ===
Cluster 0: 5982
Cluster 1: 7926
Cluster 2: 4069
Cluster 3: 1547
Total: 19524
✅ Clustered texture map saved as 'cluster_labels_colored.png'
[run_texture_analysis] Current memory: 0.0012 GB; Peak memory: 2.9821 GB
[run_ratio_filtering] Current memory: 0.0012 GB; Peak memory: 0.0275 GB
(1250304,)
✅ Final masks saved in: HistoSweep_output
[generate_final_mask] Current memory: 0.0000 GB; Peak memory: 0.3017 GB
Running successfully!
../demo/Tutorial_3_Consecutive_ROI_selection_breast/breast_cancer_g2/S2Omics_output/pickle_files/qc_preserve_indicator.pickle
Image loaded from ../demo/Tutorial_3_Consecutive_ROI_selection_breast/breast_cancer_g3/he.jpg
0 0
../demo/Tutorial_3_Consecutive_ROI_selection_breast/breast_cancer_g3/S2Omics_output/pickle_files/shapes.pickle
[compute_metrics_memory_optimized] Current memory: 0.0466 GB; Peak memory: 1.0760 GB
[compute_low_density_mask] Current memory: 0.0012 GB; Peak memory: 0.0729 GB
Total selected for density filtering: 153432
✅ Entropy map saved as 'glcm_entropy_map_colored.png'
✅ Energy map saved as 'glcm_energy_map_colored.png'
✅ Homogeneity map saved as 'glcm_homogeneity_map_colored.png'
=== GLCM Metric Means ===
homogeneity energy entropy
0 0.473863 0.104302 0.771179
1 0.768603 0.391862 0.425878
2 0.629004 0.196726 0.613754
3 0.369296 0.048044 0.865714
=== Cluster Scores ===
Cluster 0: Score = -0.1930
Cluster 1: Score = 0.7346
Cluster 2: Score = 0.2120
Cluster 3: Score = -0.4484
=== Number of Observations per Cluster ===
Cluster 0: 5641
Cluster 1: 1923
Cluster 2: 5789
Cluster 3: 7664
Total: 21017
✅ Clustered texture map saved as 'cluster_labels_colored.png'
[run_texture_analysis] Current memory: 0.0012 GB; Peak memory: 2.9821 GB
[run_ratio_filtering] Current memory: 0.0012 GB; Peak memory: 0.0275 GB
(1250304,)
✅ Final masks saved in: HistoSweep_output
[generate_final_mask] Current memory: 0.0000 GB; Peak memory: 0.3017 GB
Running successfully!
../demo/Tutorial_3_Consecutive_ROI_selection_breast/breast_cancer_g3/S2Omics_output/pickle_files/qc_preserve_indicator.pickle
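The cluster scores printed above are numerically consistent with homogeneity + energy - entropy computed from the GLCM metric means of each cluster. This is an observation from the logged numbers, not a statement taken from the HistoSweep source, so treat the formula below as an assumption. The check uses the values logged for the first section (breast_cancer_g1).

```python
# GLCM metric means per cluster, copied from the g1 log:
# (homogeneity, energy, entropy)
glcm_means_g1 = {
    0: (0.811977, 0.364315, 0.393492),
    1: (0.433210, 0.087103, 0.793905),
    2: (0.585925, 0.166482, 0.641603),
    3: (0.325897, 0.040935, 0.879235),
}


def texture_score(homogeneity, energy, entropy):
    """Assumed score: smooth, uniform texture scores high; noisy texture scores low."""
    return homogeneity + energy - entropy


scores = {c: round(texture_score(*m), 4) for c, m in glcm_means_g1.items()}
print(scores)  # {0: 0.7828, 1: -0.2736, 2: 0.1108, 3: -0.5124}
```

The same relation holds for the values logged for the other two sections. Under this reading, a high score marks a cluster with high homogeneity and energy and low entropy.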
Step 3: Histology feature extraction
[3]:
from s2omics.p3_feature_extraction import histology_feature_extraction
# down_samp_step: the down-sampling step.
# The default of 10 extracts features only for superpixels whose row_index and col_index are both divisible by 10 (roughly a 1:100 down-sampling rate).
# down_samp_step=1 extracts features for every superpixel.
for (prefix, save_folder) in zip(prefix_list, save_folder_list):
    histology_feature_extraction(prefix, save_folder,
                                 foundation_model='uni',
                                 ckpt_path='../checkpoints/uni/',
                                 device='cuda:0',
                                 batch_size=32,
                                 down_samp_step=10,
                                 num_workers=4)
/data1/msyuan/anaconda3/envs/S2Omics/lib/python3.11/site-packages/tqdm/auto.py:21: TqdmWarning: IProgress not found. Please update jupyter and ipywidgets. See https://ipywidgets.readthedocs.io/en/stable/user_install.html
from .autonotebook import tqdm as notebook_tqdm
Histology foundation model loaded!
Foundation model name: uni
Start extracting histology feature embeddings...
Image loaded from ../demo/Tutorial_3_Consecutive_ROI_selection_breast/breast_cancer_g1/he.jpg
../demo/Tutorial_3_Consecutive_ROI_selection_breast/breast_cancer_g1/S2Omics_output/pickle_files/num_patches.pickle
0%| | 0/384 [00:00<?, ?it/s]
Batch 0:
Shape of patches: torch.Size([32, 3, 224, 224])
Shape of positions[0]: torch.Size([32])
Content of positions[0][:10]: tensor([0, 0, 0, 0, 0, 0, 0, 0, 0, 0])
Content of positions[1][:10]: tensor([ 0, 160, 320, 480, 640, 800, 960, 1120, 1280, 1440])
Shape of feature_emb: torch.Size([32, 197, 1024])
Shape of patch_emb: torch.Size([32, 1024, 14, 14])
100%|█████████▉| 383/384 [04:45<00:00, 1.27it/s]
Part 0 patch number: 12257
100%|██████████| 384/384 [04:46<00:00, 1.34it/s]
../demo/Tutorial_3_Consecutive_ROI_selection_breast/breast_cancer_g1/S2Omics_output/pickle_files/uni_embeddings_downsamp_10_part_0.pickle
Histology foundation model loaded!
Foundation model name: uni
Start extracting histology feature embeddings...
Image loaded from ../demo/Tutorial_3_Consecutive_ROI_selection_breast/breast_cancer_g2/he.jpg
../demo/Tutorial_3_Consecutive_ROI_selection_breast/breast_cancer_g2/S2Omics_output/pickle_files/num_patches.pickle
0%| | 0/395 [00:00<?, ?it/s]
Batch 0:
Shape of patches: torch.Size([32, 3, 224, 224])
Shape of positions[0]: torch.Size([32])
Content of positions[0][:10]: tensor([0, 0, 0, 0, 0, 0, 0, 0, 0, 0])
Content of positions[1][:10]: tensor([ 0, 160, 320, 480, 640, 800, 960, 1120, 1280, 1440])
Shape of feature_emb: torch.Size([32, 197, 1024])
Shape of patch_emb: torch.Size([32, 1024, 14, 14])
100%|█████████▉| 394/395 [05:10<00:00, 1.27it/s]
Part 0 patch number: 12614
100%|██████████| 395/395 [05:11<00:00, 1.27it/s]
../demo/Tutorial_3_Consecutive_ROI_selection_breast/breast_cancer_g2/S2Omics_output/pickle_files/uni_embeddings_downsamp_10_part_0.pickle
Histology foundation model loaded!
Foundation model name: uni
Start extracting histology feature embeddings...
Image loaded from ../demo/Tutorial_3_Consecutive_ROI_selection_breast/breast_cancer_g3/he.jpg
../demo/Tutorial_3_Consecutive_ROI_selection_breast/breast_cancer_g3/S2Omics_output/pickle_files/num_patches.pickle
0%| | 0/395 [00:00<?, ?it/s]
Batch 0:
Shape of patches: torch.Size([32, 3, 224, 224])
Shape of positions[0]: torch.Size([32])
Content of positions[0][:10]: tensor([0, 0, 0, 0, 0, 0, 0, 0, 0, 0])
Content of positions[1][:10]: tensor([ 0, 160, 320, 480, 640, 800, 960, 1120, 1280, 1440])
Shape of feature_emb: torch.Size([32, 197, 1024])
Shape of patch_emb: torch.Size([32, 1024, 14, 14])
100%|█████████▉| 394/395 [05:08<00:00, 1.28it/s]
Part 0 patch number: 12614
100%|██████████| 395/395 [05:09<00:00, 1.28it/s]
../demo/Tutorial_3_Consecutive_ROI_selection_breast/breast_cancer_g3/S2Omics_output/pickle_files/uni_embeddings_downsamp_10_part_0.pickle
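The patch and batch counts in the log above can be reproduced from the down-sampling rule. The sketch below assumes superpixel grids of 1024 × 1184 (g1) and 1056 × 1184 (g2, g3). These shapes are inferred by factoring the array lengths 1212416 and 1250304 printed in Step 2 and are not read from the pipeline itself. The helper name is hypothetical.

```python
import math


def downsampled_counts(n_rows, n_cols, down_samp_step=10, batch_size=32):
    """Count superpixels kept when only indices divisible by the step are used."""
    kept_rows = len(range(0, n_rows, down_samp_step))
    kept_cols = len(range(0, n_cols, down_samp_step))
    n_patches = kept_rows * kept_cols
    n_batches = math.ceil(n_patches / batch_size)
    return kept_rows, kept_cols, n_patches, n_batches


print(downsampled_counts(1024, 1184))  # (103, 119, 12257, 384)
print(downsampled_counts(1056, 1184))  # (106, 119, 12614, 395)

# Consecutive kept superpixels are 10 * 16 = 160 pixels apart at 0.5 micron per pixel,
# which agrees with positions[1] = 0, 160, 320, ... in the log.
print(10 * 16)
```

The kept fraction is 12257 / 1212416, close to the 1:100 rate mentioned in the code comments. With down_samp_step=1 every superpixel would be embedded, at roughly 100 times the run time shown here.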
Step 4: Joint histology segmentation
[4]:
from s2omics.multiple_sections.p4_get_histology_segmentation import get_joint_histology_segmentation
get_joint_histology_segmentation(prefix_list, save_folder_list,
                                 foundation_model='uni',
                                 down_samp_step=10,
                                 clustering_method='kmeans',
                                 n_clusters=20)
Pickle loaded from ../demo/Tutorial_3_Consecutive_ROI_selection_breast/breast_cancer_g1/S2Omics_output/pickle_files/shapes.pickle
Pickle loaded from ../demo/Tutorial_3_Consecutive_ROI_selection_breast/breast_cancer_g1/S2Omics_output/pickle_files/qc_preserve_indicator.pickle
Loading histology feature embeddings for image 0...
Pickle loaded from ../demo/Tutorial_3_Consecutive_ROI_selection_breast/breast_cancer_g1/S2Omics_output/pickle_files/uni_embeddings_downsamp_10_part_0.pickle
Sucessfully loaded and normalized all histology feature embeddings!
Pickle loaded from ../demo/Tutorial_3_Consecutive_ROI_selection_breast/breast_cancer_g2/S2Omics_output/pickle_files/shapes.pickle
Pickle loaded from ../demo/Tutorial_3_Consecutive_ROI_selection_breast/breast_cancer_g2/S2Omics_output/pickle_files/qc_preserve_indicator.pickle
Loading histology feature embeddings for image 1...
Pickle loaded from ../demo/Tutorial_3_Consecutive_ROI_selection_breast/breast_cancer_g2/S2Omics_output/pickle_files/uni_embeddings_downsamp_10_part_0.pickle
Sucessfully loaded and normalized all histology feature embeddings!
Pickle loaded from ../demo/Tutorial_3_Consecutive_ROI_selection_breast/breast_cancer_g3/S2Omics_output/pickle_files/shapes.pickle
Pickle loaded from ../demo/Tutorial_3_Consecutive_ROI_selection_breast/breast_cancer_g3/S2Omics_output/pickle_files/qc_preserve_indicator.pickle
Loading histology feature embeddings for image 2...
Pickle loaded from ../demo/Tutorial_3_Consecutive_ROI_selection_breast/breast_cancer_g3/S2Omics_output/pickle_files/uni_embeddings_downsamp_10_part_0.pickle
Sucessfully loaded and normalized all histology feature embeddings!
2025-10-15 17:43:06,918 - harmonypy - INFO - Computing initial centroids with sklearn.KMeans...
2025-10-15 17:43:17,014 - harmonypy - INFO - sklearn.KMeans initialization complete.
2025-10-15 17:43:17,074 - harmonypy - INFO - Iteration 1 of 10
2025-10-15 17:43:20,632 - harmonypy - INFO - Iteration 2 of 10
2025-10-15 17:43:24,161 - harmonypy - INFO - Iteration 3 of 10
2025-10-15 17:43:27,401 - harmonypy - INFO - Iteration 4 of 10
2025-10-15 17:43:28,962 - harmonypy - INFO - Iteration 5 of 10
2025-10-15 17:43:30,433 - harmonypy - INFO - Iteration 6 of 10
2025-10-15 17:43:33,569 - harmonypy - INFO - Converged after 6 iterations
Start segmenting the histology image, clustering method: kmeans
../demo/Tutorial_3_Consecutive_ROI_selection_breast/breast_cancer_g1/S2Omics_output/pickle_files/cluster_image.pickle
Segmentation image is stored at: ../demo/Tutorial_3_Consecutive_ROI_selection_breast/breast_cancer_g1/S2Omics_output/image_files/cluster_image_num_clusters_20.jpg
../demo/Tutorial_3_Consecutive_ROI_selection_breast/breast_cancer_g2/S2Omics_output/pickle_files/cluster_image.pickle
Segmentation image is stored at: ../demo/Tutorial_3_Consecutive_ROI_selection_breast/breast_cancer_g2/S2Omics_output/image_files/cluster_image_num_clusters_20.jpg
../demo/Tutorial_3_Consecutive_ROI_selection_breast/breast_cancer_g3/S2Omics_output/pickle_files/cluster_image.pickle
Segmentation image is stored at: ../demo/Tutorial_3_Consecutive_ROI_selection_breast/breast_cancer_g3/S2Omics_output/image_files/cluster_image_num_clusters_20.jpg
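Joint segmentation means that all sections are clustered together, so cluster k refers to the same histology pattern in every section. The sketch below illustrates only that pooling idea on toy 2-D points with a plain Lloyd's k-means written from scratch. It is not the S2Omics implementation: the real pipeline works on foundation-model embeddings and, according to the log above, runs harmonypy before clustering, which is omitted here. All names are hypothetical.

```python
import math


def farthest_point_init(points, k):
    """Deterministic initialisation: start at the first point, then greedily add the farthest one."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(points, key=lambda p: min(math.dist(p, c) for c in centroids)))
    return centroids


def kmeans(points, k, n_iter=20):
    """Plain Lloyd's k-means on a list of equal-length float tuples."""
    centroids = farthest_point_init(points, k)
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: nearest centroid by Euclidean distance.
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points]
        # Update step: mean of the members; an empty cluster keeps its old centroid.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels


def joint_labels(sections, k):
    """Cluster the pooled points of all sections, then split the labels back per section."""
    pooled = [p for section in sections for p in section]
    labels = kmeans(pooled, k)
    per_section, start = [], 0
    for section in sections:
        per_section.append(labels[start:start + len(section)])
        start += len(section)
    return per_section


# Three toy "sections", each mixing points from two well-separated groups.
sections = [
    [(0.0, 0.1), (0.1, 0.0), (5.0, 5.1)],
    [(0.05, 0.05), (5.1, 5.0)],
    [(4.9, 5.0), (0.0, 0.0)],
]
per_section = joint_labels(sections, k=2)
print(per_section)  # [[0, 0, 1], [0, 1], [1, 0]]
```

Clustering each section separately would give arbitrary label numbering per section, which would make a shared ROI scoring across sections much harder to define.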
Step 5: Select best ROI for spatial omics experiment
[5]:
from s2omics.multiple_sections.p5_roi_selection_rectangle import roi_selection_for_multiple_sections
# fusion_weights: the weights of the three scores (scale, coverage, balance), default = [0.33,0.33,0.33]; they should sum to 1 (otherwise they are normalized)
# emphasize_clusters, discard_clusters: prior information about histology clusters of interest and clusters to ignore, default = [], []
# prior_preference: the larger this parameter, the more S2Omics focuses on the histology clusters of interest, default = 1
roi_selection_for_multiple_sections(prefix_list, save_folder_list,
                                    down_samp_step=10,
                                    roi_size=[1.5,1.5],
                                    rotation_seg=6,
                                    num_roi=1,  # 0 means the number of ROIs is determined automatically
                                    fusion_weights=[0.33,0.33,0.33],
                                    emphasize_clusters=[], discard_clusters=[],
                                    prior_preference=1)
Pickle loaded from ../demo/Tutorial_3_Consecutive_ROI_selection_breast/breast_cancer_g1/S2Omics_output/pickle_files/shapes.pickle
Pickle loaded from ../demo/Tutorial_3_Consecutive_ROI_selection_breast/breast_cancer_g1/S2Omics_output/pickle_files/cluster_image.pickle
Pickle loaded from ../demo/Tutorial_3_Consecutive_ROI_selection_breast/breast_cancer_g2/S2Omics_output/pickle_files/shapes.pickle
Pickle loaded from ../demo/Tutorial_3_Consecutive_ROI_selection_breast/breast_cancer_g2/S2Omics_output/pickle_files/cluster_image.pickle
Pickle loaded from ../demo/Tutorial_3_Consecutive_ROI_selection_breast/breast_cancer_g3/S2Omics_output/pickle_files/shapes.pickle
Pickle loaded from ../demo/Tutorial_3_Consecutive_ROI_selection_breast/breast_cancer_g3/S2Omics_output/pickle_files/cluster_image.pickle
[(103, 119), (106, 119), (106, 119)]
Sampling ROI candidates...
100%|██████████| 3600/3600 [00:01<00:00, 3033.46it/s]
Current best ROI: [[[45, 31], [63, 31], [63, 49], [45, 49]]]
roi score: 0.7813556906243225
scale score: 0.5218310997663257
valid score: 0.9984555975339682
balance score: 0.915561715045757
Current number of ROIs is 1.
Find the best 1 ROI(s) with:
ROI score: 0.7813556906243225
Scale score: 0.5218310997663257
Coverage score: 0.9984555975339682
Balance score: 0.915561715045757
../demo/Tutorial_3_Consecutive_ROI_selection_breast/breast_cancer_g1/S2Omics_output/roi_selection_detailed_output/circle_roi_size_1.5_1.5/prior_preference_1/best_roi.pickle
Best ROI on histology segmentation image is stored at ../demo/Tutorial_3_Consecutive_ROI_selection_breast/breast_cancer_g1/S2Omics_output/main_output/best_roi_on_histology_segmentations.jpg
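The ROI score in the log above matches the geometric mean of the scale, coverage, and balance scores. The sketch below reproduces it as a weighted geometric mean with normalized weights. That weighted form is an assumption: with equal weights, as used here, the log cannot distinguish it from other weightings, and the function name is hypothetical rather than part of S2Omics.

```python
import math


def fuse_scores(scores, weights):
    """Weighted geometric mean; weights are normalized to sum to 1.

    All scores must be strictly positive because the logarithm is taken.
    """
    total = sum(weights)
    normalized = [w / total for w in weights]
    return math.exp(sum(w * math.log(s) for w, s in zip(normalized, scores)))


# Values copied from the log above.
scale_score = 0.5218310997663257
coverage_score = 0.9984555975339682
balance_score = 0.915561715045757

fused = fuse_scores([scale_score, coverage_score, balance_score], [0.33, 0.33, 0.33])
print(round(fused, 6))  # 0.781356
```

A geometric mean penalizes a single low score more than an arithmetic mean would: the arithmetic mean of the same three values is about 0.812, while the logged ROI score is 0.781.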