Mapper
The Mapper class is the runner for the SpaHDmap model, providing a series of methods to execute various steps of the model.
- class SpaHDmap.Mapper(section, results_path, rank=20, reference=None, ratio_pseudo_spots=5, scale_split_size=False, verbose=False)
The Mapper class is a runner for the SpaHDmap model.
- Parameters:
section (Union[STData, List[STData]]) – STData or list of STData containing the spatial objects for the sections.
results_path (str) – The path to save the results.
rank (int) – The rank of the NMF model.
reference (Optional[Dict[str, str]]) – Dictionary of query and reference pairs, e.g., {'query1': 'reference1', 'query2': 'reference2'}. Only used for multi-section analysis.
ratio_pseudo_spots (int) – The ratio of pseudo spots to sequenced spots.
scale_split_size (bool) – Whether to scale the split size based on the scale rate. If True, the split size is adjusted based on the square root of the scale rate.
verbose (bool) – Whether to print progress.
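For multi-section runs, the reference argument pairs each query section with a reference section. The sketch below is not part of SpaHDmap: check_reference is a hypothetical helper, and it assumes that the keys and values of the dictionary are section names. It only catches misspelled names before a Mapper is constructed.

```python
from typing import Dict, List, Optional


def check_reference(section_names: List[str],
                    reference: Optional[Dict[str, str]]) -> None:
    """Raise ValueError if a query or reference name is not a known section.

    Hypothetical helper, not part of SpaHDmap. Assumes the keys and values
    of `reference` are section names.
    """
    if reference is None:  # single-section analysis: nothing to check
        return
    known = set(section_names)
    for query, ref in reference.items():
        for name in (query, ref):
            if name not in known:
                raise ValueError(f"unknown section name: {name!r}")


# Illustrative names taken from the parameter description above.
names = ["query1", "query2", "reference1", "reference2"]
reference = {"query1": "reference1", "query2": "reference2"}
check_reference(names, reference)  # returns without raising
```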
Example
>>> import SpaHDmap as hdmap
>>> sections = [hdmap.prepare_stdata(...)]  # List of STData objects
>>> rank = 20
>>> results_path = 'results'
>>> mapper = hdmap.Mapper(section=sections, results_path=results_path, rank=rank, verbose=True)
>>> mapper.run_SpaHDmap(save_score=True, visualize=True)
- cluster(section=None, use_score='SpaHDmap', resolution=0.8, n_neighbors=50, joint=True, format='png', show=True)
Perform clustering on sections.
- Parameters:
section (Union[str, STData, List[Union[str, STData]], None]) – Section(s) to cluster. If None, uses all sections.
use_score (str) – Score type to use for clustering.
resolution (float) – Resolution parameter for Louvain clustering.
n_neighbors (int) – Number of neighbors for graph construction.
joint (bool) – Whether to cluster spots/pixels jointly across sections.
format (str) – Output format for visualization ('jpg', 'png', 'pdf').
show (bool) – Whether to display the plot using plt.show().
- extract_spots(index, section=None, threshold=0.05, use_score='SpaHDmap_spot')
Extract spot indices with high score in a specific embedding.
- Parameters:
index (int) – The embedding index to extract from (0-based).
section (Union[str, STData, List[Union[str, STData]], None]) – Section(s) to extract from. If None, uses all sections in Mapper.
threshold (float) – Threshold value; spots with a score above this value are extracted.
use_score (str) – The score type to use.
- Returns:
If single section: returns the barcodes array directly.
If multiple sections: returns a dictionary {section_name: barcodes}.
- Return type:
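The thresholding and the two return shapes described for extract_spots can be illustrated without the library. This is a minimal sketch on toy data, not SpaHDmap's implementation: spots_above and extract are hypothetical names, scores are assumed to be stored per barcode, and plain lists stand in for the barcodes array.

```python
from typing import Dict, List, Sequence, Union

Scores = Dict[str, Sequence[float]]  # barcode -> one score per embedding


def spots_above(scores: Scores, index: int, threshold: float = 0.05) -> List[str]:
    """Return barcodes whose score in embedding `index` exceeds `threshold`."""
    return [barcode for barcode, row in scores.items() if row[index] > threshold]


def extract(sections: Dict[str, Scores], index: int,
            threshold: float = 0.05) -> Union[List[str], Dict[str, List[str]]]:
    """Mirror the documented return shape: a bare list for one section,
    a {section_name: barcodes} dictionary for several."""
    result = {name: spots_above(s, index, threshold) for name, s in sections.items()}
    if len(result) == 1:
        return next(iter(result.values()))
    return result


# Toy data (made up): two sections, two embeddings per spot.
section_a = {"AAAC": [0.10, 0.00], "AAAG": [0.01, 0.30], "AAAT": [0.20, 0.02]}
section_b = {"CCCA": [0.00, 0.90], "CCCG": [0.07, 0.00]}

single = extract({"A": section_a}, index=0)                 # list of barcodes
multi = extract({"A": section_a, "B": section_b}, index=0)  # dict per section
```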
- get_GCN_score(GMM_filter=True, save_score=False, use_ann=False, **kwargs)
Get the smoothed GCN score for each section.
- get_NMF_score(save_score=False)
Perform NMF and normalize the results, or use existing metagene for transfer learning.
- Parameters:
save_score (bool) – Whether to save the score or not.
- get_SpaHDmap_score(save_score=False, filter_mask=True)
Get the SpaHDmap scores for each section.
- get_VD_score(use_score='GCN')
Use a Voronoi diagram to obtain the score of each pixel.
- Parameters:
use_score (str) – The type of embedding to be visualized.
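A Voronoi diagram partitions the image so that each pixel belongs to its nearest spot. The sketch below shows that idea on a tiny grid, assuming each pixel simply inherits the score of its nearest spot. voronoi_scores is a hypothetical name and the brute-force search is for illustration only; it makes no claim about how get_VD_score is implemented.

```python
from typing import List, Sequence, Tuple


def voronoi_scores(spot_xy: Sequence[Tuple[float, float]],
                   spot_score: Sequence[float],
                   height: int, width: int) -> List[List[float]]:
    """Give every pixel the score of its nearest spot (its Voronoi cell)."""
    image = []
    for row in range(height):
        line = []
        for col in range(width):
            # Index of the spot with the smallest squared distance to this pixel.
            nearest = min(
                range(len(spot_xy)),
                key=lambda i: (spot_xy[i][0] - col) ** 2 + (spot_xy[i][1] - row) ** 2,
            )
            line.append(spot_score[nearest])
        image.append(line)
    return image


# Two spots, given as (x, y), on a 2 x 4 image: one near each side.
pixels = voronoi_scores(spot_xy=[(0, 0), (3, 1)], spot_score=[0.2, 0.8],
                        height=2, width=4)
```

With these toy inputs the two left-hand columns take the first spot's score and the two right-hand columns take the second spot's score.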
- load_metagene(result_path=None)
Load existing metagenes for transfer learning.
- Parameters:
result_path (Optional[str]) – Path to the results directory containing metagene files. If None, uses current results_path. Will load both 'metagene_NMF.csv' and 'metagene.csv' from this directory.
Notes
This method should be called before running get_NMF_score() to enable transfer learning. The loaded metagene_NMF will be used to calculate NMF scores via linear regression instead of performing standard NMF decomposition.
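The call order described in the note can be written as a small wrapper. This is a sketch only: nmf_scores_with_transfer is a hypothetical helper, and it accepts any object that exposes the two documented methods, such as a Mapper instance.

```python
from typing import Any, Optional


def nmf_scores_with_transfer(mapper: Any,
                             reference_results: Optional[str] = None) -> None:
    """Load saved metagenes first, then compute NMF scores from them."""
    # Must come first so that get_NMF_score() reuses the loaded metagene_NMF
    # instead of running a fresh NMF decomposition.
    mapper.load_metagene(result_path=reference_results)
    mapper.get_NMF_score(save_score=False)


# Usage with a real Mapper (the directory name here is a placeholder):
# nmf_scores_with_transfer(mapper, "previous_results")
```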
- pretrain(save_model=True, load_model=True)
Pre-train the SpaHDmap model based on the image prediction.
- recovery(gene, section=None, use_score='SpaHDmap')
Recover gene expression and store in section.X dictionary.
- Parameters:
gene (Union[str, List[str], None]) – Gene name(s) to recover expression for; can be a single string or a list of strings.
section (Union[str, STData, List[Union[str, STData]], None]) – Section(s) to recover gene expression for. If None, uses all sections.
use_score (str) – Score type to use for gene expression recovery.
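The documentation does not state how expression is recovered. Assuming it follows the usual NMF reconstruction (expression ≈ score × metagene), one gene can be rebuilt as a weighted sum of its metagene loadings. The sketch below illustrates that assumption on toy data; recover_gene is a hypothetical function, not the library's method.

```python
from typing import Dict, List, Sequence


def recover_gene(scores: Sequence[Sequence[float]],
                 metagene: Dict[str, Sequence[float]],
                 gene: str) -> List[float]:
    """Reconstruct one gene as the score-weighted sum of its metagene loadings.

    Assumption: NMF-style reconstruction, expression = scores @ metagene.
    """
    loadings = metagene[gene]  # one loading per embedding
    return [sum(s * w for s, w in zip(row, loadings)) for row in scores]


# Toy data (made up): three pixels, rank 2.
scores = [[1.0, 0.0], [0.5, 0.5], [0.0, 2.0]]
metagene = {"GeneA": [2.0, 0.0], "GeneB": [1.0, 3.0]}

gene_a = recover_gene(scores, metagene, "GeneA")
gene_b = recover_gene(scores, metagene, "GeneB")
```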
- run_SpaHDmap(save_score=False, save_model=True, load_model=True, visualize=True, format='png', repeat_times=1, use_ann=False, **kwargs)
Run the complete SpaHDmap pipeline.
- Parameters:
save_score (bool) – Whether to save computed scores as numpy arrays.
save_model (bool) – Whether to save model checkpoints.
load_model (bool) – Whether to load existing model checkpoints if available.
visualize (bool) – Whether to generate and save visualizations.
format (str) – Output format for visualizations ('jpg', 'png', 'pdf').
repeat_times (int) – Number of times to repeat the pipeline with different random initializations.
use_ann (bool) – Whether to use ANN for constructing the adjacency matrix or not.
**kwargs – Additional arguments for ANN (num_tree and n_jobs).
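The individual methods on this page can also be called one at a time. The sketch below chains them in one plausible order. That order is an assumption and is not confirmed by this page, so it may differ from what run_SpaHDmap does internally. run_step_by_step is a hypothetical helper that accepts any object exposing the documented methods.

```python
from typing import Any


def run_step_by_step(mapper: Any, save_score: bool = False,
                     save_model: bool = True, load_model: bool = True) -> None:
    """Call the documented Mapper steps one at a time (assumed order)."""
    mapper.get_NMF_score(save_score=save_score)        # NMF on spot expression
    mapper.pretrain(save_model=save_model, load_model=load_model)
    mapper.get_GCN_score(save_score=save_score)        # smoothed GCN score
    mapper.get_VD_score(use_score="GCN")               # per-pixel Voronoi score
    mapper.train(save_model=save_model, load_model=load_model)
    mapper.get_SpaHDmap_score(save_score=save_score)   # final SpaHDmap scores
    mapper.visualize(use_score="SpaHDmap")


# Usage with a real Mapper instance:
# run_step_by_step(mapper, save_score=True)
```

Running the steps separately makes it possible to inspect or visualize intermediate scores (for example with use_score='NMF' or 'GCN') before training.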
- train(save_model=True, load_model=True)
Train the SpaHDmap model based on the image prediction and spot expression reconstruction.
- visualize(section=None, use_score='SpaHDmap', target='score', gene=None, index=None, format='png', crop=True, show=True)
Visualize scores, clustering results, or gene expression.
- Parameters:
section (Union[str, STData, List[Union[str, STData]], None]) – The section(s) to visualize. If None, uses all sections.
use_score (str) – The type of score to visualize (e.g., 'NMF', 'GCN', 'SpaHDmap').
target (str) – What to visualize: 'score', 'cluster', or 'gene'.
gene (Optional[str]) – Gene name to visualize when target='gene'.
index (Optional[int]) – For score visualization only: the index of the embedding to show.
format (str) – Output format ('jpg', 'png', 'pdf').
crop (bool) – Whether to crop to the mask region. If False, saves the full image size.
show (bool) – Whether to display the plot using plt.show().
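The target, gene, and index arguments interact: gene applies when target='gene', and index applies to score visualization only. The sketch below encodes those documented constraints as a pre-flight check. check_visualize_args is a hypothetical helper, and whether visualize itself raises on such combinations is not stated on this page.

```python
from typing import Optional

VALID_TARGETS = ("score", "cluster", "gene")


def check_visualize_args(target: str = "score", gene: Optional[str] = None,
                         index: Optional[int] = None) -> None:
    """Reject argument combinations that contradict the parameter descriptions.

    Hypothetical helper, not part of SpaHDmap.
    """
    if target not in VALID_TARGETS:
        raise ValueError(f"target must be one of {VALID_TARGETS}, got {target!r}")
    if target == "gene" and gene is None:
        raise ValueError("target='gene' requires a gene name")
    if index is not None and target != "score":
        raise ValueError("index is only meaningful when target='score'")


check_visualize_args(target="score", index=2)   # returns without raising
check_visualize_args(target="cluster")          # returns without raising
```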