Mapper

The Mapper class is the runner for the SpaHDmap model, providing a series of methods to execute various steps of the model.

class SpaHDmap.Mapper(section, results_path, rank=20, reference=None, ratio_pseudo_spots=5, scale_split_size=False, verbose=False)

Parameters:

section (Union[STData, List[STData]]) – STData or List of STData containing the spatial objects for the sections.
results_path (str) – The path to save the results.
rank (int) – The rank of the NMF model.
reference (Optional[Dict[str, str]]) – Dictionary of query and reference pairs, e.g., {'query1': 'reference1', 'query2': 'reference2'}. Only used for multi-section analysis.
ratio_pseudo_spots (int) – The ratio of pseudo spots to sequenced spots.
scale_split_size (bool) – Whether to scale the split size based on the scale rate. If True, split size will be adjusted based on the square root of the scale rate.
verbose (bool) – Whether to print the progress or not.

Example

>>> import SpaHDmap as hdmap
>>> sections = [hdmap.prepare_stdata(...)]  # List of STData objects
>>> rank = 20
>>> results_path = 'results'
>>> mapper = hdmap.Mapper(section=sections, results_path=results_path, rank=rank, verbose=True)
>>> mapper.run_SpaHDmap(save_score=True, visualize=True)

cluster(section=None, use_score='SpaHDmap', resolution=0.8, n_neighbors=50, joint=True, format='png', show=True)

Perform clustering on sections.

Parameters:

section (Union[str, STData, List[Union[str, STData]], None]) – Section(s) to cluster. If None, uses all sections.
use_score (str) – Score type to use for clustering.
resolution (float) – Resolution parameter for Louvain clustering.
n_neighbors (int) – Number of neighbors for graph construction.
joint (bool) – Whether to cluster spots/pixels jointly across sections.
format (str) – Output format for visualization ('jpg', 'png', 'pdf').
show (bool) – Whether to display the plot using plt.show().

extract_spots(index, section=None, threshold=0.05, use_score='SpaHDmap_spot')

Extract spot indices with high score in a specific embedding.

Parameters:

index (int) – The embedding index to extract from (0-based).
section (Union[str, STData, List[Union[str, STData]], None]) – Section(s) to extract from. If None, uses all sections in Mapper.
threshold (float) – Threshold value; spots with a score above this value are extracted.
use_score (str) – The score type to use.

Returns:

If single section: returns the barcodes array directly.
If multiple sections: returns a dictionary {section_name: barcodes}.

Return type:

numpy.ndarray or dict
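
The selection rule described above can be illustrated with a minimal NumPy sketch. It is not the library's implementation: the function name extract_spots_sketch, the (n_spots, rank) layout of the score array, the toy barcodes, and the section name 'section_A' are all assumptions made for illustration.

```python
import numpy as np

def extract_spots_sketch(score, barcodes, index, threshold=0.05):
    """Return barcodes whose score in embedding `index` exceeds `threshold`.

    Hypothetical helper: assumes `score` has shape (n_spots, rank).
    """
    score = np.asarray(score)
    barcodes = np.asarray(barcodes)
    mask = score[:, index] > threshold  # strict "above threshold" comparison
    return barcodes[mask]

# Toy data: 4 spots x 2 embeddings (made-up values).
score = np.array([[0.10, 0.00],
                  [0.02, 0.30],
                  [0.07, 0.01],
                  [0.00, 0.06]])
barcodes = np.array(["AAAC", "AAAG", "AAAT", "AACA"])

# Single section: a plain array of barcodes.
selected = extract_spots_sketch(score, barcodes, index=0)

# Multiple sections: one entry per section name.
per_section = {"section_A": selected}
```

With the documented method, the equivalent call would be mapper.extract_spots(index=0, threshold=0.05) on a Mapper whose scores have already been computed.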

get_GCN_score(GMM_filter=True, save_score=False, use_ann=False, **kwargs)

Get the smoothed GCN score for each section.

Parameters:

GMM_filter (bool) – Whether to filter low signal using Gaussian Mixture Model.
save_score (bool) – Whether to save the GCN score or not.
use_ann (bool) – Whether to use ANN for constructing the adjacency matrix or not.
**kwargs – Additional arguments for ANN (num_tree and n_jobs).
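
To give a feel for what smoothing a score over a spot graph means, here is a rough sketch of one graph-convolution propagation step on a k-nearest-neighbor graph. It is an illustration only: the graph construction, the symmetric normalization, and the name gcn_smooth are assumptions, and neither the GMM filter nor the ANN option is reproduced.

```python
import numpy as np

def gcn_smooth(coords, score, k=1):
    """One propagation step: D^-1/2 (A + I) D^-1/2 @ score on a kNN graph.

    Hypothetical helper, not part of SpaHDmap.
    """
    coords = np.asarray(coords, dtype=float)
    n = coords.shape[0]
    # Pairwise squared distances between spots.
    dist2 = ((coords[:, None, :] - coords[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(dist2, np.inf)  # exclude self when ranking neighbors
    neighbors = np.argsort(dist2, axis=1)[:, :k]
    adj = np.zeros((n, n))
    adj[np.repeat(np.arange(n), k), neighbors.ravel()] = 1.0
    adj = np.maximum(adj, adj.T) + np.eye(n)  # symmetrize, add self-loops
    deg_inv_sqrt = 1.0 / np.sqrt(adj.sum(axis=1))
    norm_adj = adj * deg_inv_sqrt[:, None] * deg_inv_sqrt[None, :]
    return norm_adj @ score

# Four toy spots on a line; only the first spot carries signal.
coords = np.array([[0.0, 0.0], [1.0, 0.0], [2.5, 0.0], [3.0, 0.0]])
score = np.array([[1.0], [0.0], [0.0], [0.0]])
smoothed = gcn_smooth(coords, score, k=1)
```

In this toy case the signal of the first spot is shared with its single neighbor, while the two distant spots stay at zero.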

get_NMF_score(save_score=False)

Perform NMF and normalize the results, or use existing metagene for transfer learning.

Parameters:

save_score (bool) – Whether to save the score or not.

get_SpaHDmap_score(save_score=False, filter_mask=True)

Get the SpaHDmap scores for each section.

Parameters:

save_score (bool) – Whether to save the SpaHDmap scores or not.
filter_mask (bool) – Whether to filter the mask based on the scores or not.

get_VD_score(use_score='GCN')

Perform Voronoi Diagram to get the score of each pixel.

Parameters:

use_score (str) – The type of embedding to be visualized.
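
The idea behind a Voronoi-diagram score is that each pixel inherits the score of its nearest spot. The sketch below shows that idea with brute-force distances in NumPy. The function name voronoi_pixel_scores, the (row, col) coordinate convention, and the array shapes are assumptions, and the library may compute this differently.

```python
import numpy as np

def voronoi_pixel_scores(spot_coords, spot_scores, height, width):
    """Assign every pixel the score vector of its nearest spot.

    Hypothetical helper: spot_coords is (n_spots, 2) in (row, col) order,
    spot_scores is (n_spots, rank).
    """
    rows, cols = np.mgrid[0:height, 0:width]
    pixels = np.stack([rows.ravel(), cols.ravel()], axis=1)  # (H*W, 2)
    # Squared Euclidean distance from each pixel to each spot.
    diff = pixels[:, None, :] - spot_coords[None, :, :]
    dist2 = (diff ** 2).sum(axis=2)
    nearest = dist2.argmin(axis=1)  # index of the owning Voronoi cell
    return spot_scores[nearest].reshape(height, width, -1)

spot_coords = np.array([[0, 0], [3, 3]])          # two toy spots
spot_scores = np.array([[1.0, 0.0], [0.0, 1.0]])  # one score row per spot
pixel_scores = voronoi_pixel_scores(spot_coords, spot_scores, 4, 4)
```

The brute-force distance matrix is only practical for small images; a spatial index would be the usual choice at full resolution.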

load_metagene(result_path=None)

Load existing metagenes for transfer learning.

Parameters:

result_path (Optional[str]) – Path to the results directory containing metagene files. If None, uses the current results_path. Loads both 'metagene_NMF.csv' and 'metagene.csv' from this directory.

Notes

This method should be called before running get_NMF_score() to enable transfer learning. The loaded metagene_NMF will be used to calculate NMF scores via linear regression instead of performing standard NMF decomposition.
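
As a rough illustration of the note above, scores can be obtained from a fixed metagene matrix by solving a least-squares problem instead of running NMF. The sketch uses plain unconstrained least squares, which is an assumption; the library's regression may add constraints or normalization. The function name scores_from_metagene and the matrix shapes are also assumed.

```python
import numpy as np

def scores_from_metagene(expression, metagene):
    """Solve expression = scores @ metagene (approximately) for scores.

    Hypothetical helper: expression is (n_spots, n_genes),
    metagene is (rank, n_genes).
    """
    # lstsq solves metagene.T @ scores.T = expression.T in the least-squares sense.
    scores_t, *_ = np.linalg.lstsq(metagene.T, expression.T, rcond=None)
    return scores_t.T

rng = np.random.default_rng(0)
metagene = rng.random((3, 10))       # toy "loaded" metagene, rank 3
true_scores = rng.random((5, 3))     # toy ground truth for 5 spots
expression = true_scores @ metagene  # noiseless toy expression
recovered = scores_from_metagene(expression, metagene)
```

With the documented API, the corresponding call order would be mapper.load_metagene(...) followed by mapper.get_NMF_score().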

pretrain(save_model=True, load_model=True)

Pre-train the SpaHDmap model based on the image prediction.

Parameters:

save_model (bool) – Whether to save the model or not.
load_model (bool) – Whether to load the model or not.

property pretrain_path: str

Get the pretrained model path.

recovery(gene, section=None, use_score='SpaHDmap')

Recover gene expression and store in section.X dictionary.

Parameters:

gene (Union[str, List[str], None]) – Gene name(s) to recover expression for. Can be a single string or a list of strings.
section (Union[str, STData, List[Union[str, STData]], None]) – Section(s) to recover gene expression for. If None, uses all sections.
use_score (str) – Score type to use for gene expression recovery.
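
One common way to recover expression from a factorized representation is to multiply the scores by the metagene matrix and keep the columns of the requested genes. The sketch below assumes this is the general idea; the exact reconstruction used by the library is not stated here. The function name recover_expression, the gene names, and the returned dictionary layout are hypothetical.

```python
import numpy as np

def recover_expression(score, metagene, gene_names, genes):
    """Reconstruct expression of `genes` as score @ metagene columns.

    Hypothetical helper: score is (n_units, rank), metagene is (rank, n_genes).
    """
    if isinstance(genes, str):
        genes = [genes]  # accept a single gene name as well as a list
    cols = [gene_names.index(g) for g in genes]
    return {g: score @ metagene[:, c] for g, c in zip(genes, cols)}

score = np.array([[1.0, 0.0],
                  [0.5, 0.5]])            # 2 spots/pixels x rank 2 (toy values)
metagene = np.array([[2.0, 0.0, 1.0],
                     [0.0, 4.0, 1.0]])    # rank 2 x 3 genes (toy values)
gene_names = ["GeneA", "GeneB", "GeneC"]  # placeholder gene names

recovered = recover_expression(score, metagene, gene_names, ["GeneA", "GeneB"])
```

The result is keyed by gene name, loosely mirroring the description that recovered expression is stored in a dictionary on the section.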

run_SpaHDmap(save_score=False, save_model=True, load_model=True, visualize=True, format='png', repeat_times=1, use_ann=False, **kwargs)

Run the complete SpaHDmap pipeline.

Parameters:

save_score (bool) – Whether to save computed scores as numpy arrays.
save_model (bool) – Whether to save model checkpoints.
load_model (bool) – Whether to load existing model checkpoints if available.
visualize (bool) – Whether to generate and save visualizations.
format (str) – Output format for visualizations ('jpg', 'png', 'pdf').
repeat_times (int) – Number of times to repeat the pipeline with different random initializations.
use_ann (bool) – Whether to use ANN for constructing the adjacency matrix or not.
**kwargs – Additional arguments for ANN (num_tree and n_jobs).
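
The individual stage methods documented on this page can also be called one at a time. The sketch below shows one plausible sequence. The order is an assumption inferred from the method descriptions and defaults (for example, get_VD_score defaults to use_score='GCN'), not a statement of what run_SpaHDmap does internally. RecordingMapper is a stand-in that only records method names so the snippet runs without the package installed.

```python
def run_stages(mapper, save_score=False, save_model=True, load_model=True):
    """Call the documented stage methods one by one (assumed order)."""
    mapper.get_NMF_score(save_score=save_score)
    mapper.pretrain(save_model=save_model, load_model=load_model)
    mapper.get_GCN_score(save_score=save_score)
    mapper.get_VD_score(use_score='GCN')
    mapper.train(save_model=save_model, load_model=load_model)
    mapper.get_SpaHDmap_score(save_score=save_score)


class RecordingMapper:
    """Stand-in that only records which methods were called (not SpaHDmap)."""

    def __init__(self):
        self.calls = []

    def __getattr__(self, name):
        # Invoked only for attributes that do not exist, i.e. the stage methods.
        def method(**kwargs):
            self.calls.append(name)
        return method


stub = RecordingMapper()
run_stages(stub)
```

With a real Mapper instance in place of the stub, calling the stages separately makes it possible to inspect or visualize intermediate scores between steps.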

train(save_model=True, load_model=True)

Train the SpaHDmap model based on the image prediction and spot expression reconstruction.

Parameters:

save_model (bool) – Whether to save the model or not.
load_model (bool) – Whether to load the model or not.

property train_path: str

Get the trained model path.

visualize(section=None, use_score='SpaHDmap', target='score', gene=None, index=None, format='png', crop=True, show=True)

Visualize scores, clustering results, or gene expression.

Parameters:

section (Union[str, STData, List[Union[str, STData]], None]) – The section(s) to visualize. If None, uses all sections.
use_score (str) – The type of score to visualize (e.g., 'NMF', 'GCN', 'SpaHDmap').
target (str) – What to visualize: 'score', 'cluster', or 'gene'.
gene (Optional[str]) – Gene name to visualize when target='gene'.
index (Optional[int]) – For score visualization only: the index of the embedding to show.
format (str) – Output format ('jpg', 'png', 'pdf').
crop (bool) – Whether to crop to mask region. If False, save full image size.
show (bool) – Whether to display the plot using plt.show().