Mapper

The Mapper class is the runner for the SpaHDmap model. It provides methods for executing each step of the model, either individually or as a complete pipeline.

class SpaHDmap.Mapper(section, results_path, rank=20, reference=None, ratio_pseudo_spots=5, scale_split_size=False, verbose=False)[source]

The Mapper class is a runner for the SpaHDmap model.

Parameters:
  • section (Union[STData, List[STData]]) – STData or List of STData containing the spatial objects for the sections.

  • results_path (str) – The path to save the results.

  • rank (int) – The rank of the NMF model.

  • reference (Optional[Dict[str, str]]) – Dictionary of query and reference pairs, e.g., {'query1': 'reference1', 'query2': 'reference2'}. Only used for multi-section analysis.

  • ratio_pseudo_spots (int) – The ratio of pseudo spots to sequenced spots.

  • scale_split_size (bool) – Whether to scale the split size based on the scale rate. If True, split size will be adjusted based on the square root of the scale rate.

  • verbose (bool) – Whether to print the progress or not.

Example

>>> import SpaHDmap as hdmap
>>> sections = [hdmap.prepare_stdata(...)]  # List of STData objects
>>> rank = 20
>>> results_path = 'results'
>>> mapper = hdmap.Mapper(section=sections, results_path=results_path, rank=rank, verbose=True)
>>> mapper.run_SpaHDmap(save_score=True, visualize=True)

cluster(section=None, use_score='SpaHDmap', resolution=0.8, n_neighbors=50, joint=True, format='png', show=True)[source]

Perform clustering on sections.

Parameters:
  • section (Union[str, STData, List[Union[str, STData]], None]) – Section(s) to cluster. If None, uses all sections.

  • use_score (str) – Score type to use for clustering.

  • resolution (float) – Resolution parameter for Louvain clustering.

  • n_neighbors (int) – Number of neighbors for graph construction.

  • joint (bool) – Whether to cluster spots/pixels jointly across sections.

  • format (str) – Output format for visualization ('jpg', 'png', 'pdf').

  • show (bool) – Whether to display the plot using plt.show().

extract_spots(index, section=None, threshold=0.05, use_score='SpaHDmap_spot')[source]

Extract spot indices with high score in a specific embedding.

Parameters:
  • index (int) – The index of the embedding to extract from (0-based).

  • section (Union[str, STData, List[Union[str, STData]], None]) – Section(s) to extract from. If None, uses all sections in Mapper.

  • threshold (float) – Threshold value. Spots whose score is above this value are extracted.

  • use_score (str) – The score type to use.

Returns:

For a single section, the barcodes array is returned directly. For multiple sections, a dictionary {section_name: barcodes} is returned.

Return type:

numpy.ndarray or dict
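
The thresholding and the two return shapes described above can be illustrated with a minimal, library-independent sketch. The helper names and the toy data are hypothetical and are not part of SpaHDmap; plain lists stand in for the arrays the real method returns.

```python
def extract_barcodes(scores, barcodes, index, threshold=0.05):
    """Return barcodes whose score in embedding `index` is above `threshold`."""
    return [bc for bc, row in zip(barcodes, scores) if row[index] > threshold]


def extract_from_sections(sections, index, threshold=0.05):
    """Mimic the documented return shape: bare result for one section, dict for several."""
    results = {
        name: extract_barcodes(scores, barcodes, index, threshold)
        for name, (scores, barcodes) in sections.items()
    }
    if len(results) == 1:
        return next(iter(results.values()))
    return results


# Hypothetical toy data: per section, (spots x embeddings score table, barcodes).
sections = {
    "A": ([[0.10, 0.00], [0.02, 0.30], [0.07, 0.01]], ["AAAC", "AAAG", "AAAT"]),
    "B": ([[0.00, 0.20], [0.06, 0.00]], ["CCCA", "CCCG"]),
}

multi = extract_from_sections(sections, index=0)
single = extract_from_sections({"A": sections["A"]}, index=1)
```

Here `multi` is keyed by section name, while `single` is the bare list of barcodes for the only section supplied.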

get_GCN_score(GMM_filter=True, save_score=False, use_ann=False, **kwargs)[source]

Get the smoothed GCN score for each section.

Parameters:
  • GMM_filter (bool) – Whether to filter low signal using Gaussian Mixture Model.

  • save_score (bool) – Whether to save the GCN score or not.

  • use_ann (bool) – Whether to use ANN for constructing the adjacency matrix or not.

  • **kwargs – Additional arguments for ANN (num_tree and n_jobs).
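
As background on what "smoothed" means for a graph convolution, the sketch below applies one step of the standard symmetric-normalized propagation rule (adjacency with self-loops, scaled by node degrees) to a score table. It is a generic illustration under that assumption, not SpaHDmap's implementation; how the library builds its adjacency matrix and how many propagation steps it applies are not stated here.

```python
import math


def normalized_adjacency(adj):
    """Compute D^{-1/2} (A + I) D^{-1/2} for a dense adjacency matrix."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]


def smooth_scores(adj, scores):
    """Propagate spot scores (n_spots x rank) once over the normalized graph."""
    norm = normalized_adjacency(adj)
    n, rank = len(scores), len(scores[0])
    return [
        [sum(norm[i][j] * scores[j][c] for j in range(n)) for c in range(rank)]
        for i in range(n)
    ]


# Two neighbouring spots with different scores are pulled towards each other.
smoothed = smooth_scores([[0, 1], [1, 0]], [[1.0], [3.0]])
```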

get_NMF_score(save_score=False)[source]

Perform NMF and normalize the results, or use existing metagene for transfer learning.

Parameters:

save_score (bool) – Whether to save the score or not.

get_SpaHDmap_score(save_score=False)[source]

Get the SpaHDmap scores for each section.

Parameters:

save_score (bool) – Whether to save the SpaHDmap scores or not.

get_VD_score(use_score='GCN')[source]

Compute a Voronoi diagram to assign a score to each pixel.

Parameters:

use_score (str) – The type of score to use as input for the Voronoi diagram.
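
The idea of a Voronoi assignment is that every pixel inherits the score of its nearest spot. The brute-force sketch below shows this on a tiny grid; the function name, the (row, column) coordinate convention, and the single scalar score per spot are assumptions made for illustration, and the library's own implementation may differ.

```python
def voronoi_scores(spot_coords, spot_scores, height, width):
    """Give each pixel the score of its nearest spot (a discrete Voronoi partition)."""
    grid = []
    for r in range(height):
        row = []
        for c in range(width):
            # Index of the spot with the smallest squared Euclidean distance.
            nearest = min(
                range(len(spot_coords)),
                key=lambda i: (spot_coords[i][0] - r) ** 2 + (spot_coords[i][1] - c) ** 2,
            )
            row.append(spot_scores[nearest])
        grid.append(row)
    return grid


# Two spots on a 1 x 4 strip: the left half takes score 1.0, the right half 2.0.
pixel_scores = voronoi_scores([(0, 0), (0, 3)], [1.0, 2.0], height=1, width=4)
```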

load_metagene(result_path=None)[source]

Load existing metagenes for transfer learning.

Parameters:

result_path (Optional[str]) – Path to the results directory containing metagene files. If None, uses the current results_path. Both 'metagene_NMF.csv' and 'metagene.csv' are loaded from this directory.

Notes

This method should be called before running get_NMF_score() to enable transfer learning. The loaded metagene_NMF will be used to calculate NMF scores via linear regression instead of performing standard NMF decomposition.
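
The call order described in the note can be wrapped in a small helper. The function below is a hypothetical convenience wrapper, not part of SpaHDmap; it only uses the two methods and keyword arguments documented on this page and expects an already constructed Mapper.

```python
def score_with_existing_metagenes(mapper, source_results_path=None, save_score=False):
    """Load previously saved metagenes, then compute NMF scores from them."""
    # Loads 'metagene_NMF.csv' and 'metagene.csv' from the given directory,
    # or from the mapper's own results_path when None is passed.
    mapper.load_metagene(result_path=source_results_path)
    # With metagenes loaded, this step is documented to use linear regression
    # instead of a fresh NMF decomposition.
    mapper.get_NMF_score(save_score=save_score)
    return mapper
```

Calling `get_NMF_score()` without a preceding `load_metagene()` would instead run the standard NMF decomposition.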

pretrain(save_model=True, load_model=True)[source]

Pre-train the SpaHDmap model based on the image prediction.

Parameters:
  • save_model (bool) – Whether to save the model or not.

  • load_model (bool) – Whether to load the model or not.

property pretrain_path: str

Get the pretrained model path.

recovery(gene, section=None, use_score='SpaHDmap')[source]

Recover gene expression and store in section.X dictionary.

Parameters:
  • gene (Union[str, List[str], None]) – Gene name(s) to recover expression for. Can be a single string or a list of strings.

  • section (Union[str, STData, List[Union[str, STData]], None]) – Section(s) to recover gene expression for. If None, uses all sections.

  • use_score (str) – Score type to use for gene expression recovery.
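
In an NMF-style factorization, expression is approximated as the product of a score matrix and a metagene matrix, so recovering a gene amounts to combining each pixel's scores with that gene's metagene loadings. The sketch below shows this arithmetic under that assumption; the function, the matrix layouts, and the gene names are hypothetical, and any extra normalization SpaHDmap may apply is not reproduced.

```python
def recover_expression(scores, metagene, gene_names, genes):
    """Approximate expression as scores (n_pixels x rank) times metagene (rank x n_genes)."""
    if isinstance(genes, str):
        genes = [genes]  # accept a single name or a list, like the documented parameter
    recovered = {}
    for gene in genes:
        col = gene_names.index(gene)
        recovered[gene] = [
            sum(s * loadings[col] for s, loadings in zip(pixel, metagene))
            for pixel in scores
        ]
    return recovered


# Toy example with rank 2 and two hypothetical genes.
scores = [[1.0, 0.0], [0.5, 0.5]]
metagene = [[2.0, 0.0], [0.0, 4.0]]
expr = recover_expression(scores, metagene, ["GeneA", "GeneB"], "GeneB")
```

The result is keyed by gene name, which loosely mirrors the per-gene dictionary the method is documented to store in section.X.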

run_SpaHDmap(save_score=False, save_model=True, load_model=True, visualize=True, format='png', repeat_times=1, use_ann=False, **kwargs)[source]

Run the complete SpaHDmap pipeline.

Parameters:
  • save_score (bool) – Whether to save computed scores as numpy arrays.

  • save_model (bool) – Whether to save model checkpoints.

  • load_model (bool) – Whether to load existing model checkpoints if available.

  • visualize (bool) – Whether to generate and save visualizations.

  • format (str) – Output format for visualizations ('jpg', 'png', 'pdf').

  • repeat_times (int) – Number of times to repeat the pipeline with different random initializations.

  • use_ann (bool) – Whether to use ANN for constructing the adjacency matrix or not.

  • **kwargs – Additional arguments for ANN (num_tree and n_jobs).
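
For finer control, the individual methods on this page can be called one at a time instead of through run_SpaHDmap. The sketch below shows one plausible ordering; the order is an assumption inferred from the method descriptions (for example, get_VD_score defaults to use_score='GCN', so the GCN score is computed first) and is not confirmed by this page. The helper and the step list are hypothetical.

```python
# Assumed step order: (method name, keyword arguments).
PIPELINE_STEPS = [
    ("get_NMF_score", {}),
    ("pretrain", {}),
    ("get_GCN_score", {}),
    ("get_VD_score", {"use_score": "GCN"}),
    ("train", {}),
    ("get_SpaHDmap_score", {}),
]


def run_step_by_step(mapper, steps=PIPELINE_STEPS):
    """Call each documented Mapper method in the assumed order."""
    for name, kwargs in steps:
        getattr(mapper, name)(**kwargs)
    return mapper
```

Running the steps separately makes it possible to inspect intermediate scores, for example with visualize(use_score='NMF'), before training.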

train(save_model=True, load_model=True)[source]

Train the SpaHDmap model based on the image prediction and spot expression reconstruction.

Parameters:
  • save_model (bool) – Whether to save the model or not.

  • load_model (bool) – Whether to load the model or not.

property train_path: str

Get the trained model path.

visualize(section=None, use_score='SpaHDmap', target='score', gene=None, index=None, format='png', crop=True, show=True)[source]

Visualize scores, clustering results, or gene expression.

Parameters:
  • section (Union[str, STData, List[Union[str, STData]], None]) – The section(s) to visualize. If None, uses all sections.

  • use_score (str) – The type of score to visualize (e.g., 'NMF', 'GCN', 'SpaHDmap').

  • target (str) – What to visualize: 'score', 'cluster', or 'gene'.

  • gene (Optional[str]) – Gene name to visualize when target='gene'.

  • index (Optional[int]) – For score visualization only: the index of the embedding to show.

  • format (str) – Output format ('jpg', 'png', 'pdf').

  • crop (bool) – Whether to crop to mask region. If False, save full image size.

  • show (bool) – Whether to display the plot using plt.show().
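
The three targets are typically used after the corresponding results exist. The helper below is a hypothetical wrapper showing one way to combine cluster, recovery, and visualize with the parameters documented on this page. It assumes that scores have already been computed and that recovery must be called before visualizing a gene, which this page does not state explicitly.

```python
def plot_overview(mapper, gene_name, fmt="png"):
    """Save a score map, a clustering, and a recovered-gene map without opening windows."""
    # Show the first embedding of the SpaHDmap score.
    mapper.visualize(use_score="SpaHDmap", target="score", index=0, format=fmt, show=False)
    # Cluster on the same score; cluster() takes its own format/show arguments.
    mapper.cluster(use_score="SpaHDmap", resolution=0.8, format=fmt, show=False)
    # Recover the gene first (assumed prerequisite), then plot it.
    mapper.recovery(gene=gene_name, use_score="SpaHDmap")
    mapper.visualize(use_score="SpaHDmap", target="gene", gene=gene_name, format=fmt, show=False)
    return mapper
```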