API reference #

Returns:

Nothing

stampede.pp#

preprocessing functions

stampede.pp.binarize(adata, verbose=True)#

Binarize the values in adata.X

Parameters:

adata (AnnData) – adata object
verbose (bool) – provide written feedback

Return type:

None

Returns:

Nothing, updates adata.layers and adata.X
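The operation can be illustrated outside the library with plain NumPy. This is a minimal sketch, assuming that binarizing means mapping every nonzero count to 1; the helper `binarize_counts` and the array `counts` are hypothetical and not part of stampede.

```python
import numpy as np

def binarize_counts(counts):
    """Return a 0/1 matrix marking which entries of `counts` are nonzero."""
    # Assumption: a gene counts as "detected" in a cell when its count is > 0.
    return (np.asarray(counts) > 0).astype(np.int8)

# Toy matrix: 2 cells x 3 genes.
counts = np.array([[0, 3, 1],
                   [2, 0, 0]])
binary = binarize_counts(counts)
```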

stampede.pp.cell_qc_postfilter(adata)#

Compute metadata after filtering

Parameters:

adata (AnnData) – an adata object

Return type:

None

Returns:

Nothing, updates adata.obs

stampede.pp.combine_obs_columns(adata, columns, column_name, delim='_')#

Create a new column in adata.obs by combining all columns with the delimiter.

Parameters:

adata (AnnData) – an adata object
columns (list) – a list of columns in adata.obs to combine
column_name (str) – the name for the new column
delim (str) – the delimiter to use while joining the columns

Returns:

Nothing, updates adata.obs
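The joining step can be sketched with pandas alone. This is an illustration under assumptions, not the library's implementation: `obs` is a hypothetical stand-in for adata.obs, and the new column name is chosen for the example.

```python
import pandas as pd

# Hypothetical stand-in for adata.obs.
obs = pd.DataFrame({
    "slide": [1, 1, 2],
    "condition": ["ctrl", "treated", "ctrl"],
})

columns = ["slide", "condition"]
delim = "_"

# Cast every column to string, then join the values row by row.
obs["slide_condition"] = obs[columns].astype(str).agg(delim.join, axis=1)
```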

stampede.pp.detection_rates(adata, column, normalize=True)#

Calculate gene detection rates per group in the specified column of adata.obs.

Parameters:

adata (AnnData) – adata object
column (str) – column in adata.obs with groups to compare
normalize (bool) – normalize detection rates for sample quality

Return type:

DataFrame

Returns:

a dataframe with normalized gene detection rates
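As a rough illustration of what a detection rate is, the sketch below computes the unnormalized fraction of cells per group in which each gene is detected. The normalization for sample quality is not reproduced here, and `binary`, `groups` and `rates` are hypothetical names.

```python
import numpy as np

# Toy binary matrix: 4 cells x 2 genes, with one group label per cell.
binary = np.array([[1, 0],
                   [1, 1],
                   [0, 1],
                   [0, 1]])
groups = np.array(["a", "a", "b", "b"])

# Per group: fraction of cells in which each gene is detected.
rates = {str(g): binary[groups == g].mean(axis=0) for g in np.unique(groups)}
```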

stampede.pp.dim_red(adata, n_dims=50, use_genes=None, key_added='X_svd', random_state=42)#

Dimensionality reduction using Term Frequency Latent Semantic Indexing.

Parameters:

adata (AnnData) – adata object
n_dims (int) – number of dimensions to produce
use_genes (str) – Boolean column in adata.var with True for genes to be used in dimensionality reduction.
key_added (str) – key in adata.obsm for function output
random_state (int) – random seed value

Return type:

None

Returns:

Nothing, updates adata.obsm and adata.uns
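The general technique (a TF-IDF transform followed by a truncated SVD) can be sketched in NumPy. This is not stampede's implementation: the IDF formula is one common variant, a deterministic full SVD is swapped in where a randomized solver might be used, and `tfidf_lsi` is a hypothetical helper.

```python
import numpy as np

def tfidf_lsi(binary, n_dims=2):
    """TF-IDF transform followed by a truncated SVD (latent semantic indexing)."""
    binary = np.asarray(binary, dtype=float)
    # Term frequency: scale each cell (row) by its number of detected genes.
    tf = binary / np.maximum(binary.sum(axis=1, keepdims=True), 1)
    # Inverse document frequency: down-weight genes detected in many cells.
    n_cells = binary.shape[0]
    idf = np.log1p(n_cells / np.maximum(binary.sum(axis=0), 1))
    tfidf = tf * idf
    # Truncated SVD: keep only the n_dims leading components.
    u, s, _ = np.linalg.svd(tfidf, full_matrices=False)
    return u[:, :n_dims] * s[:n_dims], s[:n_dims]

# Toy input: 20 cells x 10 genes with random detections.
rng = np.random.default_rng(42)
binary = (rng.random((20, 10)) < 0.3).astype(int)
embedding, singular_values = tfidf_lsi(binary, n_dims=3)
```

The singular values returned alongside the embedding are the kind of quantity a scree plot would display.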

stampede.pp.filter_cells(adata, falsecode_max=5, negprobe_max=3, ntranscript_min=0, ntranscript_max=inf, area_min=25, area_max=100, filter_columns=None, filter_internalqc=False, verbose=True)#

Filter adata.obs by a set of qc_params.

Parameters:

adata (AnnData) – adata object
falsecode_max (int) – maximum number of false codes the cell may have
negprobe_max (int) – maximum number of negative probes the cell may have
ntranscript_min (int) – minimum number of transcripts the cell must have
ntranscript_max (int) – maximum number of transcripts the cell may have
area_min (int) – minimum area (in pixels) the cell must have
area_max (int) – maximum area (in pixels) the cell may have
filter_columns (list) – a list of additional columns to filter by. Columns must be (convertible to) boolean; cells with False values are removed.
filter_internalqc (bool) – filter by columns qcCellsPassed and qcFlagsFOV.
verbose (bool) – provide written feedback

Return type:

AnnData

Returns:

the filtered adata object
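The threshold logic amounts to a boolean mask over the per-cell QC columns. The sketch below uses the default thresholds from the signature; the column names are illustrative rather than the ones stampede reads, and inclusive bounds are an assumption.

```python
import numpy as np
import pandas as pd

# Hypothetical QC table; the column names are illustrative, not stampede's.
obs = pd.DataFrame({
    "n_falsecode": [0, 7, 1, 2],
    "n_negprobe": [1, 0, 5, 0],
    "n_transcripts": [120, 300, 80, 10],
    "area": [50, 60, 40, 120],
})

# Assumption: all bounds are inclusive.
keep = (
    (obs["n_falsecode"] <= 5)                   # falsecode_max
    & (obs["n_negprobe"] <= 3)                  # negprobe_max
    & obs["n_transcripts"].between(0, np.inf)   # ntranscript_min / ntranscript_max
    & obs["area"].between(25, 100)              # area_min / area_max
)
filtered = obs[keep]
```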

stampede.pp.filter_edges(adata, all_edges=0, left=0, top=0, right=0, bottom=0, slide=None, verbose=True)#

Filter cells based on their distance to one or more edges of their FOV. If both all_edges and an edge-specific value are given, the larger distance is used for that edge.

Parameters:

adata (AnnData) – adata object
all_edges (int) – minimum distance from any edge in pixels
left (int) – minimum distance from the left edge in pixels (x = xmin + left)
top (int) – minimum distance from the top edge in pixels (y = ymin + top)
right (int) – minimum distance from the right edge in pixels (x = xmax - right)
bottom (int) – minimum distance from the bottom edge in pixels (y = ymax - bottom)
slide (int) – which slide to filter (default: all)
verbose (bool) – provide written feedback

Returns:

the filtered adata object
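The margin arithmetic given in the parameter descriptions can be sketched as follows. This is an illustration under assumptions: `inside_margins` is a hypothetical helper, the bounds are treated as inclusive, and the FOV extent is passed in explicitly instead of being read from adata.

```python
import numpy as np

def inside_margins(x, y, xmin, xmax, ymin, ymax,
                   all_edges=0, left=0, top=0, right=0, bottom=0):
    """Boolean mask of cells that lie at least the given margin from each FOV edge."""
    # Per edge, use the larger of the general and the edge-specific margin.
    left, top = max(all_edges, left), max(all_edges, top)
    right, bottom = max(all_edges, right), max(all_edges, bottom)
    x, y = np.asarray(x), np.asarray(y)
    return ((x >= xmin + left) & (x <= xmax - right)
            & (y >= ymin + top) & (y <= ymax - bottom))

# Three cells in a 100 x 100 pixel FOV; 10 px margin everywhere, 20 px on the right.
mask = inside_margins(x=[5, 50, 95], y=[50, 50, 50],
                      xmin=0, xmax=100, ymin=0, ymax=100,
                      all_edges=10, right=20)
```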

stampede.pp.filter_genes(adata, ncell_min=0, ncell_max=inf, ntranscript_min=0, ntranscript_max=inf, filter_columns=None, verbose=True)#

Filter adata.var by a set of qc_params.

Parameters:

adata (AnnData) – adata object
ncell_min (int) – minimum number of cells the gene is found in.
ncell_max (int) – maximum number of cells the gene is found in.
ntranscript_min (int) – minimum number of transcripts the gene must have.
ntranscript_max (int) – maximum number of transcripts the gene may have.
filter_columns (str | list) – a column or list of additional columns to filter by. Columns must be (convertible to) boolean; genes with False values are removed.
verbose (bool) – provide written feedback

Return type:

AnnData

Returns:

the filtered adata object

stampede.pp.gene_qc(adata, mult=1, noise_threshold=None, overwrite=True)#

Add QC parameters to adata.var.

About the Signal-to-noise filter:

Approach from https://doi.org/10.1038/s41467-025-64990-y Wang et al. “Systematic benchmarking of imaging spatial transcriptomics platforms in FFPE tissues” Nat Com, 2025.

Calculate the mean expression and standard deviation (STD) of the negative control probes. Flag genes with average expression < mean + mult × STD of the control probes.

The paper used mult=2.

Parameters:

adata (AnnData) – an adata object
noise_threshold (float | Iterable) – manually specify the minimum mean_Transcript threshold. If None, use the signal-to-noise filter described above.
mult (int | float) – if noise_threshold is None, mult is used in the noise threshold computation specified above.
overwrite (bool) – overwrite existing qc columns

Return type:

None

Returns:

Nothing, updates adata.var
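The signal-to-noise rule can be written out as a short worked example. It assumes the mean and standard deviation are taken over the per-probe mean expression of the negative control probes, using the population standard deviation; `snr_threshold` and the toy values are hypothetical.

```python
import numpy as np

def snr_threshold(neg_probe_means, mult=1):
    """Noise threshold: mean + mult x STD of the negative control probes."""
    neg = np.asarray(neg_probe_means, dtype=float)
    # Assumption: population standard deviation (ddof=0).
    return neg.mean() + mult * neg.std()

neg_probe_means = [0.01, 0.02, 0.03]       # mean expression per negative probe
gene_means = np.array([0.015, 0.05, 0.2])  # mean expression per gene

threshold = snr_threshold(neg_probe_means, mult=2)
below_noise = gene_means < threshold       # genes that would be flagged
```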

stampede.pp.gene_qc_postfilter(adata)#

Compute metadata after filtering

Parameters:

adata (AnnData) – an adata object

Return type:

None

Returns:

Nothing, updates adata.var

stampede.pp.knn_count_smoothing(adata, layer='binary', layer_added=None, neighbors_key='neighbors', verbose=True)#

For each cell, replace its gene vector with the average of its KNN neighborhood.

Runs sc.pp.neighbors if it has not been run yet. See https://scanpy.readthedocs.io/en/stable/api/generated/scanpy.pp.neighbors.html

Parameters:

adata (AnnData) – adata object
layer (str) – name of the adata layer to use for smoothing
layer_added (str) – key in adata.layers for function output (default: “KNN_binary_mean”)
neighbors_key (str) – See sc.pp.neighbors for details
verbose (bool) – provide written feedback

Return type:

None

Returns:

Nothing, updates adata.layers and adata.X
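The averaging step can be sketched with a brute-force neighbor search in NumPy. This swaps the scanpy neighbor graph for exact Euclidean neighbors and assumes each cell is part of its own neighborhood; `knn_mean_smooth` is a hypothetical helper, not the library's code.

```python
import numpy as np

def knn_mean_smooth(values, embedding, k=2):
    """Replace each row of `values` by the mean over its k nearest rows (self included)."""
    values = np.asarray(values, dtype=float)
    embedding = np.asarray(embedding, dtype=float)
    # Pairwise Euclidean distances between cells in the embedding.
    diff = embedding[:, None, :] - embedding[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Indices of the k closest cells per row; each cell is its own nearest neighbor.
    neighbors = np.argsort(dist, axis=1)[:, :k]
    return values[neighbors].mean(axis=1)

# Four cells forming two well-separated pairs in a 1-D embedding.
embedding = [[0.0], [1.0], [10.0], [11.0]]
values = [[1, 0], [0, 0], [1, 1], [1, 0]]
smoothed = knn_mean_smooth(values, embedding, k=2)
```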

stampede.pp.pseudobulk(adata, column, layer='binary')#

Generate a pseudobulk table (genes x samples) by summing the values of the given layer over the cells of each group in the specified column of adata.obs.

Parameters:

adata (AnnData) – adata object
column (str) – column in adata.obs with groups to compare
layer (str) – name of the adata layer to aggregate

Return type:

DataFrame

Returns:

a dataframe with summed layer values per sample
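The aggregation can be pictured as a group-wise sum followed by a transpose. This is a pandas sketch under assumptions, with hypothetical toy data in place of an adata layer and an obs column.

```python
import numpy as np
import pandas as pd

# Toy layer: 3 cells x 3 genes, with one sample label per cell.
layer = np.array([[1, 0, 1],
                  [1, 1, 0],
                  [0, 1, 1]])
genes = ["geneA", "geneB", "geneC"]
samples = np.array(["s1", "s1", "s2"])

cells = pd.DataFrame(layer, columns=genes)
# Sum the layer values per sample, then transpose to genes x samples.
pseudobulk = cells.groupby(samples).sum().T
```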

stampede.pp.slide_qc(adata, slides, add_cols=None, data_dir=None)#

Use the fov_positions file to create a dataframe with metadata columns per slide and FOV, and store this in adata.uns[“fov_metadata”]. Additionally adds columns to adata.obs reflecting the distance from the cell to the edge of the camera’s FOV.

Parameters:

adata (AnnData) – adata object generated using the slides dict
slides (dict) – a dictionary with the slide number as keys, and a dictionary as values. The value dict must contain the keys “exprmat” and “metadata”, which should map to the respective files
add_cols (Iterable | str) – additional columns to visualize (e.g. conditions)
data_dir (str) – optional filepath prefix

Return type:

None

Returns:

Nothing, updates adata.uns and adata.obs

stampede.pl#

plotting functions

stampede.pl.avg_per_pixel(adata, column, fill_cell_area=True, normalize_cell_area=True, log1p=False, cmap='gist_rainbow', background_color='black', figsize=(20, 15), subplot_kwargs=None, plot_kwargs=None)#

Plot the average values of the given column over all FOVs. Colors only the cell’s center pixel, unless fill_cell_area is set to True (slow).

Parameters:

adata (AnnData) – an adata object
column (str) – a column in adata.obs with numeric values
fill_cell_area (bool) – distribute the column value over all pixels covered by the cell, assuming square cells
normalize_cell_area (bool) – if fill_cell_area is True, normalize the column value over the cell area
log1p (bool) – apply a log1p transformation to the final values per pixel
cmap (tuple[float, float, float] | str | tuple[float, float, float, float] | tuple[tuple[float, float, float] | str, float] | tuple[tuple[float, float, float, float], float]) – colormap (default: “gist_rainbow”)
background_color (tuple[float, float, float] | str | tuple[float, float, float, float] | tuple[tuple[float, float, float] | str, float] | tuple[tuple[float, float, float, float], float]) – color for pixels with 0 values (default: “black”)
figsize (tuple) – figure size
subplot_kwargs (dict) – kwargs passed to plt.subplots
plot_kwargs (dict) – kwargs passed to the main plotting function

Return type:

Returns:

matplotlib figure and array of axes

stampede.pl.column_distribution(adata, column, axis=None, min_quantile=0.0, max_quantile=0.95, subplot_kwargs=None, plot_kwargs=None)#

Plot the distribution of values for a column present in either adata.obs or adata.var.

Parameters:

adata (AnnData) – an adata object.
column (str) – a column in either adata.obs or adata.var
axis (int) – specify if the column name is present in both obs (0) and var (1).
min_quantile (float) – lowest quantile of values to plot
max_quantile (float) – highest quantile of values to plot
subplot_kwargs (dict) – kwargs passed to plt.subplots
plot_kwargs (dict) – kwargs passed to the main plotting function

Return type:

Returns:

matplotlib figure and array of axes
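The effect of min_quantile and max_quantile can be illustrated with NumPy. The sketch assumes values outside the quantile range are simply left out of the plot; how the library handles the boundaries is not stated here.

```python
import numpy as np

values = np.arange(1, 101)  # toy column with the values 1..100
min_quantile, max_quantile = 0.0, 0.95

# Translate the quantiles into value bounds, then keep what falls in between.
lower, upper = np.quantile(values, [min_quantile, max_quantile])
shown = values[(values >= lower) & (values <= upper)]
```

With these defaults the top 5% of values is left out, which keeps a few extreme outliers from stretching the axis.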

stampede.pl.correlations(adata, xcolumn, ycolumn, log1p_xcolumn=False, log1p_ycolumn=False, color_xcolumn=None, color_ycolumn=None, cmap_2d='Blues', min_quantile=0.0, max_quantile=0.99, bins_1d='auto', bins_2d='auto', stat='percent', figsize=(8, 7), subplot_kwargs=None, plot_kwargs=None)#

Plot the distributions and 2D correlation between two columns in adata.obs.

Parameters:

adata (AnnData) – an adata object
xcolumn (str) – column in adata.obs to plot on the x-axis
ycolumn (str) – column in adata.obs to plot on the y-axis
log1p_xcolumn (bool) – apply a log1p transformation to the xcolumn
log1p_ycolumn (bool) – apply a log1p transformation to the ycolumn
color_xcolumn (tuple[float, float, float] | str | tuple[float, float, float, float] | tuple[tuple[float, float, float] | str, float] | tuple[tuple[float, float, float, float], float]) – color of the xcolumn plot
color_ycolumn (tuple[float, float, float] | str | tuple[float, float, float, float] | tuple[tuple[float, float, float] | str, float] | tuple[tuple[float, float, float, float], float]) – color of the ycolumn plot
cmap_2d (tuple[float, float, float] | str | tuple[float, float, float, float] | tuple[tuple[float, float, float] | str, float] | tuple[tuple[float, float, float, float], float]) – colormap of the 2d correlation plot
min_quantile (float) – lowest quantile of values to plot
max_quantile (float) – highest quantile of values to plot
bins_1d (str | int) – number of bins on the 1-dimensional histogram plots
bins_2d (str | int) – number of bins on the 2-dimensional histogram plot
stat (str) – which statistic to plot, see sns.histplot for more details
figsize (tuple) – figure size
subplot_kwargs (dict) – kwargs passed to plt.subplots
plot_kwargs (dict) – kwargs passed to the main plotting function

Return type:

Returns:

matplotlib figure and array of axes

stampede.pl.dim_red(adata, columns, obsm_key='X_svd', cmap='tab10', n_dims=6, subset_size=2000, random_state=42)#

Grid plot visualizing a range of reduced dimensions.

Parameters:

adata (AnnData) – adata object
columns (str | Iterable) – one or more columns in adata.obs to plot. One multiplot per column
obsm_key (str) – key in adata.obsm with dim_red output
cmap (tuple[float, float, float] | str | tuple[float, float, float, float] | tuple[tuple[float, float, float] | str, float] | tuple[tuple[float, float, float, float], float]) – colormap
n_dims (int | tuple) – number of dimensions to plot, or a tuple with dimensions
subset_size (int) – subsample the data to this number (per column)
random_state (int) – random seed value

Return type:

list[tuple[Figure, Axes]]

Returns:

a list of tuples with matplotlib figure and axis

stampede.pl.ncell_per_condition(adata, columns, offset_between_conditions=1, palette='terrain', subplot_kwargs=None, plot_kwargs=None, text_kwargs=None)#

Plot the number of cells per condition in a column in adata.obs.

Parameters:

adata (AnnData) – an adata object
columns (str | list) – one or more columns in adata.obs to visualize, in order of significance
offset_between_conditions (int | list) – distance between different conditions. Can be a single value, or a list of offset values for each column (length=len(columns)-1)
palette (tuple[float, float, float] | str | tuple[float, float, float, float] | tuple[tuple[float, float, float] | str, float] | tuple[tuple[float, float, float, float], float] | dict[str, str]) – color palette (default: “terrain”)
subplot_kwargs (dict) – kwargs passed to plt.subplots
plot_kwargs (dict) – kwargs passed to the main plotting function
text_kwargs (dict) – kwargs passed to ax.set_xticks and ax.set_yticks

Return type:

Returns:

matplotlib figure and array of axes

stampede.pl.noise_threshold(adata, bins=50, **kwargs)#

Parameters:

adata (AnnData)
bins (int)

stampede.pl.paired_binomial_glm_volcano(df, symbol_column='index', or_column='odds_ratio', pvalue_column='padj', separation_column='perfect_separation', pval_thresh=0.05, l2or_thresh=0.75, to_label=5, drop_perfect_separation=True, subplot_kwargs=None, plot_kwargs=None, text_kwargs=None)#

Generate a volcano plot from the detection_rates results dataframe.

Parameters:

df (DataFrame) – a dataframe
symbol_column (str) – column name of gene IDs to use
or_column (str) – column name of odds ratios
pvalue_column (str) – column name of the adjusted p values to be converted to -log10 p-values
separation_column (str) – boolean column denoting perfect separations
pval_thresh (float) – threshold on pvalue_column for genes to be considered significant
l2or_thresh (float) – threshold for the log2 odds ratios to be considered significant
to_label (int | list | None) – the number of top genes (down and up each) to be labeled
drop_perfect_separation (bool) – whether to drop the genes with perfect separations
subplot_kwargs (dict) – kwargs passed to plt.subplots
plot_kwargs (dict) – kwargs passed to the main plotting function
text_kwargs (dict) – kwargs passed to ax.text

Return type:

Returns:

matplotlib figure and axis object

stampede.pl.pydeseq2_volcano(df, symbol_column='index', log2fc_column='log2FoldChange', pvalue_column='padj', basemean_column='baseMean', pval_thresh=0.05, log2fc_thresh=0.75, to_label=5, subplot_kwargs=None, plot_kwargs=None, text_kwargs=None)#

Generate a volcano plot from a pyDESeq2 results dataframe.

Adapted from mousepixels/sanbomics

Parameters:

df (DataFrame) – a pyDESeq2 results dataframe
symbol_column (str) – column name of gene IDs to use
log2fc_column (str) – column name of log2 Fold-Change values
pvalue_column (str) – column name of the adjusted p values to be converted to -log10 p-values
basemean_column (str) – column name of base mean values for each gene
pval_thresh (float) – threshold on pvalue_column for points to be considered significant
log2fc_thresh (float) – threshold for the absolute value of the log2 fold change to be considered significant
to_label (int | list | None) – if an int is passed, that number of top down and up genes will be labeled. If a list of gene IDs is passed, only those will be labeled
subplot_kwargs (dict) – kwargs passed to plt.subplots
plot_kwargs (dict) – kwargs passed to the main plotting function
text_kwargs (dict) – kwargs passed to ax.text

Return type:

Returns:

matplotlib figure and axis object
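How the two thresholds partition the points of a volcano plot can be sketched without any plotting. The column names follow the defaults in the signature; the toy values and the strict comparisons are assumptions.

```python
import numpy as np
import pandas as pd

# Toy results table using the default column names.
df = pd.DataFrame(
    {"log2FoldChange": [2.0, -1.5, 0.2, 1.0],
     "padj": [0.001, 0.01, 0.0001, 0.2]},
    index=["g1", "g2", "g3", "g4"],
)
pval_thresh, log2fc_thresh = 0.05, 0.75

# y-axis of the volcano plot: -log10 of the adjusted p-value.
neg_log10_p = -np.log10(df["padj"])

# A point is significant only if it passes both thresholds.
significant = (df["padj"] < pval_thresh) & (df["log2FoldChange"].abs() > log2fc_thresh)
direction = np.where(~significant, "ns",
                     np.where(df["log2FoldChange"] > 0, "up", "down"))
```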

stampede.pl.scree(adata, obsm_key='X_svd')#

Scree plot

Parameters:

adata (AnnData) – adata object
obsm_key (str) – key in adata.obsm with dim_red output

Return type:

Returns:

matplotlib figure and array of axes

stampede.pl.sketch(adata, obs_column='subset', use_rep='X_svd', plot_kwargs=None)#

Scatterplot highlighting the cells that were sampled. Requires the full adata object.

Parameters:

adata (AnnData) – adata object
obs_column (str) – column in adata.obs with boolean values if the cell is kept
use_rep (str) – use the indicated representation
plot_kwargs (dict) – kwargs passed to the main plotting function

Return type:

Returns:

matplotlib figure and array of axes

stampede.pl.slide_qc(adata, columns=None, figsize=None, subplot_kwargs=None, plot_kwargs=None, legend_kwargs=None)#

Plot the values from one or more QC columns in adata.uns[“fov_metadata”] (added by stampede.pp.slide_qc()). Specify columns to limit the number of plots.

Parameters:

adata (AnnData) – an adata object
columns (str | Iterable) – columns in adata.uns[“fov_metadata”] to plot (default: all)
figsize (tuple) – figure size as a tuple, will be multiplied by the number of plots
subplot_kwargs (dict) – kwargs passed to plt.subplots
plot_kwargs (dict) – kwargs passed to the main plotting function
legend_kwargs (dict) – kwargs passed to the legends

Return type:

Returns:

matplotlib figure and array of axes

stampede.pl.value_distribution(adata, layer=None, min_quantile=0.0, max_quantile=0.95, subplot_kwargs=None, plot_kwargs=None)#

Plot the number of occurrences of values in the dataset.

Parameters:

adata (AnnData) – an adata object.
layer (str) – the layer the values are drawn from (default: X)
min_quantile (float) – lowest quantile of values to plot
max_quantile (float) – highest quantile of values to plot
subplot_kwargs (dict) – kwargs passed to plt.subplots
plot_kwargs (dict) – kwargs passed to the main plotting function

Return type:

Returns:

matplotlib figure and array of axes

stampede.pl.violin(adata, columns, inner='quart', fill=False, cut=0, log_scale=False, figsize=None, subplot_kwargs=None, plot_kwargs=None)#

Violin plots for one or more columns in adata.obs.

Wraps seaborn’s violinplot. See https://seaborn.pydata.org/generated/seaborn.violinplot.html

Parameters:

adata (AnnData) – an adata object
columns (str | list) – one or more columns in adata.obs
inner (str) – See sns.violinplot for more details
fill (bool) – See sns.violinplot for more details
cut (int) – See sns.violinplot for more details
log_scale (bool | Sequence[bool]) – See sns.violinplot for more details
figsize (tuple) – figure size as a tuple, will be multiplied by the number of plots
subplot_kwargs (dict) – kwargs passed to plt.subplots
plot_kwargs (dict) – kwargs passed to the main plotting function

Return type: