sparank.data.find_sc_markers
- sparank.data.find_sc_markers(adata, celltype_key, batch_key=None, layer='log1p', deg_method='wilcoxon', log2fc_min=0.5, pval_cutoff=0.01, n_top_markers=200, pct_diff=None, pct_min=0.1)
Batch-aware marker gene detection using scanpy’s rank_genes_groups.
When batch_key is given, differential expression is run independently in each batch and the union of per-batch marker sets is returned.
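The union step in batch-aware mode can be pictured with a minimal sketch. This is not sparank's implementation: `union_markers` and the per-batch gene lists are hypothetical, and the real function derives each per-batch list from differential expression rather than taking it as input.

```python
def union_markers(per_batch_markers):
    """Return the sorted union of marker genes across batches.

    per_batch_markers: hypothetical dict mapping batch name -> list of genes.
    """
    merged = set()
    for genes in per_batch_markers.values():
        merged.update(genes)  # duplicates across batches collapse here
    return sorted(merged)


# Made-up per-batch marker lists, for illustration only.
per_batch = {
    "batch_a": ["CD3E", "MS4A1", "NKG7"],
    "batch_b": ["CD3E", "LYZ"],
}
markers = union_markers(per_batch)
print(markers)
```

A gene found in only one batch is still kept, which is what distinguishes a union from an intersection of the per-batch sets.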
- Parameters:
adata (AnnData) – Annotated single-cell reference dataset.
celltype_key (str) – Column in adata.obs storing cell-type labels.
batch_key (str, optional) – Optional column for batch-aware DE. None indicates global mode.
layer (str, default "log1p") – Layer in adata used as expression input.
deg_method (str, default "wilcoxon") – Statistical method forwarded to sc.tl.rank_genes_groups.
log2fc_min (float, default 0.5) – Minimum log2 fold-change threshold.
pval_cutoff (float, default 0.01) – Adjusted p-value cutoff.
n_top_markers (int, default 200) – Maximum number of markers retained per cell type per batch.
pct_diff (float, optional) – If set, additionally filter by (pct_group − pct_rest) > pct_diff.
pct_min (float, default 0.1) – Minimum fraction of cells in the group expressing the gene.
- Returns:
Sorted array of unique marker gene names across all batches.
- Return type:
np.ndarray
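How the threshold parameters might combine for a single cell type can be sketched in plain Python. Everything here is an assumption made for illustration: `filter_markers`, the row layout, the inclusive/strict comparison choices (other than the documented `(pct_group − pct_rest) > pct_diff`), and the ranking by log2 fold-change before truncation to `n_top_markers`. The library's actual ordering and comparisons may differ.

```python
def filter_markers(rows, log2fc_min=0.5, pval_cutoff=0.01,
                   n_top_markers=200, pct_diff=None, pct_min=0.1):
    """Filter one cell type's DE rows with the documented thresholds.

    rows: hypothetical list of dicts with keys
    'gene', 'log2fc', 'pval_adj', 'pct_group', 'pct_rest'.
    """
    kept = []
    for r in rows:
        if r["log2fc"] < log2fc_min:        # assumed inclusive lower bound
            continue
        if r["pval_adj"] >= pval_cutoff:    # assumed strict cutoff
            continue
        if r["pct_group"] < pct_min:        # assumed inclusive lower bound
            continue
        if pct_diff is not None and not (r["pct_group"] - r["pct_rest"] > pct_diff):
            continue
        kept.append(r)
    # Assumed ranking: largest log2 fold-change first, then truncate.
    kept.sort(key=lambda r: r["log2fc"], reverse=True)
    return [r["gene"] for r in kept[:n_top_markers]]


# Made-up DE statistics; GENE_X is a placeholder name.
rows = [
    {"gene": "CD3E", "log2fc": 2.0, "pval_adj": 1e-5, "pct_group": 0.8, "pct_rest": 0.1},
    {"gene": "ACTB", "log2fc": 0.2, "pval_adj": 1e-8, "pct_group": 0.9, "pct_rest": 0.9},
    {"gene": "GENE_X", "log2fc": 3.0, "pval_adj": 1e-4, "pct_group": 0.05, "pct_rest": 0.0},
    {"gene": "IL7R", "log2fc": 1.0, "pval_adj": 0.001, "pct_group": 0.5, "pct_rest": 0.3},
]
default_selection = filter_markers(rows)
strict_selection = filter_markers(rows, pct_diff=0.25)
print(default_selection, strict_selection)
```

In this toy input, ACTB fails the fold-change threshold and GENE_X fails `pct_min`; setting `pct_diff=0.25` additionally drops IL7R, whose expressing fraction exceeds the rest by only about 0.2.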