sparank.data.find_sc_markers

sparank.data.find_sc_markers(adata, celltype_key, batch_key=None, layer='log1p', deg_method='wilcoxon', log2fc_min=0.5, pval_cutoff=0.01, n_top_markers=200, pct_diff=None, pct_min=0.1)

Batch-aware marker gene detection using scanpy’s rank_genes_groups.

When batch_key is given, differential expression is run independently in each batch and the union of per-batch marker sets is returned.
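The union step can be pictured with a minimal sketch. It does not call sparank or scanpy; the dictionary `markers_by_batch` and its gene lists are hypothetical stand-ins for the per-batch results, and `np.unique` is used here only because it yields a sorted, de-duplicated array like the documented return value.

```python
import numpy as np

# Hypothetical per-batch marker sets (illustrative values, not real output).
markers_by_batch = {
    "batch_a": ["CD3E", "MS4A1", "LYZ"],
    "batch_b": ["LYZ", "CD3E", "NKG7"],
}

# Concatenate the per-batch lists, then let np.unique sort and de-duplicate.
all_markers = np.unique(
    np.concatenate([np.asarray(genes) for genes in markers_by_batch.values()])
)
print(all_markers)
```

A gene that passes in only one batch (here `MS4A1` or `NKG7`) still appears in the result, which is what distinguishes a union from an intersection of per-batch sets.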

Parameters:
  • adata (AnnData) – Annotated single-cell reference dataset.

  • celltype_key (str) – Column in adata.obs storing cell-type labels.

  • batch_key (str, optional) – Column holding batch labels for batch-aware DE. If None, DE is run once over all cells (global mode).

  • layer (str, default "log1p") – Layer in adata used as expression input.

  • deg_method (str, default "wilcoxon") – Statistical method forwarded to sc.tl.rank_genes_groups.

  • log2fc_min (float, default 0.5) – Minimum log2 fold-change threshold.

  • pval_cutoff (float, default 0.01) – Adjusted p-value cutoff.

  • n_top_markers (int, default 200) – Maximum number of markers retained per cell type per batch.

  • pct_diff (float, optional) – If set, additionally filter by (pct_group - pct_rest) > pct_diff.

  • pct_min (float, default 0.1) – Minimum fraction of cells in the group expressing the gene.
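How the threshold parameters might combine for one cell type in one batch can be sketched as follows. This is an illustration under stated assumptions, not the library's implementation: the helper `filter_markers`, its array inputs, the choice of inclusive versus strict comparisons, and the ranking by descending log2 fold-change before truncating to `n_top_markers` are all hypothetical. Only the parameter names, defaults, and the `pct_diff` rule come from the documentation above.

```python
import numpy as np

def filter_markers(names, log2fc, padj, pct_group, pct_rest,
                   log2fc_min=0.5, pval_cutoff=0.01, n_top_markers=200,
                   pct_diff=None, pct_min=0.1):
    """Hypothetical helper: apply the documented thresholds to one DE result."""
    names = np.asarray(names)
    log2fc = np.asarray(log2fc, dtype=float)
    padj = np.asarray(padj, dtype=float)
    pct_group = np.asarray(pct_group, dtype=float)
    pct_rest = np.asarray(pct_rest, dtype=float)

    # Assumed comparisons: fold-change and pct_min inclusive, p-value strict.
    keep = (log2fc >= log2fc_min) & (padj < pval_cutoff) & (pct_group >= pct_min)
    if pct_diff is not None:
        # Documented rule: (pct_group - pct_rest) > pct_diff
        keep &= (pct_group - pct_rest) > pct_diff

    # Assumed ranking: largest log2 fold-change first, then keep the top n.
    idx = np.flatnonzero(keep)
    idx = idx[np.argsort(-log2fc[idx], kind="stable")]
    return names[idx[:n_top_markers]]

# Made-up DE statistics for five placeholder genes.
names = ["g1", "g2", "g3", "g4", "g5"]
log2fc = [2.0, 0.3, 1.0, 1.5, 3.0]
padj = [0.001, 0.001, 0.2, 0.005, 0.0001]
pct_group = [0.8, 0.9, 0.7, 0.05, 0.6]
pct_rest = [0.1, 0.2, 0.1, 0.01, 0.5]

print(filter_markers(names, log2fc, padj, pct_group, pct_rest))
print(filter_markers(names, log2fc, padj, pct_group, pct_rest, pct_diff=0.2))
```

In this toy input, `g2` fails the fold-change threshold, `g3` fails the p-value cutoff, and `g4` fails `pct_min`; setting `pct_diff=0.2` additionally removes `g5`, which is expressed almost as widely outside the group as inside it.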

Returns:

Sorted array of unique marker gene names across all batches.

Return type:

np.ndarray