The software implementation of the method described in "Deciphering more accurate cell-cell interactions by modeling cells and their interactions".
- numpy, pandas==1.5.2
- scipy, scanpy, umap
- loess
- smurf-imputation
pip install pyStrint
-
Spatial Transcriptomics (ST) Count Data
st_exp
dataframe with spots as rows and genes as columns
-
Spatial coordinates
st_coord
dataframe with spots as rows and the x and y coordinates as columns
-
Cell-type deconvoluted spatial matrix
st_decon
dataframe with spot as rows and cell-type as columns
-
Single-cell RNA-seq Count Data
sc_exp
dataframe with cells as rows and genes as columns
-
Single-cell RNA-seq Metadata
sc_meta
dataframe with cells as rows and cell types as columns
cell_type_key
column name of the cell-type identity in sc_meta
-
Single-cell RNA-seq distribution Data
sc_distribution
dataframe with cells as rows and genes as columns
-
Ligand and Receptor Data (optional)
lr_df
user-provided dataframe with ligand-receptor pairs as rows and the ligand, the receptor, and the pair's weight as columns
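As a rough illustration of the table orientations listed above, the sketch below builds toy versions of the inputs with pandas. The spot, gene, cell, and cell-type names, the random counts, the per-spot normalization of `st_decon`, and the `ligand`/`receptor`/`weight` column names of `lr_df` are all made up for this example; only the row/column layout follows the list above. `sc_distribution` is left out because it comes from a separate imputation step, but it has the same cells-by-genes orientation as `sc_exp`.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Hypothetical labels, chosen only for this example.
spots = [f"spot_{i}" for i in range(4)]
genes = ["GeneA", "GeneB", "GeneC"]
cells = [f"cell_{i}" for i in range(6)]
cell_types = ["T_cell", "B_cell"]

# ST count data: spots as rows, genes as columns.
st_exp = pd.DataFrame(rng.poisson(5, size=(len(spots), len(genes))),
                      index=spots, columns=genes)

# Spatial coordinates: spots as rows, x and y as columns.
st_coord = pd.DataFrame({"x": [0, 1, 0, 1], "y": [0, 0, 1, 1]}, index=spots)

# Deconvolution result: spots as rows, cell types as columns.
# Rows are normalized to sum to 1 here purely for illustration.
raw = rng.random((len(spots), len(cell_types)))
st_decon = pd.DataFrame(raw / raw.sum(axis=1, keepdims=True),
                        index=spots, columns=cell_types)

# scRNA-seq count data: cells as rows, genes as columns.
sc_exp = pd.DataFrame(rng.poisson(2, size=(len(cells), len(genes))),
                      index=cells, columns=genes)

# scRNA-seq metadata: cells as rows, with a cell-type column named by cell_type_key.
cell_type_key = "celltype"
sc_meta = pd.DataFrame({cell_type_key: ["T_cell", "B_cell"] * 3}, index=cells)

# Optional ligand-receptor table (column names are an assumption).
lr_df = pd.DataFrame({"ligand": ["GeneA"], "receptor": ["GeneB"], "weight": [1.0]})

# Basic consistency checks before handing the tables to the pipeline.
assert st_exp.index.equals(st_coord.index)
assert st_exp.index.equals(st_decon.index)
assert sc_exp.index.equals(sc_meta.index)
shared_genes = st_exp.columns.intersection(sc_exp.columns)
print(len(shared_genes), "genes shared between sc_exp and st_exp")
```

Checking that the spot and cell indices line up across tables before conversion tends to catch orientation mistakes (for example, a genes-by-spots matrix) early.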
Convert to adata format
sc_adata, st_adata, sc_distribution, lr_df = pp.prep_adata(sc_exp = sc_exp, st_exp = st_exp, sc_distribution = sc_smurf,
sc_meta = sc_meta, st_coord = st_coord, SP = species)
-
Spatial Transcriptomics (ST) Count Data
st_adata
adata.X with spots as rows and genes as columns
st_adata.obs
dataframe with spots as rows and the x and y spot coordinates as columns
-
Cell-type deconvoluted spatial matrix
st_decon
dataframe with spot as rows and cell-type as columns
-
Single-cell RNA-seq Count Data
sc_adata
adata.X with cells as rows and genes as columns
sc_adata.obs
dataframe with cells as rows and cell types as columns
-
Single-cell RNA-seq distribution Data
sc_distribution
dataframe with cells as rows and genes as columns
obj = spamint.spaMint(save_path = outDir, st_adata = st_adata, weight = st_decon,
sc_distribution = sc_distribution, sc_adata = sc_adata, cell_type_key = 'celltype',
st_tp = st_tp)
obj.prep()
-
save_path
Output directory to save results
-
st_adata
adata.X Spatial Transcriptomics (ST) count data with spots as rows and genes as columns
st_adata.obs
dataframe with spots as rows and the x and y spot coordinates as columns
-
weight
Cell-type deconvoluted spatial dataframe with spots as rows and cell types as columns
-
sc_distribution
Single-cell RNA-seq distribution dataframe with cells as rows and genes as columns
-
sc_adata
adata.X Single-cell RNA-seq count data with cells as rows and genes as columns
sc_adata.obs
dataframe with cells as rows and cell types as columns
-
cell_type_key
cell type column name in sc_adata.obs
-
st_tp
ST sequencing platform; choose from st (ST legacy), visium (10X Visium), or slide-seq (any single-cell resolution data)
sc_agg_meta = obj.select_cells(p = 0.1, mean_num_per_spot = 10, max_rep = 3, repeat_penalty = 10)
p
Percentage of the interface similarity during cell selection.
mean_num_per_spot
Average number of cells per spot.
max_rep
Maximum number of repetitions for cell selection.
repeat_penalty
When a cell has been picked this many times, its probability of being picked again is halved. Recommended to be near (st_exp.shape[0]*mean_num_per_spot/sc_exp.shape[0]) * 10
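The recommended value for repeat_penalty is simple arithmetic, so a small worked example may help. The sketch below computes it for made-up dataset sizes, then illustrates one possible reading of the halving rule, namely a weight of 0.5 ** (times_picked / repeat_penalty). That exponential form and both helper names are assumptions for illustration, not the package's code.

```python
def recommended_repeat_penalty(n_spots, mean_num_per_spot, n_cells):
    """Rule of thumb from the text: (n_spots * mean_num_per_spot / n_cells) * 10."""
    return n_spots * mean_num_per_spot / n_cells * 10


def pick_weight(times_picked, repeat_penalty):
    """Hypothetical down-weighting: the weight halves every `repeat_penalty` picks."""
    return 0.5 ** (times_picked / repeat_penalty)


# Made-up sizes: 1000 spots, 10 cells per spot on average, 5000 single cells.
penalty = recommended_repeat_penalty(n_spots=1000, mean_num_per_spot=10, n_cells=5000)
print(penalty)  # 20.0

# Relative weight of a cell after being picked 0, 20, and 40 times.
print(pick_weight(0, penalty))   # 1.0
print(pick_weight(20, penalty))  # 0.5
print(pick_weight(40, penalty))  # 0.25
```

With these sizes every cell would be reused about twice on average, so a penalty of about 20 only starts to discourage cells that are reused far more often than that.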
refine_sc_exp, sc_agg_meta = obj.gradient_descent(alpha = 1, beta = 0.001, gamma = 0.001,
delta = 0.1, eta = 0.0005,
init_sc_embed = None,
iteration = 20, k = 2, W_HVG = 2,
left_range = 0, right_range = 8, steps = 1, dim = 2)
-
alpha, beta, gamma, delta
Hyperparameters for the loss function.
alpha: the weight of the term that maintains the expression similarity between cells and their respective gamma distribution models, default: 1.
beta: the weight of adjusting cell locations based on cell-cell affinity, default: 0.001.
gamma: the weight of optimizing interface profile similarity between pseudo-spots and their corresponding ST spots, default: 0.001.
delta: the weight of the regularization term, default: 0.1.
-
eta
float, default: 0.0005
Learning rate for gradient descent.
-
init_sc_embed
DataFrame, optional, default: None
Initial embedding for single-cell data.
-
iteration
int, optional, default: 20
The number of iterations for optimization.
-
k
int, optional, default: 2
The number of neighbors in each adjacent spot.
-
W_HVG
int, optional, default: 2
Weight for highly variable genes.
-
left_range
int, optional, default: 0
-
right_range
int, optional, default: 8
The index range for the neighbor number in the embedding process; for index i, the actual neighbor number is (i+1)*10.
-
steps
int, optional, default: 1
The iteration number for each neighbor number.
-
dim
int, optional, default: 2
The embedding dimension of the reconstruction.
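To make the roles of these parameters more concrete, here is a minimal sketch under stated assumptions. It shows how left_range and right_range could map to neighbor numbers via (i+1)*10 (treating the index range as half-open is an assumption), and how alpha, beta, gamma, and delta could weight four loss terms in a plain gradient-descent update with learning rate eta. The toy one-parameter objective and all function names are hypothetical and are not the package's implementation.

```python
def neighbor_schedule(left_range=0, right_range=8):
    """Neighbor numbers (i + 1) * 10 for each index i (half-open range assumed)."""
    return [(i + 1) * 10 for i in range(left_range, right_range)]


def weighted_loss(expression, affinity, interface, regularization,
                  alpha=1, beta=0.001, gamma=0.001, delta=0.1):
    """Weighted sum of four placeholder loss terms."""
    return (alpha * expression + beta * affinity
            + gamma * interface + delta * regularization)


def toy_loss(x):
    """Toy objective: expression term (x - 3)^2, regularization term x^2."""
    return weighted_loss((x - 3.0) ** 2, 0.0, 0.0, x ** 2)


def toy_grad(x, alpha=1, delta=0.1):
    """Derivative of toy_loss with respect to x."""
    return 2 * alpha * (x - 3.0) + 2 * delta * x


eta = 0.0005    # learning rate
iteration = 20  # number of update steps

x = 0.0
loss_before = toy_loss(x)
for _ in range(iteration):
    x = x - eta * toy_grad(x)  # plain gradient-descent update
loss_after = toy_loss(x)

print(neighbor_schedule())  # [10, 20, 30, 40, 50, 60, 70, 80]
print(loss_before, loss_after)
```

With the default eta the toy loss decreases only slightly over 20 iterations, which is the usual trade-off of a small learning rate: slower progress but more stable updates.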
More details are in demo_tutorial.ipynb.
The tutorial file can be downloaded at: https://drive.google.com/drive/folders/1FYa4hzg3vVo6y2BOzlJbXhPTmdEcjD4O?usp=sharing