
SpaOTsc constructs a mapping between the $n$ cells in scRNA-seq data and the $m$ positions in spatial data by solving an optimal transport problem [20] given three dissimilarity/distance matrices: $M \in \mathbb{R}^{n \times m}$ for the gene expression dissimilarity between cells and locations, $D_{sc} \in \mathbb{R}^{n \times n}$ for the gene expression dissimilarity among cells, and $D_{spa} \in \mathbb{R}^{m \times m}$ for the distances among spatial locations. The optimal transport plan $\gamma^*$ is obtained by solving

where $\omega_1, \omega_2$ are weight vectors and $L$ measures the difference between scaled dissimilarities/distances. The first term quantifies the major transport cost, the second (penalty) term promotes weight conservation (unbalanced transport) [21], and the last term preserves the distances within each dataset through the mapping (structured transport) [22]. The spatial cell–cell distance $\hat{D}_{sc}$ is then computed from $\gamma^*$ using the optimal transport distance:
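As a rough illustration of the first two terms described above (transport cost plus a penalty on deviation from the weight vectors), the following sketch solves an unbalanced transport problem with Sinkhorn-like scaling iterations in NumPy. It is not the SpaOTsc implementation: the entropic smoothing weight `eps`, the penalty weight `rho`, the function name `unbalanced_sinkhorn`, and the toy data are all assumptions made here for illustration, and the structured term involving $D_{sc}$ and $D_{spa}$ is omitted entirely.

```python
import numpy as np

def unbalanced_sinkhorn(M, w1, w2, eps=0.5, rho=1.0, n_iter=500):
    """Entropy-regularized unbalanced OT via scaling iterations.

    Approximately minimizes
        <gamma, M> + rho * (KL(gamma 1 | w1) + KL(gamma^T 1 | w2))
    with an additional entropic smoothing term weighted by eps.
    Both eps and rho are assumptions of this sketch.
    """
    M = np.asarray(M, dtype=float)
    w1 = np.asarray(w1, dtype=float)
    w2 = np.asarray(w2, dtype=float)
    K = np.exp(-M / eps)           # Gibbs kernel of the cost matrix
    u = np.ones_like(w1)
    v = np.ones_like(w2)
    power = rho / (rho + eps)      # exponent < 1 relaxes the marginal constraints
    for _ in range(n_iter):
        u = (w1 / (K @ v)) ** power
        v = (w2 / (K.T @ u)) ** power
    # Transport plan: rows index cells, columns index spatial positions
    return u[:, None] * K * v[None, :]

# Toy data (hypothetical): 4 cells, 5 spatial positions, uniform weights
rng = np.random.default_rng(0)
M = rng.random((4, 5))
w1 = np.full(4, 1 / 4)
w2 = np.full(5, 1 / 5)
gamma = unbalanced_sinkhorn(M, w1, w2, eps=0.5, rho=1000.0)
print(gamma.sum(axis=1))  # approaches w1 as rho grows
```

A large `rho` pushes the marginals of the plan toward the weight vectors, while a small `rho` lets mass be created or destroyed where matching is expensive, which is the sense in which the transport is unbalanced.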

One can carry out three major tasks immediately after obtaining $\gamma^*$ and $\hat{D}_{sc}$: (1) prediction of spatial gene expression at the $i$th position by $\sum_j \gamma^*_{j,i} g_j / \sum_j \gamma^*_{j,i}$, where $g \in \mathbb{R}^n$ is the expression vector for a gene in scRNA-seq data; (2) identification of spatially localized cell subclusters by distance-based clustering using $\hat{D}_{sc}$ within each previously identified cluster; and (3) visualization of scRNA-seq data constrained by cell–cell distances using the distance matrix $\hat{D}_{sc}$.
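Task (1) is a weighted average of single-cell expression values, with the column of the transport plan for each position serving as the weights. A minimal sketch follows; the function name `predict_spatial_expression` and the toy numbers are hypothetical, and returning NaN for positions that receive no mass is a choice made here, not something stated in the text.

```python
import numpy as np

def predict_spatial_expression(gamma, g):
    """Predict expression at each spatial position from a transport plan.

    gamma : (n_cells, n_positions) transport plan
    g     : (n_cells,) expression of one gene in the scRNA-seq data
    Returns a (n_positions,) vector with entries
        sum_j gamma[j, i] * g[j] / sum_j gamma[j, i].
    """
    gamma = np.asarray(gamma, dtype=float)
    g = np.asarray(g, dtype=float)
    weighted = gamma.T @ g          # numerator, one value per position
    mass = gamma.sum(axis=0)        # denominator, total mass per position
    # Positions with zero mass have no defined prediction; mark them NaN
    return np.divide(weighted, mass,
                     out=np.full_like(weighted, np.nan), where=mass > 0)

# Toy example (hypothetical): 2 cells, 2 positions
gamma = np.array([[1.0, 0.0],
                  [1.0, 2.0]])
g = np.array([2.0, 4.0])
print(predict_spatial_expression(gamma, g))
```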

The intercellular gene–gene regulatory information flow is inferred by using partial information decomposition [28,29,34]. We estimate how much unique information about a gene (target gene) can be provided by another gene (source gene) in its spatial neighborhood through the calculation of the accumulated unique information:

where $G_{tar}$ is the variable for target gene expression in the cells, $\tilde{G}_{src}$ is the variable for source gene expression in $\eta$-neighborhoods of cells, with observations estimated using $\hat{D}_{sc}$, and $G$ is a collection of genes with high intracellular correlation with the target gene. The unique information $\mathrm{Unq}_X(Z;Y)$ measures how much information $Y$ provides about $Z$ beyond what $X$ already provides.
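The neighborhood variable for the source gene can be illustrated as follows. This sketch only builds per-cell observations of source gene expression within an η-neighborhood; it does not estimate the unique information itself. Aggregating by the mean, excluding the cell itself, and returning 0 for cells without neighbors are all assumptions made here, and the function name `neighborhood_expression` is hypothetical.

```python
import numpy as np

def neighborhood_expression(D_hat, expr, eta):
    """Mean expression of a gene among cells within distance eta of each cell.

    D_hat : (n, n) spatial cell-cell distance matrix
    expr  : (n,) expression of the source gene
    eta   : neighborhood radius, in the units of D_hat
    """
    expr = np.asarray(expr, dtype=float)
    mask = np.asarray(D_hat) <= eta      # True where cell j is near cell i
    np.fill_diagonal(mask, False)        # assumption: a cell is not its own neighbor
    counts = mask.sum(axis=1)            # number of neighbors per cell
    sums = mask @ expr                   # summed neighbor expression per cell
    # Cells with no neighbors get 0 (a choice made for this sketch)
    return np.divide(sums, counts, out=np.zeros(len(expr)), where=counts > 0)

# Toy example (hypothetical): three cells on a line
D_hat = np.array([[0.0, 1.0, 5.0],
                  [1.0, 0.0, 1.0],
                  [5.0, 1.0, 0.0]])
src = np.array([1.0, 2.0, 6.0])
print(neighborhood_expression(D_hat, src, eta=1.0))
```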

For the case of intercellular signaling with known ligands, receptors, and their downstream genes, we use random forest models [58,59] to infer the spatial distance of signaling. The ligand expression of cells within a neighborhood of radius $\eta$, denoted $\tilde{L}_\eta$, together with other genes highly correlated with a downstream target gene of the ligand–receptor interaction, is used as the feature set to fit a random forest model that outputs the target gene. The receptor expression is used as the sample weights. The $\eta$ under which $\tilde{L}_\eta$ has the highest feature importance is taken to be the spatial distance of this signaling.
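The scan over η can be sketched with scikit-learn's `RandomForestRegressor`, which accepts `sample_weight` in `fit` and exposes `feature_importances_`. Everything else here is an assumption for illustration: the helper names, the use of a neighborhood mean for the ligand feature, the number of trees, and the synthetic data.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def neighborhood_mean(D_hat, expr, eta):
    """Mean expression among other cells within distance eta (0 if none)."""
    mask = np.asarray(D_hat) <= eta
    np.fill_diagonal(mask, False)
    counts = mask.sum(axis=1)
    return np.divide(mask @ expr, counts,
                     out=np.zeros(len(expr)), where=counts > 0)

def signaling_distance(D_hat, ligand, receptor, target, other_genes, etas, seed=0):
    """Return the eta whose neighborhood ligand feature is most important."""
    importances = {}
    for eta in etas:
        L_eta = neighborhood_mean(D_hat, ligand, eta)
        X = np.column_stack([L_eta, other_genes])   # column 0 is the ligand feature
        rf = RandomForestRegressor(n_estimators=100, random_state=seed)
        rf.fit(X, target, sample_weight=receptor)   # receptor expression as weights
        importances[eta] = float(rf.feature_importances_[0])
    best = max(importances, key=importances.get)
    return best, importances

# Synthetic example (hypothetical data): cells on a line
rng = np.random.default_rng(0)
pos = rng.random(40)
D_hat = np.abs(pos[:, None] - pos[None, :])
ligand = rng.random(40)
receptor = rng.uniform(0.1, 1.0, 40)
other_genes = rng.random((40, 2))
target = neighborhood_mean(D_hat, ligand, 0.2) + 0.05 * rng.standard_normal(40)
best_eta, imps = signaling_distance(D_hat, ligand, receptor, target,
                                    other_genes, etas=[0.1, 0.2, 0.4])
print(best_eta, imps)
```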

Knowing the ligands, receptors, and downstream genes involved in intercellular signaling, together with $\hat{D}_{sc}$, we then infer cell–cell communication by solving another optimal transport problem. First, the source distribution over the cells, $\omega_L$, is constructed to be proportional to the expression of the ligand gene. Next, a destination distribution $\omega_D$ is constructed from the expression of receptors and downstream genes to represent the probability that a cell receives the signal. A cell that highly expresses receptors and whose downstream genes are consistent with the up-/down-regulation relationships (low expression of down-regulated genes and high expression of up-regulated genes) is assigned a high probability. With this information we solve the following optimal transport problem
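The construction of the two distributions described above can be sketched as follows. The source distribution follows the text directly (proportional to ligand expression). The scoring rule for the destination distribution is a hypothetical stand-in chosen only to satisfy the stated qualitative behavior (high receptor, high up-regulated genes, low down-regulated genes give high probability); the exact weighting used by SpaOTsc is not given in this section, and all function names are made up for this example.

```python
import numpy as np

def minmax(x):
    """Scale each column of x to [0, 1]; constant columns map to 0."""
    x = np.asarray(x, dtype=float)
    span = x.max(axis=0) - x.min(axis=0)
    return np.divide(x - x.min(axis=0), span,
                     out=np.zeros_like(x), where=span > 0)

def source_distribution(ligand):
    """omega_L: proportional to ligand expression."""
    ligand = np.asarray(ligand, dtype=float)
    return ligand / ligand.sum()

def destination_distribution(receptor, up_genes=None, down_genes=None):
    """omega_D under a hypothetical scoring rule (an assumption of this sketch).

    receptor   : (n,) receptor expression
    up_genes   : (n, k_up) expression of up-regulated downstream genes
    down_genes : (n, k_down) expression of down-regulated downstream genes
    """
    score = np.asarray(receptor, dtype=float).copy()
    if up_genes is not None:
        score *= minmax(up_genes).mean(axis=1)          # reward high up-regulated genes
    if down_genes is not None:
        score *= 1.0 - minmax(down_genes).mean(axis=1)  # reward low down-regulated genes
    return score / score.sum()

# Toy example (hypothetical): three cells with equal receptor expression
omega_L = source_distribution([1.0, 3.0, 0.0])
omega_D = destination_distribution(
    receptor=[1.0, 1.0, 1.0],
    up_genes=np.array([[0.0], [1.0], [2.0]]),
    down_genes=np.array([[2.0], [1.0], [0.0]]),
)
print(omega_L, omega_D)
```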

The optimal transport plan $\gamma_S^*$ is interpreted as the likelihood of cell–cell communication, e.g., its $(i,j)$th element describes how likely cell $j$ is to receive a signal from cell $i$. When spatial distances for signaling are available, we can adjust the cost matrix $\hat{D}_{sc}$ by setting entries greater than this distance to a large number, which enforces a spatial constraint when identifying communications. When a spatial constraint is applied, long-distance connections are eliminated and new short connections may emerge (Supplementary Figs. 21, 22).
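The cost-matrix adjustment amounts to a thresholding step, sketched below. The function name `constrain_cost` and the default choice of the "large number" (1000 times the largest finite entry) are assumptions made here; the text does not specify a value.

```python
import numpy as np

def constrain_cost(D_hat, max_dist, big=None):
    """Return a copy of D_hat with entries above max_dist set to a large cost.

    D_hat    : (n, n) spatial cell-cell distance matrix used as transport cost
    max_dist : inferred spatial distance of the signaling
    big      : replacement cost; defaults to 1000 * D_hat.max() (an assumption)
    """
    C = np.array(D_hat, dtype=float, copy=True)   # leave the input untouched
    if big is None:
        big = 1e3 * C.max()
    C[C > max_dist] = big                         # penalize long-range transport
    return C

# Toy example (hypothetical distances)
D_hat = np.array([[0.0, 1.0, 3.0],
                  [1.0, 0.0, 2.0],
                  [3.0, 2.0, 0.0]])
print(constrain_cost(D_hat, max_dist=1.5, big=100.0))
```

With a cost this large, transporting mass between distant cells becomes prohibitively expensive, so the resulting plan concentrates on pairs within the allowed distance.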
