The neural network in GASTON learns an isodepth that varies smoothly across a tissue slice; however, the scaling of the learned isodepth is arbitrary. To improve the interpretability of the learned isodepth, we scale the isodepth in each spatial domain to reflect approximate physical distances inside the domain. Briefly, we derive an estimate of the “average width” of each spatial domain in μm, and we linearly transform the isodepth within each spatial domain such that the range of isodepth values in the domain equals its average width.
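We scale the isodepth in each spatial domain as follows. Given the isodepth $d$, spatial domains $R_1, \dots, R_P$, and breakpoints $b_1, \dots, b_{P-1}$ estimated from (10) and (11), we assume without loss of generality that the isodepth is linearly transformed such that $\min d = 0$ and $\max d = 1$, i.e. the breakpoints satisfy $0 = b_0 < b_1 < \dots < b_{P-1} < b_P = 1$, where we set $b_0 = 0$ and $b_P = 1$ for convenience. For each spatial domain $R_p$, let $w_p$ be the average width of the domain, whose computation we describe below. We compute the “scaled” isodepth $\tilde{d}$ as
$$\tilde{d}(x, y) = \frac{w_p}{b_p - b_{p-1}} \, d(x, y) + c_p \quad \text{if } (x, y) \in R_p,$$
where the constants $c_1, \dots, c_P$ are chosen such that $\tilde{d}$ is continuous, and $\tilde{d}(x, y) = 0$ if $d(x, y) = 0$. With this choice of $c_p$, the range of scaled isodepth values in a spatial domain $R_p$ is given by
$$\max_{(x, y) \in R_p} \tilde{d}(x, y) - \min_{(x, y) \in R_p} \tilde{d}(x, y) = \frac{w_p}{b_p - b_{p-1}} \, (b_p - b_{p-1}) = w_p.$$
That is, the range of scaled isodepth values in each spatial domain $R_p$ is the average width $w_p$ of the domain.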
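The piecewise-linear rescaling described above can be sketched in a few lines of Python. This is a minimal illustration, not the GASTON implementation: the function name `scale_isodepth` and its arguments are hypothetical, the isodepth is assumed to be already normalized to [0, 1], and each value is assumed to belong to the domain whose breakpoint interval contains it. Writing the scaled value as a cumulative offset plus a within-domain term is equivalent to choosing per-domain constants that make the scaled isodepth continuous and equal to 0 at isodepth 0.

```python
from bisect import bisect_left


def scale_isodepth(d, breakpoints, widths):
    """Rescale normalized isodepth values so each domain spans its average width.

    d           : iterable of isodepth values, assumed normalized to [0, 1]
    breakpoints : interior breakpoints, strictly increasing, inside (0, 1)
    widths      : one average width per domain (len(breakpoints) + 1 values)
    """
    b = [0.0] + list(breakpoints) + [1.0]
    if len(widths) != len(b) - 1:
        raise ValueError("need exactly one width per domain")

    # Scaled isodepth at the lower edge of each domain: the cumulative sum of
    # the widths of all preceding domains. This enforces continuity.
    starts = [0.0]
    for w in widths[:-1]:
        starts.append(starts[-1] + w)

    scaled = []
    for v in d:
        # Index p of the domain whose interval [b[p], b[p + 1]] contains v.
        p = min(max(bisect_left(b, v) - 1, 0), len(widths) - 1)
        slope = widths[p] / (b[p + 1] - b[p])
        scaled.append(starts[p] + slope * (v - b[p]))
    return scaled
```

For example, with one breakpoint at 0.25 and widths of 200 and 300, `scale_isodepth([0.0, 0.25, 1.0], [0.25], [200.0, 300.0])` maps the domain edges to 0, 200, and 500, so the two domains span 200 and 300 units respectively.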
We estimate the average width of each spatial domain by computing the median physical distance between the two boundaries of the domain. Specifically, for each spatial domain we define two sets of spatial locations, one on the lower boundary curve and one on the upper boundary curve of the domain. We set the average width to be the median distance between each spot in one boundary set and the closest spot in the other boundary set. We choose the two boundary sets such that they visually correspond to the spatial domain boundaries.
For 10x Genomics Visium data, we multiply each average width by 100, since the physical distance between the centers of adjacent spots in the 10x Visium slide is 100 μm. For Slide-seqV2 data, we multiply each average width by 64/100, since two beads that are 100 pixels apart in the Slide-seqV2 microscopy image have a physical distance of roughly 64 μm [116].
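A minimal sketch of the average-width estimate and the unit conversion follows. The function `average_width` and the constants `VISIUM_SCALE` and `SLIDESEQV2_SCALE` are hypothetical names introduced for illustration. The sketch assumes that the two boundary sets have already been selected, that distances are measured from each lower-boundary spot to its nearest upper-boundary spot, and that coordinates are expressed in the units the conversion factors expect (adjacent-spot spacing for Visium, image pixels for Slide-seqV2). The brute-force nearest-neighbor search is chosen for clarity; a spatial index would be preferable for large boundary sets.

```python
import math
from statistics import median

# Conversion factors to micrometers, taken from the text above.
VISIUM_SCALE = 100.0              # adjacent spot centers are 100 um apart
SLIDESEQV2_SCALE = 64.0 / 100.0   # 100 pixels correspond to roughly 64 um


def average_width(lower, upper, unit_scale=1.0):
    """Median distance from each lower-boundary spot to its nearest upper-boundary spot.

    lower, upper : non-empty lists of (x, y) coordinates on the two boundaries
    unit_scale   : factor converting coordinate units to micrometers
    """
    if not lower or not upper:
        raise ValueError("both boundary sets must be non-empty")
    # Brute-force nearest neighbor: O(len(lower) * len(upper)).
    nearest = [min(math.dist(p, q) for q in upper) for p in lower]
    return unit_scale * median(nearest)
```

For instance, `average_width(lower, upper, unit_scale=VISIUM_SCALE)` would return a width in μm when `lower` and `upper` hold Visium coordinates in which adjacent spot centers are one unit apart.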
To simplify the visualization of the 1-D expression functions $h_g$, we aggregate the counts $a_{i,g}$ for spots $i$ with approximately equal isodepth values $d_i$, as in [83]. Specifically, we partition the range of isodepth values into a union of intervals $I_1, \dots, I_J$, and we compute the total expression value $\tilde{a}_{g,j} = \sum_{i : d_i \in I_j} a_{i,g}$ for gene $g$ in each interval $I_j$. We call $\tilde{a}_{g,j}$ the pooled expression value of gene $g$ at pooled spot $j$. Pooling does not affect inference of the 1-D expression function $h_g$ in the STP, as the function obtained by maximizing the log-likelihood (9) with pooled data is equal to the function obtained by maximizing (9) with the original data, as shown in [83].
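We plot expression as log pooled counts per million (CPM) $\log\left(10^6 \cdot \tilde{a}_{g,j} / \tilde{U}_j\right)$, where $\tilde{a}_{g,j}$ is the pooled expression value of gene $g$ at the $j$th pooled spot and $\tilde{U}_j$ is the sum of the total UMI counts across all spots in the $j$th pooled spot. The log pooled CPM has approximately the same scale as the expression function for each gene $g$.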
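The pooling and plotting transform can be illustrated with the following sketch for a single gene. The function names `pool_by_isodepth` and `log_pooled_cpm` are hypothetical. Several choices here are assumptions rather than details stated in the text: the intervals are taken to be equal-width bins over the observed isodepth range, the logarithm is the natural logarithm, and pooled spots with zero counts are reported as `None` instead of being given a pseudocount.

```python
import math


def pool_by_isodepth(isodepth, counts, umis, n_bins):
    """Sum one gene's counts and the total UMIs within equal-width isodepth bins."""
    lo, hi = min(isodepth), max(isodepth)
    width = (hi - lo) / n_bins
    if width == 0:
        width = 1.0  # all spots share one isodepth value; everything goes to bin 0
    pooled_counts = [0] * n_bins
    pooled_umis = [0] * n_bins
    for d, a, u in zip(isodepth, counts, umis):
        # The maximum isodepth falls on the right edge; clamp it into the last bin.
        j = min(int((d - lo) / width), n_bins - 1)
        pooled_counts[j] += a
        pooled_umis[j] += u
    return pooled_counts, pooled_umis


def log_pooled_cpm(pooled_counts, pooled_umis):
    """Log counts per million for each pooled spot; None where undefined."""
    values = []
    for a, u in zip(pooled_counts, pooled_umis):
        if a > 0 and u > 0:
            values.append(math.log(a / u * 1e6))
        else:
            values.append(None)
    return values
```

Calling `pool_by_isodepth` once per gene and passing its outputs to `log_pooled_cpm` yields one value per pooled spot, which can then be plotted against the bin centers alongside the fitted expression function.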