Dimensionality reduction and visualization

Dongyuan Song
Kexin Li
Xinzhou Ge
Jingyi Jessica Li

To visualize the high-dimensional single-cell data, we first applied the PF-logPF transformation to a cell-by-gene count matrix [36]. We then used the R package irlba (version 2.3.5.1) to calculate the top 50 principal components (PCs) of the transformed matrix. Next, we used the R package umap (version 0.2.10.0) to project the cells from the 50-dimensional PC space to the 2-dimensional UMAP space.
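
The transformation step can be illustrated with a minimal sketch. It assumes that PF-logPF means proportional fitting (scaling every cell to the mean sequencing depth), then log(1 + x), then proportional fitting again; the exact definition is given in [36], so treat this reading as an assumption. The sketch uses plain Python for illustration rather than the R packages named above, the function names and toy matrix are hypothetical, and the PCA and UMAP steps are not reproduced here.

```python
import math

def proportional_fitting(matrix):
    """Scale each cell (row) so its total equals the mean total across cells."""
    totals = [sum(row) for row in matrix]
    target = sum(totals) / len(totals)
    # Cells with a zero total are returned unchanged to avoid division by zero.
    return [
        [value * target / total for value in row] if total > 0 else list(row)
        for row, total in zip(matrix, totals)
    ]

def pf_log_pf(counts):
    """PF, then log(1 + x), then PF again (assumed reading of PF-logPF)."""
    fitted = proportional_fitting(counts)
    logged = [[math.log1p(value) for value in row] for row in fitted]
    return proportional_fitting(logged)

# Toy cell-by-gene count matrix: 3 cells (rows) by 4 genes (columns).
counts = [
    [10, 0, 5, 5],
    [1, 1, 0, 2],
    [40, 20, 10, 10],
]
transformed = pf_log_pf(counts)
row_totals = [sum(row) for row in transformed]
```

After the second fitting step every cell has the same total, and zero counts stay zero, so sparsity is preserved.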

When comparing the target data and the synthetic null data, we concatenated the two datasets and calculated the PCs and UMAP embeddings jointly, so that the target cells and the synthetic null cells were projected to the same 2-dimensional UMAP space.
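
The bookkeeping behind a joint embedding can be sketched as follows. This is an illustration only: `embed_jointly` and `center_first_two` are hypothetical names, and the placeholder embedding (joint centering, keeping two columns) merely stands in for the PCA and UMAP steps, which are not implemented here. The point is that a single fit on the stacked matrix gives both datasets the same coordinate system.

```python
def embed_jointly(target_cells, null_cells, embed):
    """Embed both datasets with one call so they share a coordinate system."""
    combined = target_cells + null_cells  # stack the rows of both datasets
    labels = ["target"] * len(target_cells) + ["null"] * len(null_cells)
    coords = embed(combined)  # one joint fit instead of two separate fits
    return list(zip(labels, coords))

def center_first_two(rows):
    """Placeholder embedding: center columns jointly, keep the first two."""
    n = len(rows)
    means = [sum(column) / n for column in zip(*rows)]
    return [(row[0] - means[0], row[1] - means[1]) for row in rows]

# Toy transformed matrices: 2 target cells and 2 synthetic null cells.
target = [[1.0, 2.0, 0.0], [3.0, 4.0, 1.0]]
null = [[5.0, 6.0, 2.0], [7.0, 8.0, 3.0]]
joint = embed_jointly(target, null, center_first_two)
```

Each returned pair carries a dataset label, so the two groups can be plotted in separate panels or colors while sharing the same axes.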

We used the R package ggplot2 (version 3.4.2) to make all plots.

For the UMAP visualizations in Fig. 2f, we truncated each gene’s normalized expression levels at the 99th percentile, so that a few extreme values would not dominate the color scale and obscure the gene expression pattern.
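
The truncation can be sketched as below, assuming that the percentile is computed separately for each gene and that quantiles are linearly interpolated between order statistics; the source states neither detail, and the function names are hypothetical.

```python
import math

def percentile(values, q):
    """Quantile q (0 <= q <= 1), linearly interpolated between order statistics."""
    ordered = sorted(values)
    position = (len(ordered) - 1) * q
    lower = math.floor(position)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])

def truncate_at_percentile(expression, q=0.99):
    """Cap one gene's expression values at that gene's q-th quantile."""
    cap = percentile(expression, q)
    return [min(value, cap) for value in expression]

# Toy gene: 100 ordinary cells plus one extreme outlier.
gene = [float(i) for i in range(100)] + [1000.0]
capped = truncate_at_percentile(gene)
```

Only values above the cap change; here the outlier is pulled down to the cap while the remaining cells keep their original values.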
