Data visualization, clustering and diffusion maps

Ane Iturbide; Mayra L. Ruiz Tejeda Segura; Camille Noll; Kenji Schorpp; Ina Rothenaigner; Elias R. Ruiz-Morales; Gabriele Lubatti; Ahmed Agami; Kamyar Hadian; Antonio Scialdone; Maria-Elena Torres-Padilla

This method is extracted from research article: Nat Struct Mol Biol, May 2021

Retinoic acid signaling is critical during the totipotency window in early mammalian development

DOI: 10.1038/s41594-021-00590-w

We used UMAP⁶⁰ for data visualization (‘tl.umap’ function in scanpy, with options n_components=2 and min_dist=1). Leiden clustering (k = 15, resolution = 0.4) was performed on the top 3,000 highly variable genes (HVGs) calculated across the whole dataset, using a correlation distance in the ‘pp.neighbors’ function from scanpy. To identify marker genes for a given cluster, we first found genes differentially expressed between that cluster and any other cluster (Wilcoxon’s rank sum test, false discovery rate (FDR) < 0.1, log₂ fold change (log₂FC) > 1), and then ranked these genes by their mean FDR across all pairwise comparisons. To validate the differentiation state of the clusters suggested by the markers, we plotted the expression of previously known relevant genes (Rex1, Sox2, Nanog, Tcstv1, Zscan4a, Zscan4c, Zscan4d, Zscan4e, Gata6, Meis1, Sox17 and Sox7) on the UMAP. Cells were aligned along a pseudotime trajectory using a diffusion map⁶¹, computed with the ‘tl.diffmap’ function from scanpy on the first 20 principal components. All differential gene expression analyses used Wilcoxon’s rank sum test with an FDR threshold of 0.1 and a log₂FC threshold of 1.
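The marker-ranking step described above can be illustrated with a minimal sketch. It is not the authors' code: the function name `rank_markers`, the input layout and the toy values are hypothetical, and it assumes that the per-comparison FDR and log₂FC values have already been computed (for example, by a Wilcoxon rank sum test in each pairwise comparison). It also assumes one reading of the text, namely that a gene qualifies if it passes both thresholds in at least one pairwise comparison, and that qualifying genes are then ordered by their mean FDR over all comparisons.

```python
from statistics import mean


def rank_markers(comparisons, fdr_max=0.1, log2fc_min=1.0):
    """Rank candidate marker genes for one cluster of interest.

    `comparisons` maps each other cluster to {gene: (fdr, log2fc)}, i.e. the
    result of one pairwise test of the cluster of interest against it.
    Assumption: a gene is a candidate if it passes both thresholds in at
    least one pairwise comparison.
    """
    fdrs = {}           # gene -> FDRs collected from every pairwise comparison
    candidates = set()  # genes passing both thresholds at least once
    for stats in comparisons.values():
        for gene, (fdr, log2fc) in stats.items():
            fdrs.setdefault(gene, []).append(fdr)
            if fdr < fdr_max and log2fc > log2fc_min:
                candidates.add(gene)
    # A smaller mean FDR ranks higher; the gene name breaks ties.
    scored = [(gene, mean(fdrs[gene])) for gene in candidates]
    return sorted(scored, key=lambda item: (item[1], item[0]))


# Toy values for a hypothetical cluster "A" compared with clusters "B" and "C".
example = {
    "B": {"g1": (0.001, 2.0), "g2": (0.05, 1.2), "g3": (0.2, 3.0), "g4": (0.001, 0.5)},
    "C": {"g1": (0.01, 1.5), "g2": (0.5, 0.3), "g3": (0.3, 2.0), "g4": (0.002, 0.8)},
}
ranked = rank_markers(example)
for gene, mean_fdr in ranked:
    print(gene, mean_fdr)
```

In this toy input, g3 never passes the FDR threshold and g4 never passes the log₂FC threshold, so neither is retained as a candidate. A stricter reading, in which a gene must pass in every pairwise comparison, would only require changing how `candidates` is built.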

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
