Quality control and normalization

Sajedeh Nasr Esfahani
Yi Zheng
Auriana Arabpour
Agnes M. Resto Irizarry
Norio Kobayashi
Xufeng Xue
Yue Shao
Cheng Zhao
Nicole L. Agranonik
Megan Sparrow
Timothy J. Hunt
Jared Faith
Mary Jasmine Lara
Qiu Ya Wu
Sherman Silber
Sophie Petropoulos
Ran Yang
Kenneth R. Chien
Amander T. Clark
Jianping Fu

Cut-offs based on the number of expressed genes (nGene) and the proportion of counts from mitochondrial genes (percent.mito) were used to filter out low-quality cells. In the Smart-Seq2 datasets, cells with fewer than 2000 expressed genes (nGene < 2000) or with percent.mito above 0.125 were removed. scRNA-seq data were filtered using the same cut-offs as reported previously (approximately 3200 < nGene < 6400, percent.mito < 0.06) [12]. Cells belonging to hemogenic endothelial progenitors and erythroblasts from Carnegie Stage 7 were excluded from the subsequent analysis. After quality control and removal of mitochondrial genes, we retained genes with one or more counts in at least five cells (assessed for each dataset separately). Log-normalized counts were then calculated using the deconvolution strategy implemented in the computeSumFactors function of the R package scran (v.1.14.6) [34], followed by rescaling with the multiBatchNorm function of the R package batchelor (v.1.2.4) [35], so that the size factors were comparable across batches. The rescaled log-normalized expression values were used for integration and marker-gene detection.
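The cell and gene filters described above can be illustrated with a minimal sketch. It is written in Python with NumPy for illustration only; the original analysis was carried out in R, and this is not the authors' code. The function names (`qc_metrics`, `keep_cells`, `keep_genes`), the genes-by-cells matrix layout, and the toy data are all assumptions. The scran and batchelor normalization steps are not reproduced here.

```python
import numpy as np

def qc_metrics(counts, is_mito):
    """Per-cell QC metrics from a genes x cells raw count matrix (assumed layout)."""
    counts = np.asarray(counts)
    is_mito = np.asarray(is_mito, dtype=bool)
    n_gene = (counts > 0).sum(axis=0)        # genes detected in each cell
    total = counts.sum(axis=0)               # total counts in each cell
    mito = counts[is_mito].sum(axis=0)       # mitochondrial counts in each cell
    # Fraction of counts from mitochondrial genes; 0 for cells with no counts.
    percent_mito = np.divide(mito, total,
                             out=np.zeros(total.shape, dtype=float),
                             where=total > 0)
    return n_gene, percent_mito

def keep_cells(n_gene, percent_mito, min_genes=2000, max_mito=0.125):
    """Smart-Seq2 style rule: drop cells with nGene < min_genes or percent.mito > max_mito."""
    return (n_gene >= min_genes) & (percent_mito <= max_mito)

def keep_genes(counts, is_mito, min_cells=5):
    """Drop mitochondrial genes; keep genes with >= 1 count in at least min_cells cells."""
    counts = np.asarray(counts)
    is_mito = np.asarray(is_mito, dtype=bool)
    cells_expressing = (counts >= 1).sum(axis=1)
    return (~is_mito) & (cells_expressing >= min_cells)

# Toy data (hypothetical): 4 genes x 6 cells, last gene is mitochondrial.
counts = np.array([
    [5, 0, 3, 1, 2, 4],
    [1, 0, 0, 2, 1, 1],
    [0, 0, 1, 0, 0, 0],
    [2, 9, 1, 1, 1, 1],
])
is_mito = np.array([False, False, False, True])

n_gene, percent_mito = qc_metrics(counts, is_mito)
# Thresholds are lowered here only so the toy matrix has something to keep.
cell_mask = keep_cells(n_gene, percent_mito, min_genes=2, max_mito=0.5)
filtered = counts[:, cell_mask]
gene_mask = keep_genes(filtered, is_mito, min_cells=5)
print(cell_mask, gene_mask)
```

With the default arguments, `keep_cells` applies the Smart-Seq2 thresholds stated in the text. The second cut-off set (3200 < nGene < 6400, percent.mito < 0.06) uses strict inequalities on both sides and would need its own mask, for example `(n_gene > 3200) & (n_gene < 6400) & (percent_mito < 0.06)`. The gene filter is applied after the cell filter and, as in the text, would be run on each dataset separately.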
