Cut-offs based on the number of expressed genes (nGene) and the percentage of mitochondrial genes (percent.mito) were used to filter out low-quality cells. Cells from the Smart-Seq2 datasets were removed if they had fewer than 2000 expressed genes (nGene) or a percent.mito higher than 0.125. The scRNA-seq data were filtered using the same cut-offs as reported previously12 (approximately 3200 < nGene < 6400, percent.mito < 0.06). Cells belonging to hemogenic endothelial progenitors and erythroblasts from Carnegie Stage 7 were excluded from the subsequent analysis. After quality control and removal of mitochondrial genes, we retained genes with one or more counts in at least five cells (assessed for each dataset separately). We then calculated log-normalized counts using the deconvolution strategy implemented in the computeSumFactors function of the R scran package (v.1.14.6)34, followed by rescaling normalization with the multiBatchNorm function of the R batchelor package (v.1.2.4)35, so that the size factors were comparable across batches. The log-normalized expression values after rescaling were used for integration and marker-gene detection.
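The filtering rules above can be sketched in a few lines of Python. This is only an illustration of the stated thresholds, not the authors' code (the analysis itself was run in R). The names `CellQC`, `passes_smartseq2`, `passes_reported_cutoff` and `keep_gene` are hypothetical, percent.mito is assumed to be a fraction between 0 and 1, the approximate bounds 3200 and 6400 are treated as exact, and mitochondrial genes are assumed to be recognisable by an `MT-` symbol prefix.

```python
from dataclasses import dataclass


@dataclass
class CellQC:
    n_gene: int          # number of expressed genes (nGene)
    percent_mito: float  # mitochondrial fraction (percent.mito), assumed 0-1


def passes_smartseq2(cell: CellQC) -> bool:
    # Smart-Seq2: drop cells with nGene < 2000 or percent.mito > 0.125.
    return cell.n_gene >= 2000 and cell.percent_mito <= 0.125


def passes_reported_cutoff(cell: CellQC) -> bool:
    # Previously reported cut-off: 3200 < nGene < 6400 and percent.mito < 0.06.
    # The text gives these bounds as approximate; they are treated as exact here.
    return 3200 < cell.n_gene < 6400 and cell.percent_mito < 0.06


def keep_gene(name: str, counts: list, min_cells: int = 5,
              mito_prefix: str = "MT-") -> bool:
    # Exclude mitochondrial genes (the prefix is an assumption), then require
    # one or more counts in at least `min_cells` cells of a single dataset.
    if name.startswith(mito_prefix):
        return False
    return sum(1 for c in counts if c >= 1) >= min_cells


if __name__ == "__main__":
    cells = [CellQC(2500, 0.05), CellQC(1500, 0.05), CellQC(2500, 0.20)]
    print([passes_smartseq2(c) for c in cells])
    print(keep_gene("GATA1", [1, 2, 0, 3, 1, 1]))
```

Because the gene filter is assessed for each dataset separately, `keep_gene` would be applied to each dataset's count matrix on its own before normalization.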