To analyze all samples together, the sample files were pooled using the 10x Genomics Cell Ranger aggr pipeline and analyzed in Loupe Browser 5.0. The Cell Ranger aggr pipeline automatically equalizes the average read depth per cell between groups before merging, which avoids artifacts introduced by differences in sequencing depth. Gene expression data were filtered, normalized, and clustered using Loupe Browser. Briefly, Cell Ranger reduces the dimensionality of the gene expression matrix by Principal Component Analysis (PCA), with the number of components set by the num_principal_comps parameter, using a Python implementation of the IRLBA algorithm [51]; the principal components were then visualized using t-distributed Stochastic Neighbor Embedding (t-SNE). To cluster the cells robustly and confidently, we used specific, verified gene markers for each cell type (Section Gene-based identification of specific cell clusters).
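The depth equalization described above can be illustrated with a minimal sketch. It assumes that equalization amounts to downsampling every sample to the depth of the shallowest one. The sample names, read depths, and the function `downsampling_fractions` are hypothetical and are not part of Cell Ranger.

```python
# Hypothetical mean reads per cell for three samples (illustrative values only).
mean_reads_per_cell = {"sample_A": 52000, "sample_B": 40000, "sample_C": 65000}


def downsampling_fractions(depths):
    """Return, per sample, the fraction of reads to keep so that every
    sample ends up at the depth of the shallowest sample."""
    target = min(depths.values())
    return {name: target / depth for name, depth in depths.items()}


fractions = downsampling_fractions(mean_reads_per_cell)

# Depth per cell after downsampling: equal (up to rounding) across samples.
equalized = {
    name: mean_reads_per_cell[name] * fraction
    for name, fraction in fractions.items()
}

for name in mean_reads_per_cell:
    print(f"{name}: keep {fractions[name]:.2%} of reads")
```

In this sketch the shallowest sample keeps all of its reads, and deeper samples are reduced proportionally, so no sample is favored in downstream comparisons simply because it was sequenced more deeply.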
Once the clusters were determined, Cell Ranger uses the sSeq method to find differentially expressed genes between clusters [52]. When the counts become large, Cell Ranger switches to the asymptotic beta test used in edgeR [53]. For each cluster, the algorithm is run on that cluster versus all other cells, yielding a list of genes that are differentially expressed in that cluster compared to the rest of the sample. Unlike the original sSeq implementation, Cell Ranger computes relative library size as the total UMI count for each cell divided by the median UMI count per cell. As in sSeq, normalization is performed through a per-cell library-size parameter that is incorporated as a factor in the exact-test probability calculations. To filter out multiplets, low-quality cells, and empty droplets, the following thresholds were applied: UMI counts per barcode between approximately 200 and 40,000, features per barcode in the range of 25–6000, and a cut-off of 50% for the percentage of UMIs per barcode associated with mitochondrial genes. The mitochondrial cut-off was set higher than standard because of the higher mitochondrial gene content of hepatocytes [14]. Ten principal components were applied (default). This removed 887 barcodes (3% of all barcodes; 0.1–4.1% of each cell type).
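As a rough illustration of the filtering thresholds and the relative library size defined above, the following sketch applies both to a handful of made-up barcodes. It is not the Loupe Browser or Cell Ranger code. The barcode records and field names are hypothetical, the 50% mitochondrial cut-off is assumed to be an inclusive upper limit, and computing library sizes on the retained barcodes only is likewise an assumption.

```python
from statistics import median

# Thresholds taken from the text; constant and field names are hypothetical.
MIN_UMIS, MAX_UMIS = 200, 40_000
MIN_FEATURES, MAX_FEATURES = 25, 6_000
MAX_MITO_FRACTION = 0.50  # assumed to be an inclusive upper limit


def passes_qc(barcode):
    """Return True if a barcode lies within all three filtering ranges."""
    return (
        MIN_UMIS <= barcode["umis"] <= MAX_UMIS
        and MIN_FEATURES <= barcode["features"] <= MAX_FEATURES
        and barcode["mito_umis"] / barcode["umis"] <= MAX_MITO_FRACTION
    )


# Made-up barcodes for illustration only.
barcodes = [
    {"id": "bc1", "umis": 5000, "features": 1800, "mito_umis": 900},
    {"id": "bc2", "umis": 120, "features": 60, "mito_umis": 10},       # too few UMIs
    {"id": "bc3", "umis": 52000, "features": 7100, "mito_umis": 2000}, # too many UMIs and features
    {"id": "bc4", "umis": 3000, "features": 900, "mito_umis": 2100},   # 70% mitochondrial
    {"id": "bc5", "umis": 10000, "features": 2500, "mito_umis": 4000},
    {"id": "bc6", "umis": 2500, "features": 1100, "mito_umis": 300},
]

kept = [b for b in barcodes if passes_qc(b)]

# Relative library size: total UMI count of a cell divided by the
# median UMI count per cell (here computed over the retained barcodes).
median_umis = median(b["umis"] for b in kept)
relative_library_size = {b["id"]: b["umis"] / median_umis for b in kept}

print([b["id"] for b in kept])
print(relative_library_size)
```

A cell with a relative library size of 2.0 carries twice the median UMI count, and it is this per-cell factor that enters the exact-test probability calculations rather than a rescaled count matrix.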