Step 1: normalize scRNA-seq data
- size factor normalization to 10,000 reads per cell
- log(x+1) transformation
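A minimal sketch of Step 1 in base R. It assumes a genes-by-cells count matrix and the natural logarithm; the matrix `counts` and its values are made up for illustration and are not part of the protocol.

```r
# Toy genes-by-cells count matrix (hypothetical values, for illustration only)
counts <- matrix(c(10, 0, 90,
                   5, 5, 40),
                 nrow = 3,
                 dimnames = list(c("geneA", "geneB", "geneC"),
                                 c("cell1", "cell2")))

# Size factor normalization: rescale each cell (column) to a total of 10,000
scaled <- sweep(counts, 2, colSums(counts), "/") * 1e4

# log(x + 1) transformation (natural log assumed)
norm_expr <- log1p(scaled)
```

After this step every cell has the same total on the untransformed scale, and zero counts stay at zero.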
Step 2: differential gene expression (DGE) analysis using MAST (https://genomebiology.biomedcentral.com/articles/10.1186/s13059-015-0844-5)
- DGE analysis was performed separately for each tissue-cell type, treating each cell in that tissue-cell type as a sample.
- Regression: normalized expression of a given gene ∼ age + sex
- The P-value is taken from the "H" (hurdle) component; the coefficient and z-score are taken from the "logFC" component.
R code calling MAST:
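library(MAST)  # provides zlm()
# `covariate` is a string of additional formula terms (presumably "+sex", per the regression above)
zlmCond <- zlm(formula = as.formula(paste0("~age_num", covariate)), sca=sca_filt)
summaryCond <- summary(zlmCond, doLRT="age_num")  # likelihood-ratio test on the age_num term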
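The snippet above does not show how `covariate` is defined. As a sketch, assuming it is a character string such as "+sex" (a hypothetical value chosen to match the regression described in Step 2), the model formula is assembled like this in base R:

```r
# Hypothetical value: extra formula terms as a string (not shown in the protocol)
covariate <- "+sex"

# Build the one-sided formula that is passed to zlm()
model_formula <- as.formula(paste0("~age_num", covariate))
print(model_formula)

# Variables the model will look up in the cell metadata
model_terms <- all.vars(model_formula)

# With an empty string, the model reduces to age only
age_only_formula <- as.formula(paste0("~age_num", ""))
```

Building the formula from a string makes it easy to leave out a covariate, but whether the original script does so is an assumption here.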
R code summarizing MAST results:
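# Long-format results table (a data.table), one row per gene, contrast, and component
summaryDt <- summaryCond$datatable
# P-value for age_num from the hurdle ("H") component
dt1 <- summaryDt[contrast=="age_num" & component=="H", .(primerid, `Pr(>Chisq)`)]
# Coefficient and z-score for age_num from the "logFC" component
dt2 <- summaryDt[contrast=="age_num" & component=="logFC", .(primerid, coef, z)]
# Combine both per gene and give the columns descriptive names
de_res <- merge(dt1, dt2, by="primerid")
colnames(de_res) <- c("gene", "age.H_p", "age.logFC", "age.logFC_z")
# Benjamini-Hochberg FDR correction across genes
de_res$age.H_fdr <- p.adjust(de_res$age.H_p, "fdr")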
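In base R, the "fdr" method of `p.adjust` is an alias for the Benjamini-Hochberg ("BH") procedure. The sketch below illustrates this on made-up P-values; the gene names and numbers are hypothetical and are not results from this protocol.

```r
# Hypothetical P-values for four genes (illustration only)
p_raw <- c(g1 = 0.01, g2 = 0.04, g3 = 0.03, g4 = 0.2)

# "fdr" and "BH" select the same Benjamini-Hochberg adjustment
p_fdr <- p.adjust(p_raw, method = "fdr")
p_bh  <- p.adjust(p_raw, method = "BH")

# Side-by-side view of raw and adjusted values
print(data.frame(gene = names(p_raw), p = p_raw, fdr = p_fdr))
```

Adjusted values are never smaller than the raw P-values and depend on how many genes are tested together, so the correction should be run on the full set of genes for a tissue-cell type.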
R script: https://github.com/czbiohub/tabula-muris-senis/blob/master/2_aging_signature/job.DGE_analysis/DGE_analysis.R
Other code for DGE analysis: https://github.com/czbiohub/tabula-muris-senis/tree/master/2_aging_signature/job.DGE_analysis