All analyses were performed in R version 3.6.1. The GSE133054 dataset is provided as a matrix of raw gene counts and was normalized with the voom function in the “limma” package.16 The downloaded GSE141910 dataset was normalized with voom in the same way. The GSE36961 dataset was normalized by Fastlo normalization and log2 transformation.17
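As a rough illustration of what normalizing raw counts involves, the following Python sketch computes the log2 counts-per-million values on which voom is based. It is not the authors' code: the study used R and limma, the toy count matrix and the function name `log2_cpm` are hypothetical, and the sketch assumes plain column sums as library sizes. It also omits the precision weights that voom estimates from the mean-variance trend.

```python
import math

def log2_cpm(counts):
    """Return log2 counts-per-million for a genes x samples count matrix.

    counts[g][s] is the raw count of gene g in sample s. The offsets
    (0.5 on the count, 1 on the library size) keep zero counts finite.
    """
    n_samples = len(counts[0])
    # Library size of each sample = sum of its counts over all genes
    # (assumption: no additional scaling factors are applied).
    lib_sizes = [sum(row[s] for row in counts) for s in range(n_samples)]
    return [
        [
            math.log2((row[s] + 0.5) / (lib_sizes[s] + 1.0) * 1e6)
            for s in range(n_samples)
        ]
        for row in counts
    ]

# Hypothetical toy data: 3 genes (rows) x 2 samples (columns).
toy_counts = [
    [10, 20],
    [0, 5],
    [90, 175],
]
logcpm = log2_cpm(toy_counts)
```

Because each count is scaled by its own sample's library size, samples sequenced to different depths become comparable before any downstream modelling.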
We merged the GSE133054 and GSE141910 datasets and used the “sva” package to correct the batch effect.18 Two-dimensional principal component analysis (PCA) cluster plots were used to show the sample distribution before and after correction. The merged, batch-corrected dataset was used for further analysis. Differentially expressed genes (DEGs) were screened using the “limma” package,16 with the cutoff criteria set as adjusted P < 0.05 and |log2 fold change (log2 FC)| ≥ 1. A volcano plot and a heatmap were used to display the DEGs.
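The cutoff criteria can be sketched as a simple filter. The Python example below is an illustration only, not the authors' R code: the gene names, values, and the function `classify_deg` are hypothetical, and the labels "up" and "down" assume that a positive log2 FC means higher expression in the group of interest, which depends on how the contrast is defined.

```python
def classify_deg(log2_fc, adj_p, fc_cutoff=1.0, p_cutoff=0.05):
    """Label a gene using adjusted P < p_cutoff and |log2 FC| >= fc_cutoff."""
    if adj_p < p_cutoff and abs(log2_fc) >= fc_cutoff:
        return "up" if log2_fc > 0 else "down"
    return "ns"  # not significant under these criteria

# Hypothetical results: (gene, log2 FC, adjusted P value).
toy_results = [
    ("GENE_A", 2.3, 0.001),   # passes both criteria, positive change
    ("GENE_B", -1.0, 0.04),   # |log2 FC| exactly 1 still passes (>=)
    ("GENE_C", 0.8, 0.0001),  # significant, but fold change too small
    ("GENE_D", 1.5, 0.05),    # adjusted P of exactly 0.05 fails (strict <)
]

labels = {gene: classify_deg(fc, p) for gene, fc, p in toy_results}
degs = [gene for gene, label in labels.items() if label != "ns"]
```

Note the asymmetry at the boundaries: the fold-change criterion is inclusive, whereas the adjusted P criterion is strict.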