Permutation-Based Fold Change Test

Xuelin He
Li Liu
Baode Chen
Chao Wu

Here, we describe a simple method named CTSFinder, which identifies the cell types that differ between case and control samples.

First, we conducted differential gene expression analysis between the case and control samples. For the simulated bulk RNA-Seq data, we input the processed read files to DESeq2 (Love et al., 2014) and set the mode to “moderated log2 fold changes” to calculate the log2-transformed fold change (log2(FC)) of each gene between samples. For the bulk RNA-Seq data from 17 organs, we downloaded the raw read files and used DESeq2 (Love et al., 2014) with the same mode to calculate the log2(FC) value of each gene between samples. For the bulk RNA-Seq data from the in vivo and in vitro developing mouse retina, we downloaded the CPM (counts per million reads mapped) values. For the other bulk RNA-Seq data, we downloaded the FPKM (fragments per kilobase of exon model per million reads mapped) values. For these CPM and FPKM data, we calculated the median value of each gene in the case samples and in the control samples. Then, for each gene, we took the larger of 1 and its median value in the case samples and the larger of 1 and its median value in the control samples, and calculated the log2(FC) value from these two values.
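The median-based calculation for CPM or FPKM data can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `floored_log2_fc`, the genes-by-samples array layout, and the choice of case over control as the direction of the ratio are assumptions made for the example.

```python
import numpy as np

def floored_log2_fc(case, control, floor=1.0):
    """Sketch of a per-gene log2 fold change from expression values.

    case, control: arrays of shape (n_genes, n_samples) holding CPM or
    FPKM values (layout assumed for this example).
    """
    # Median expression of each gene within each group.
    case_median = np.median(np.asarray(case, dtype=float), axis=1)
    control_median = np.median(np.asarray(control, dtype=float), axis=1)

    # Take the larger of 1 and the median, so that low or zero
    # expression does not produce extreme or undefined ratios.
    case_floored = np.maximum(case_median, floor)
    control_floored = np.maximum(control_median, floor)

    # Assumption: fold change is case relative to control.
    return np.log2(case_floored / control_floored)

# Toy data: three genes, three samples per group.
case = [[8.0, 16.0, 4.0], [0.2, 0.1, 0.3], [2.0, 2.0, 2.0]]
control = [[1.0, 2.0, 4.0], [0.0, 0.5, 0.4], [8.0, 8.0, 16.0]]
log2_fc = floored_log2_fc(case, control)
```

In this toy input, the second gene has medians below 1 in both groups, so both are raised to 1 and its log2(FC) is zero; such genes are the ones removed by the zero filter in the next step.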

Second, we filtered out the genes with log2(FC) equal to zero. We then counted the sequenced genes in each of the 46 CTS gene clusters and selected the clusters with 10 or more expressed genes for further analysis.
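A minimal sketch of this filtering step is given below. It assumes that the "expressed" genes are those remaining after the zero filter, which is one reading of the text; the function name `select_clusters` and the dictionary inputs are hypothetical.

```python
def select_clusters(log2_fc_by_gene, clusters, min_genes=10):
    """Drop zero-log2(FC) genes, then keep clusters with enough genes.

    log2_fc_by_gene: dict mapping gene name -> log2(FC)
    clusters: dict mapping cluster name -> list of gene names
    """
    # Assumption: genes with log2(FC) == 0 are treated as not expressed.
    expressed = {g: v for g, v in log2_fc_by_gene.items() if v != 0}

    kept = {}
    for name, genes in clusters.items():
        members = [g for g in genes if g in expressed]
        if len(members) >= min_genes:
            kept[name] = members
    return expressed, kept

# Toy data: every fourth gene has log2(FC) equal to zero.
log2_fc_by_gene = {f"g{i}": (0.0 if i % 4 == 0 else 1.5) for i in range(40)}
clusters = {
    "cluster_A": [f"g{i}" for i in range(0, 20)],   # 15 genes survive
    "cluster_B": [f"g{i}" for i in range(20, 28)],  # 6 genes survive
}
expressed, kept = select_clusters(log2_fc_by_gene, clusters)
```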

Third, for each gene cluster, we calculated the median of the log2(FC) values of its genes, denoted median(log2(FC)all). We then shuffled the log2(FC) values of all expressed genes 10,000 times and, after each shuffle, recalculated the median for the gene cluster, denoted median(log2(FC)perm), to obtain a set of median(log2(FC)perm) values. If median(log2(FC)all) ≥ 0, we took the frequency of values in the median(log2(FC)perm) set equal to or higher than median(log2(FC)all) as the p value. If median(log2(FC)all) < 0, we took the frequency of values in the median(log2(FC)perm) set equal to or lower than median(log2(FC)all) as the p value. We calculated median(log2(FC)all) and the p value for each gene cluster in this way.
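The permutation procedure can be sketched as below, under stated assumptions: the log2(FC) values of all expressed genes are held in one array, a cluster is given by the positions of its genes in that array, and NumPy's random generator performs the shuffling. The function name `permutation_median_test` and the fixed seed are illustrative choices, not part of the published method.

```python
import numpy as np

def permutation_median_test(values, cluster_idx, n_perm=10_000, seed=0):
    """One-sided permutation test on the median log2(FC) of a gene cluster.

    values: log2(FC) of all expressed genes
    cluster_idx: positions of the cluster's genes within `values`
    """
    rng = np.random.default_rng(seed)
    values = np.asarray(values, dtype=float)
    cluster_idx = np.asarray(cluster_idx)

    # Observed statistic: median(log2(FC)all).
    observed = float(np.median(values[cluster_idx]))

    # Shuffle all values and recompute the cluster median each time
    # to build the median(log2(FC)perm) set.
    perm_medians = np.empty(n_perm)
    for i in range(n_perm):
        shuffled = rng.permutation(values)
        perm_medians[i] = np.median(shuffled[cluster_idx])

    # The tail is chosen by the sign of the observed median.
    if observed >= 0:
        p_value = float(np.mean(perm_medians >= observed))
    else:
        p_value = float(np.mean(perm_medians <= observed))
    return observed, p_value

# Toy data: a 15-gene cluster with log2(FC) = 3 among 500 background genes.
background = np.linspace(-1.0, 1.0, 500)
values = np.concatenate([np.full(15, 3.0), background])
cluster_idx = np.arange(15)
observed, p_value = permutation_median_test(values, cluster_idx, n_perm=2000)
```

With 10,000 permutations the smallest non-zero frequency is 0.0001, so a threshold of p < 0.001 corresponds to fewer than 10 permuted medians being at least as extreme as the observed one.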

Finally, we identified the significant gene clusters with median(log2(FC)all) and the p value. We identified the significantly up-regulated gene clusters in the bulk simulated RNA-Seq data and the bulk organ RNA-Seq data with median(log2(FC)all) > 1 and p < 0.001. We identified the significantly up- or down-regulated gene clusters in the mouse developing liver RNA-Seq data with median(log2(FC)all) > 1 or median(log2(FC)all) < −1 and p < 0.001. We identified the significantly up-regulated gene clusters in the giNPC data and iPS cell data, and in the in vivo and in vitro developing mouse retina data, with median(log2(FC)all) > 1 and p < 0.001.
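The thresholds above can be expressed as a small decision rule. This is an illustrative sketch only; the function name `call_cluster`, the returned labels, and the `allow_down` switch (used here to model the liver data, where down-regulated clusters were also called) are assumptions.

```python
def call_cluster(median_log2_fc, p_value, fc_cutoff=1.0, p_cutoff=0.001,
                 allow_down=False):
    """Label a gene cluster from its median log2(FC) and permutation p value."""
    if p_value >= p_cutoff:
        return "not significant"
    if median_log2_fc > fc_cutoff:
        return "up"
    # Down-regulated clusters are only called when requested.
    if allow_down and median_log2_fc < -fc_cutoff:
        return "down"
    return "not significant"

calls = [
    call_cluster(1.8, 0.0002),                    # passes both thresholds
    call_cluster(1.8, 0.02),                      # p value too high
    call_cluster(-1.6, 0.0),                      # down, but not requested
    call_cluster(-1.6, 0.0, allow_down=True),     # down, liver-style rule
]
```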
