Data analysis of transcriptional changes

Thomas P. Wytock
Aretha Fiebig
Jonathan W. Willett
Julien Herrou
Aleksandra Fergin
Adilson E. Motter
Sean Crosson

To categorize genetic responses as restorative or compensatory adaptive evolution (R-AE and C-AE, respectively), we first used Venn diagrams to represent the responses. For each gene, we tested for changes in transcription (statistically significant, with a fold change > 1.5) in three comparisons: the primary deletion strain versus the WT strain, the primary deletion strain versus an associated sup strain, and the WT strain versus the same sup strain. Genes that change significantly only upon the primary deletion (WT vs. primary deletion) and upon adaptive evolution (primary deletion vs. sup) exhibit R-AE. Genes that change significantly in all three comparisons, or in any pair of comparisons that includes WT vs. sup, exhibit C-AE (Fig 5B, S3–S7 Tables).

For each primary deletion strain and its associated sup strains, we also compared the transcriptional state of the primary deletion and of all sup strains with that of the WT strain (Fig 5C, S3–S7 Tables). We identified the set of genes whose expression differs between the primary deletion and the WT strain but does not differ between the WT strain and any sup strain. All genes in this set exhibit R-AE, since their transcription in each adaptively evolved strain is statistically indistinguishable from that in the WT strain. Genes exhibiting C-AE, in contrast, are contained in the set of genes whose expression differs between the WT strain and all sup strains.
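The per-gene Venn-diagram rule can be written as a small decision function. The following is a minimal sketch, not the authors' code: the function name `classify_gene`, its three Boolean arguments (each meaning "significant with fold change > 1.5 in that comparison"), and the label `"unclassified"` for genes that match neither rule are all assumptions made for illustration.

```python
def classify_gene(wt_vs_del: bool, del_vs_sup: bool, wt_vs_sup: bool) -> str:
    """Classify one gene from three pairwise comparisons (hypothetical helper).

    Each argument is True when the gene changes significantly in that
    comparison. Returns "R-AE", "C-AE", or "unclassified".
    """
    # Restorative: changed by the deletion and by adaptation, but the
    # sup strain is indistinguishable from WT.
    if wt_vs_del and del_vs_sup and not wt_vs_sup:
        return "R-AE"
    # Compensatory: all three comparisons, or any pair that includes
    # the WT vs. sup comparison.
    n_significant = wt_vs_del + del_vs_sup + wt_vs_sup
    if wt_vs_sup and n_significant >= 2:
        return "C-AE"
    # Assumption: genes significant in at most one comparison are left
    # unlabeled; the text does not assign them to either class.
    return "unclassified"
```

For example, a gene that differs between WT and the deletion strain and between the deletion and sup strains, but not between WT and sup, is labeled `"R-AE"`.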

We used a Gene Change Score (GCS) to represent both R-AE and C-AE responses more easily (as described in Fig 5). The GCS is based on the sum of Boolean variables P + Q + R + S + T + U, where each variable is 1 if true and 0 if false. The variables are defined as follows: P indicates whether the gene expression change resulting from the gene deletion is statistically significant; Q indicates whether the log fold change resulting from the gene deletion is larger than the threshold; R indicates whether the gene expression change resulting from the adaptive evolution is statistically significant; S indicates whether the log fold change resulting from the adaptive evolution is larger than the threshold; T indicates whether the two log fold changes have equal magnitude and opposite signs; and U indicates whether the two log fold changes have the same sign and the post-adaptation expression level is statistically significantly different from the WT expression level. Because at most one of T and U can be true, the sum is at most 5, and it is divided by 5 to normalize scores to lie between zero and one. To highlight the difference between restorative and compensatory changes, the GCS is defined to be positive if the log fold changes are opposite in sign and negative otherwise.
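As an illustration of the definition, the sketch below computes a GCS for one gene at a single pair of thresholds. Several details are assumptions rather than part of the protocol: the function and argument names, the use of absolute log fold change for Q and S, the default threshold values, and in particular the relative tolerance `rel_tol` used to decide when two magnitudes count as "equal" for T.

```python
import math


def gene_change_score(lfc_del, p_del, lfc_evo, p_evo, p_sup_vs_wt,
                      lfc_threshold=1.0, alpha=0.05, rel_tol=0.25):
    """Signed, normalized Gene Change Score for one gene (illustrative sketch).

    lfc_del, p_del   : log fold change and p-value, deletion vs. WT
    lfc_evo, p_evo   : log fold change and p-value, sup vs. deletion
    p_sup_vs_wt      : p-value, sup vs. WT
    lfc_threshold, alpha, rel_tol : assumed cutoffs, not from the protocol
    """
    opposite = lfc_del * lfc_evo < 0
    same_sign = lfc_del * lfc_evo > 0

    P = p_del < alpha                       # deletion change is significant
    Q = abs(lfc_del) > lfc_threshold        # deletion change is large
    R = p_evo < alpha                       # adaptive change is significant
    S = abs(lfc_evo) > lfc_threshold        # adaptive change is large
    # Equal magnitude (within an assumed tolerance) and opposite sign.
    T = opposite and math.isclose(abs(lfc_del), abs(lfc_evo), rel_tol=rel_tol)
    # Same sign and the sup strain still differs from WT.
    U = same_sign and p_sup_vs_wt < alpha

    magnitude = (P + Q + R + S + T + U) / 5  # T and U are mutually exclusive
    if magnitude == 0:
        return 0.0
    return magnitude if opposite else -magnitude
```

With these assumed defaults, a gene whose expression rises fourfold upon deletion and falls fourfold upon adaptation (`lfc_del=2.0`, `lfc_evo=-2.0`, both significant) scores 1.0, the fully restorative end of the scale.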

To reduce the sensitivity of the GCS to the specific threshold values, we calculated the scores using thresholds from a 5-by-5 grid of log2 fold change versus statistical significance (p-value). The grid is spaced evenly over the interval between the 95th and 99th percentiles for both fold change and statistical significance (the percentiles are recalculated for each initial deletion strain), and we averaged the scores over these 25 threshold combinations. Since C-AE and R-AE are mutually exclusive, we represented all scores on the same axis in Fig 6A by making the C-AE scores negative and the R-AE scores positive, with non-significant changes scored as zero.
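The threshold averaging can be sketched as follows. This is an assumption-laden illustration rather than the authors' implementation: significance is represented here as -log10(p) so that larger values mean stronger evidence, percentiles are computed by linear interpolation, and `score_fn` is a hypothetical callable that returns one gene's score for a given pair of thresholds.

```python
from itertools import product
from statistics import mean


def percentile(values, q):
    """Linear-interpolation percentile of `values` for q in [0, 100]."""
    xs = sorted(values)
    pos = (len(xs) - 1) * q / 100
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)


def threshold_grid(abs_lfcs, neg_log10_ps, percentiles=(95, 96, 97, 98, 99)):
    """Build the 5-by-5 grid of (fold-change, significance) thresholds.

    Both inputs are the per-gene values for ONE deletion strain, so the
    percentiles are recalculated for each strain.
    """
    lfc_thresholds = [percentile(abs_lfcs, q) for q in percentiles]
    sig_thresholds = [percentile(neg_log10_ps, q) for q in percentiles]
    return list(product(lfc_thresholds, sig_thresholds))


def grid_averaged_score(score_fn, grid):
    """Average score_fn(lfc_threshold, sig_threshold) over all grid points."""
    return mean(score_fn(lfc_thr, sig_thr) for lfc_thr, sig_thr in grid)
```

Averaging over the grid means a gene that passes only the more lenient thresholds receives a proportionally smaller score instead of being cut off by a single arbitrary threshold.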
