CRISPRi gene essentiality predictions were made using a modified version of the resampling approach previously used for TnSeq gene essentiality predictions in M. tuberculosis (DeJesus et al., 2017). Briefly, read counts at 24.3 generations were compared ± ATc. Read counts were normalized in two steps, as described in Quantification of sgRNA depletion: first to account for sequencing depth (using TTR), and then using the control sgRNAs. For each gene, normalized counts were permuted across the +ATc and –ATc conditions at 24.3 generations for a total of 20,000 iterations. While permutation tests typically look for differences in mean counts, sgRNAs of different strengths can disproportionately affect the mean for a given gene (e.g., a gene targeted by many weak and few strong sgRNAs). A difference at a lower percentile was therefore the more relevant test statistic, as it is more sensitive to the presence of just a few strong guides. Thus, at each iteration $i$, the difference in the 20th percentile between the counts was estimated:

$$\Delta_i = P_{20}\left(C_g^{+\mathrm{ATc}}\right) - P_{20}\left(C_g^{-\mathrm{ATc}}\right)$$

where $P_{20}$ is the 20th-percentile function and $C_g^{c}$ represents the normalized counts for gene $g$ at a given condition $c$. The 20,000 values of the test statistic obtained across all iterations represented its distribution under the null hypothesis. A p-value was estimated by comparing the observed value of the test statistic to this null distribution. p-values were adjusted for multiple comparisons using the Benjamini-Hochberg procedure (Benjamini and Hochberg, 1995). A fixed p-value threshold was used to assess statistical significance.
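The resampling step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the two-sided comparison, the add-one correction in the p-value, and the pooling of all normalized sgRNA counts for a gene are assumptions made for illustration only.

```python
import numpy as np


def percentile_permutation_test(plus_atc, minus_atc, q=20, n_iter=20_000, seed=0):
    """Permutation test on the difference in the q-th percentile.

    plus_atc / minus_atc: normalized counts for one gene in each condition.
    Assumption: two-sided test with an add-one correction.
    """
    rng = np.random.default_rng(seed)
    plus_atc = np.asarray(plus_atc, dtype=float)
    minus_atc = np.asarray(minus_atc, dtype=float)

    # Observed test statistic: +ATc percentile minus -ATc percentile.
    observed = np.percentile(plus_atc, q) - np.percentile(minus_atc, q)

    pooled = np.concatenate([plus_atc, minus_atc])
    n_plus = plus_atc.size
    null = np.empty(n_iter)
    for i in range(n_iter):
        # Shuffle condition labels by permuting the pooled counts.
        shuffled = rng.permutation(pooled)
        null[i] = (np.percentile(shuffled[:n_plus], q)
                   - np.percentile(shuffled[n_plus:], q))

    # Fraction of null statistics at least as extreme as the observed one.
    p_value = (np.sum(np.abs(null) >= abs(observed)) + 1) / (n_iter + 1)
    return observed, p_value


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    p = np.asarray(p_values, dtype=float)
    m = p.size
    order = np.argsort(p)
    scaled = p[order] * m / np.arange(1, m + 1)
    # Enforce monotonicity, working from the largest p-value downward.
    scaled = np.minimum.accumulate(scaled[::-1])[::-1]
    adjusted = np.empty(m)
    adjusted[order] = np.clip(scaled, 0.0, 1.0)
    return adjusted
```

In practice the test would be run once per gene, and the resulting vector of p-values passed to `benjamini_hochberg` before applying the significance threshold.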
For each gene, a summary L2FC was estimated to assess the biological significance of the effect size. L2FC was summarized as the median value of the 10 strongest sgRNAs (i.e., the sgRNAs with the smallest L2FC). The L2FC cutoff was chosen by varying the threshold, comparing the resulting CRISPRi essentiality predictions against the TnSeq predictions of essentiality, and selecting the threshold that maximized the F1-score. The optimal L2FC threshold was estimated at 24.3 generations. Genes exceeding both the L2FC and p-value thresholds were called CRISPRi-essential by our methodology.
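As a rough illustration of the summary statistic and the threshold search, the following Python sketch computes the median L2FC of the strongest guides and scans candidate cutoffs for the best F1-score against reference calls. The function names, the candidate grid, and the choice to apply the significance filter during the scan are hypothetical; the source does not specify these details.

```python
import numpy as np


def summary_l2fc(guide_l2fc, n_strongest=10):
    """Median L2FC of the n strongest guides (those with the smallest L2FC)."""
    values = np.sort(np.asarray(guide_l2fc, dtype=float))
    return float(np.median(values[:n_strongest]))


def f1_score(predicted, reference):
    """F1-score of boolean predictions against boolean reference calls."""
    predicted = np.asarray(predicted, dtype=bool)
    reference = np.asarray(reference, dtype=bool)
    tp = np.sum(predicted & reference)
    fp = np.sum(predicted & ~reference)
    fn = np.sum(~predicted & reference)
    denom = 2 * tp + fp + fn
    return float(2 * tp / denom) if denom else 0.0


def best_l2fc_threshold(gene_l2fc, significant, tnseq_essential, candidates):
    """Return the candidate cutoff with the highest F1-score, and that score.

    Assumption: a gene is called essential when its summary L2FC is below
    the cutoff and its permutation test is significant.
    """
    gene_l2fc = np.asarray(gene_l2fc, dtype=float)
    significant = np.asarray(significant, dtype=bool)
    scores = [f1_score((gene_l2fc < t) & significant, tnseq_essential)
              for t in candidates]
    best = int(np.argmax(scores))
    return candidates[best], scores[best]
```

Here `tnseq_essential` stands in for the TnSeq essentiality calls used as the reference set, and `candidates` would be a grid of L2FC values to test.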