To assess biomarker potential in localized cancer settings, we analyzed the MiTranscriptome dataset, a large-scale ab initio transcriptome meta-assembly of 10,225 RNA-Seq libraries derived from 36 distinct malignant tissues44. Raw read counts per gene, gene and transcript annotations, and sample metadata were combined and converted using the SummarizedExperiment R package to ease downstream analysis in R. The MiTranscriptome dataset was then filtered to retain only tumor subtypes for which ≥10 normal and ≥10 malignant tissue samples were available, so that differential analysis could be performed. This heuristic filter left 12 of 70 distinct tumor subtypes for downstream analysis (lung, kidney, colorectal, stomach, breast, head/neck, uterus, liver, bladder/urothelial, thyroid, prostate, and esophagus).
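The subtype filter described above can be illustrated with a minimal sketch. It is written in Python for illustration only (the analysis itself was carried out in R), and the status labels, the function name `eligible_subtypes`, and the toy sample counts are assumptions, not values from the dataset.

```python
from collections import Counter

MIN_PER_GROUP = 10  # threshold stated in the text: >=10 normal and >=10 malignant


def eligible_subtypes(samples, min_per_group=MIN_PER_GROUP):
    """Return the subtypes that have enough samples in both groups.

    `samples` is an iterable of (subtype, status) pairs, where status is
    "normal" or "malignant" (hypothetical labels chosen for this sketch).
    """
    counts = Counter(samples)  # counts each (subtype, status) pair
    subtypes = {subtype for subtype, _ in counts}
    return sorted(
        s
        for s in subtypes
        if counts[(s, "normal")] >= min_per_group
        and counts[(s, "malignant")] >= min_per_group
    )


# Toy metadata (invented counts, for illustration only).
samples = (
    [("lung", "normal")] * 12
    + [("lung", "malignant")] * 40
    + [("skin", "normal")] * 3          # too few normal samples
    + [("skin", "malignant")] * 25
    + [("thyroid", "normal")] * 10      # exactly at the threshold
    + [("thyroid", "malignant")] * 10
)

kept = eligible_subtypes(samples)
print(kept)
```

In this toy input, only subtypes that reach the threshold in both groups are retained; a subtype with many malignant but few normal samples is dropped because no contrast can be estimated reliably.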
Differential gene analysis between healthy and malignant tissue was performed per tissue using DESeq2 (v1.22.2)44 with the Wald test and Benjamini–Hochberg (BH) correction, on all MiTranscriptome transcripts overlapping hCRISPRs by at least five base pairs (n = 7750). Results were considered significant when they met all of the following criteria: |logFC| ≥ 1, BH-adjusted p-value ≤ 0.01, and an average read count over all samples of at least 10. In total, 3850 distinct MiTranscriptome transcripts overlapping hCRISPRs were observed across all tissues.
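As a rough illustration of how the three significance criteria combine, the sketch below applies a plain Benjamini–Hochberg adjustment and then the thresholds from the text. It is not a reimplementation of DESeq2, which additionally performs steps such as independent filtering before adjustment. The `Result` fields, transcript names, and toy values are hypothetical, and treating logFC as a log2 fold change is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Result:
    transcript: str          # hypothetical identifier
    log_fold_change: float   # assumed to be on the log2 scale
    p_value: float           # raw (unadjusted) p-value
    mean_count: float        # average read count over all samples


def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    m = len(pvalues)
    # Walk from the largest p-value to the smallest, enforcing monotonicity.
    order = sorted(range(m), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for steps_from_top, i in enumerate(order):
        rank = m - steps_from_top  # 1-based rank in ascending order
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def significant(results, min_abs_lfc=1.0, alpha=0.01, min_mean=10.0):
    """Names of transcripts meeting all three criteria from the text."""
    padj = bh_adjust([r.p_value for r in results])
    return [
        r.transcript
        for r, q in zip(results, padj)
        if abs(r.log_fold_change) >= min_abs_lfc
        and q <= alpha
        and r.mean_count >= min_mean
    ]


# Toy results (invented values, for illustration only).
results = [
    Result("T1", 2.3, 0.0001, 150.0),  # meets all criteria
    Result("T2", -1.5, 0.0004, 42.0),  # meets all criteria (down-regulated)
    Result("T3", 0.4, 0.0002, 300.0),  # effect size too small
    Result("T4", 3.0, 0.0003, 4.0),    # average read count too low
    Result("T5", 1.2, 0.2, 80.0),      # adjusted p-value too high
]

hits = significant(results)
print(hits)
```

Applying the absolute value to the fold change keeps both up- and down-regulated transcripts, which matches the |logFC| criterion stated above.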