Single-base mutations were called from the RNA-Seq data using the GATK pipeline. Indels were not considered here because of the technical issues that can arise when calling them from RNA-Seq data. Briefly, STAR v2.7 was used to align the raw RNA reads to the hg38 human genome assembly, and Picard tools were used to mark duplicates. The GATK tools SplitNCigarReads, BaseRecalibrator, and ApplyBQSR were applied to reformat some of the alignments that span introns and to correct the base quality scores. Finally, HaplotypeCaller was used to call variants. The resulting VCF files were annotated using SnpEff and then filtered for possible impact on proteins. First, only SNVs annotated by SnpEff with a HIGH or MODERATE impact were included, and SNVs in splice-site genomic locations were excluded. Second, mutations with an rs ID in dbSNP were excluded. Third, only mutations with a quality score above 100 and a Fisher Strand (FS) score below 30.0 were included. Finally, mutations called in ten or more samples were filtered out, with the exception of known mutation hotspots (FGFR3 and PIK3CA). When calculating the RNA-derived mutational load, we excluded mutations that were found significantly more often in samples sequenced on an Illumina NovaSeq 6000 (new, additional RNA-Seq data) than in samples sequenced on an Illumina HiSeq 2000 (original RNA-Seq data) (Fisher’s exact test, p < 0.01), because the samples sequenced on the NovaSeq platform contained considerably more reads. As a result, a total of 791 genes were considered for the RNA-derived mutational load. Furthermore, we validated RNA-derived mutations in DNA for a subset of patients (n = 38) for whom whole-exome sequencing data were available. Mutations with > 10 reads in tumor and germline DNA were considered, and a mutation was called observed when the frequency of the alternate allele was above 2%.
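
The four post-annotation filtering steps described above can be illustrated with a minimal Python sketch. It is not the authors' code. The `Variant` record, its field names, and the helper functions are hypothetical and assume the VCF and SnpEff annotations have already been parsed into one record per call. Two further assumptions are made for illustration: splice-site variants are recognized by the substring "splice" in the SnpEff effect term, and the hotspot exemption is applied at the gene level.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Iterable, List, Optional

# Hypothetical record for one annotated variant call in one sample.
@dataclass(frozen=True)
class Variant:
    sample: str
    gene: str
    chrom: str
    pos: int
    ref: str
    alt: str
    qual: float           # VCF QUAL column
    fs: float             # Fisher Strand (FS) annotation
    impact: str           # SnpEff impact: HIGH, MODERATE, LOW or MODIFIER
    effect: str           # SnpEff effect term, e.g. "missense_variant"
    rs_id: Optional[str]  # dbSNP rs ID, or None if absent

# Assumption: the hotspot exemption is applied per gene.
HOTSPOT_GENES = {"FGFR3", "PIK3CA"}


def passes_site_filters(v: Variant) -> bool:
    """Per-call filters: SNV only, protein impact, not in dbSNP, QUAL and FS."""
    is_snv = len(v.ref) == 1 and len(v.alt) == 1
    return (
        is_snv
        and v.impact in {"HIGH", "MODERATE"}
        and "splice" not in v.effect  # assumption: splice sites flagged in the effect term
        and not v.rs_id
        and v.qual > 100
        and v.fs < 30.0
    )


def filter_variants(variants: Iterable[Variant], max_samples: int = 10) -> List[Variant]:
    """Apply the per-call filters, then drop sites seen in max_samples or more samples."""
    kept = [v for v in variants if passes_site_filters(v)]

    # Count the distinct samples in which each site was called.
    samples_per_site = defaultdict(set)
    for v in kept:
        samples_per_site[(v.chrom, v.pos, v.ref, v.alt)].add(v.sample)

    return [
        v
        for v in kept
        if v.gene in HOTSPOT_GENES
        or len(samples_per_site[(v.chrom, v.pos, v.ref, v.alt)]) < max_samples
    ]
```

In this sketch the recurrence count is taken after the per-call filters. Whether the original analysis counted recurrence before or after those filters is not stated, so the order is also an assumption.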

