Computational procedures

Mor Hanan
Alon Simchovitz
Nadav Yayon
Shani Vaknine
Roni Cohen‐Fultheim
Miriam Karmon
Nimrod Madrer
Talia Miriam Rohrlich
Moria Maman
Estelle R Bennett
David S Greenberg
Eran Meshorer
Erez Y Levanon
Hermona Soreq
Sebastian Kadener

To detect circRNA and mRNA transcripts, we constructed RNA‐seq libraries using the rRNA depletion method, which enables simultaneous detection and profiling by sequencing of both circular RNAs and linear mRNA transcripts in these samples (Fig EV1C and D). All sequences were also analyzed for mRNA expression using bowtie2 and STAR to align them accurately to the transcriptome. In the pooled libraries, rRNA abundance ranged between 8 and 12%, and the alignment rate of the remaining transcripts (excluding rRNA) to the genome ranged between 76 and 82%. We then used a dedicated bioinformatics pipeline to detect and annotate circRNAs (Memczak et al, 2013; Pamudurti et al, 2017). Reads supporting particular head‐to‐tail junctions were used as an absolute measure of circRNA abundance and were normalized using the DESeq2 algorithm in R. This analysis identified thousands of circRNAs, a subset of which were differentially expressed between PD patients and control volunteers. For DESeq2 normalization, the circRNA counts were normalized together with all mapped reads from the STAR alignment (mRNAs, lncRNAs, etc.), according to the aligned reads detected and quantified in each sample. Additionally, we used the Sailfish pipeline to achieve alignment‐free isoform quantification from RNA‐seq reads using lightweight algorithms (Langmead & Salzberg, 2012; Kim et al, 2013; Patro et al, 2014).
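The normalization step can be illustrated with a minimal sketch of median-of-ratios size factors, the scheme DESeq2 uses by default. This is a plain Python re-implementation of the idea for illustration only, not the DESeq2 R code used in the study. The sample names and counts are hypothetical, and the count vectors are assumed to hold linear-transcript counts and circRNA junction counts together, so that circRNAs are scaled by the same per-sample factor as all other mapped reads.

```python
import math
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factors.

    counts: dict mapping sample name -> list of raw counts, with every
    list in the same feature order (linear transcripts and circRNA
    junctions together; an assumption of this sketch).
    """
    samples = list(counts)
    n_features = len(counts[samples[0]])

    # Log geometric mean of each feature across samples. Features with a
    # zero count in any sample are skipped, as in the standard scheme.
    log_geo_means = []
    for i in range(n_features):
        values = [counts[s][i] for s in samples]
        if all(v > 0 for v in values):
            log_geo_means.append(sum(math.log(v) for v in values) / len(values))
        else:
            log_geo_means.append(None)

    if all(g is None for g in log_geo_means):
        raise ValueError("no feature has non-zero counts in every sample")

    # Size factor = median ratio of a sample's counts to the geometric means.
    factors = {}
    for s in samples:
        log_ratios = [
            math.log(counts[s][i]) - g
            for i, g in enumerate(log_geo_means)
            if g is not None
        ]
        factors[s] = math.exp(median(log_ratios))
    return factors


def normalize(counts, factors):
    """Divide every raw count by its sample's size factor."""
    return {s: [c / factors[s] for c in counts[s]] for s in counts}


# Hypothetical toy data: sample B was sequenced twice as deeply as sample A.
raw = {"A": [10, 20, 40, 0], "B": [20, 40, 80, 5]}
factors = size_factors(raw)
normalized = normalize(raw, factors)
```

In this toy case the size factor of B is twice that of A, so the first three features end up with identical normalized values in both samples, while the last feature (zero in A) does not contribute to the factors.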

The resultant libraries were sequenced deeply (~50 M reads per sample on average), allowing reliable detection and quantification of mRNAs, non‐coding RNAs and especially circRNAs. The large number of sequenced samples further helped to account for the individual heterogeneity characteristic of human samples, especially diseased brain tissues. Specifically, mRNA reads were normalized, and differentially expressed transcripts were analyzed using the DESeq2 algorithm in R. Dataset EV2 details the number of reads and of detected circRNAs in each pool of samples, and the number of reads in each sample. Generally, similar numbers of circRNAs were detected across samples (roughly 12K circRNAs among 50 M sequenced reads in each sample). We excluded from the analysis all samples with < 5,000 detected circRNA reads, as detailed in Fig EV1E–G; the excluded samples included MTG2, MTG10, SN1, SN16, and SN19. A total of 38,860 transcripts could be quantified: 36,311 mRNAs and 2,549 circRNAs. For quantification of circRNA abundance in each tissue and condition, circRNAs were included in the analysis only if they were supported by > 5 or > 10 reads per tissue. For differential expression and GO analysis, we corrected for cell composition according to three cell markers: transmembrane protein 119 (TMEM119) for microglia, aldehyde dehydrogenase 1 family member L1 (ALDH1L1) for astrocytes, and synaptotagmin 1 (Syt1) for neurons, using the edgeR package in R (Robinson et al, 2010; McCarthy et al, 2012). Enriched GO terms were analyzed using DAVID Bioinformatics Resources (da Huang et al, 2009).
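The two filtering rules described above (dropping samples with fewer than 5,000 circRNA reads, then keeping only circRNAs above a per-tissue read threshold) might be sketched as follows. The data layout, function names, sample names, and counts are all hypothetical. Summing reads across the samples of a tissue is one possible reading of "reads per tissue" and is an assumption of this sketch, not a statement of how the authors computed it.

```python
from collections import defaultdict

MIN_CIRC_READS_PER_SAMPLE = 5000  # sample-level cutoff stated in the text
MIN_READS_PER_CIRCRNA = 5         # the text uses > 5 or > 10 reads per tissue


def filter_samples(junction_counts, min_total=MIN_CIRC_READS_PER_SAMPLE):
    """Keep samples whose total circRNA junction reads reach min_total.

    junction_counts: dict sample -> dict circRNA id -> junction read count.
    Samples with fewer than min_total reads are dropped; a sample sitting
    exactly on the cutoff is kept here (an assumption of this sketch).
    """
    return {
        sample: counts
        for sample, counts in junction_counts.items()
        if sum(counts.values()) >= min_total
    }


def circrnas_per_tissue(junction_counts, tissue_of, min_reads=MIN_READS_PER_CIRCRNA):
    """Return, for each tissue, the circRNAs with more than min_reads reads.

    Reads are summed over all samples of a tissue (assumed interpretation).
    tissue_of: dict sample -> tissue label.
    """
    totals = defaultdict(lambda: defaultdict(int))
    for sample, counts in junction_counts.items():
        tissue = tissue_of[sample]
        for circ_id, reads in counts.items():
            totals[tissue][circ_id] += reads
    return {
        tissue: {circ_id for circ_id, reads in circ.items() if reads > min_reads}
        for tissue, circ in totals.items()
    }


# Hypothetical toy input: three samples from two tissues.
junction_counts = {
    "S1": {"circA": 4000, "circB": 1500},
    "S2": {"circA": 3, "circB": 2},
    "S3": {"circA": 4990, "circB": 6, "circC": 4},
}
tissue_of = {"S1": "T1", "S2": "T2", "S3": "T1"}

kept = filter_samples(junction_counts)
expressed = circrnas_per_tissue(kept, tissue_of)
```

Here S2 is dropped at the sample level, and circC is dropped from tissue T1 because its summed count does not exceed the threshold. Passing `min_reads=10` applies the stricter of the two cutoffs mentioned in the text.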

Sequencing samples were analyzed only if the number of circRNA reads was > 5,000. RNA quality was pre‐established. RNA libraries were prepared from samples that were not labeled as PD or control. RNA sequencing data were checked for variance and normalized using the edgeR and DESeq2 packages in R.
