First, quality control was performed on the raw sequencing reads with the open-source tool FastQC (www.bioinformatics.babraham.ac.uk/projects/fastqc/). Adaptors were then trimmed with cutadapt (34), and only paired-end reads with at least 15 nt remaining were kept. The clean reads were mapped to the reference genome (hg19) using HISAT2 (35) with default parameters, except that only one alignment was reported per read. Integrative Genomics Viewer was used for data visualization (36). We scanned for the motif ACA (and its reverse complement TGT) in the reference genome and in the mapping result file, respectively, and counted the number of ACA/TGT motifs located internally in or at the ends of sequence reads. Because reads shorter than 15 nt had been filtered out, an ACA/TGT motif was treated as "internal" to a read only if it lay more than 15 nt from the end of that read. Each ACA/TGT motif in the reference genome that was supported by more than 10 reads carrying the motif internally was treated as a candidate m6A site. The ratio of sequence reads with an internal ACA to reads split at the motif represents the relative methylation ratio of each site. Candidate sites with repeated or contiguous ACA motifs were removed.
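
The counting step above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names are hypothetical, coordinates are assumed to be 0-based and half-open, and only plus-strand ACA motifs are handled. A read is assumed to count as "split" when it begins at the first base of the motif or stops immediately before it, and the methylation ratio is assumed to be reported as the fraction internal / (internal + split). The text does not spell out either of these conventions.

```python
import re
from typing import Iterable, List, Optional, Tuple

MIN_END_DISTANCE = 15    # nt, from the text: motif must lie more than 15 nt from a read end
MIN_INTERNAL_READS = 10  # from the text: more than 10 reads must carry the motif internally


def find_motifs(sequence: str) -> List[Tuple[int, str]]:
    """Return (0-based start, motif) for every ACA/TGT, including overlapping hits."""
    pattern = re.compile(r"(?=(ACA|TGT))")  # lookahead so overlapping motifs are reported
    return [(m.start(), m.group(1)) for m in pattern.finditer(sequence.upper())]


def classify_read(read_start: int, read_end: int,
                  motif_start: int, motif_len: int = 3) -> str:
    """Classify one aligned read relative to a plus-strand motif."""
    motif_end = motif_start + motif_len
    # Assumption: a split read starts at the motif or ends right before it.
    if read_start == motif_start or read_end == motif_start:
        return "terminal"
    # Internal: the motif is more than MIN_END_DISTANCE nt away from both read ends.
    if (motif_start - read_start > MIN_END_DISTANCE
            and read_end - motif_end > MIN_END_DISTANCE):
        return "internal"
    return "ignored"


def summarize_site(reads: Iterable[Tuple[int, int]],
                   motif_start: int) -> Tuple[int, int, Optional[float], bool]:
    """Return (internal, terminal, internal fraction, is_candidate) for one motif."""
    counts = {"internal": 0, "terminal": 0, "ignored": 0}
    for read_start, read_end in reads:
        counts[classify_read(read_start, read_end, motif_start)] += 1
    informative = counts["internal"] + counts["terminal"]
    # Assumption: the ratio is expressed as a fraction of informative reads.
    fraction = counts["internal"] / informative if informative else None
    is_candidate = counts["internal"] > MIN_INTERNAL_READS
    return counts["internal"], counts["terminal"], fraction, is_candidate
```

For example, a motif at position 50 covered by twelve reads spanning 0 to 100 and four reads starting at 50 would yield 12 internal reads, 4 split reads, and a fraction of 0.75 under these assumptions.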

Because RNA secondary structure could affect the reaction efficiency, we predicted the probability that each RNA fragment forms secondary structure and removed candidate sites that tended to reside in double-stranded regions. The RNAfold program of the ViennaRNA package (30) was used to predict the intramolecular secondary structure of each read supporting a candidate site. The condensed representation of the pair probabilities of each nucleotide was parsed according to the tutorials of the ViennaRNA package. The pairing probability of the NNACA motif was calculated, and candidate sites with high pairing probability were discarded. The FTO demethylation treatment served as a negative control: candidate sites whose methylation ratio decreased by at least 10% were retained as m6A sites. A metagene plot describing the relative location of each m6A site on the 3′UTR, CDS, and 5′UTR of mRNA was also calculated and plotted. The raw reads of the MeRIP-seq datasets were downloaded from Meyer et al. (12) (GSE29714) and mapped to the human (hg19) reference genome using HISAT2. Reads within −400 to +400 nt of each m6A site were plotted. Because the sequencing data were single end, we extended the reads to 200 nt to simulate the IP peak.
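
The read extension and flanking-window step can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names are hypothetical, coordinates are 0-based and half-open, all reads are assumed to lie on the same chromosome as the site, and each read is assumed to be extended from its 5′ end in the direction of its strand.

```python
from typing import Iterable, List, Tuple

FLANK = 400         # nt on each side of an m6A site, from the text
FRAGMENT_LEN = 200  # nt, length to which single-end reads are extended, from the text


def extend_read(start: int, end: int, strand: str) -> Tuple[int, int]:
    """Extend a single-end read to FRAGMENT_LEN, keeping its 5' end fixed."""
    if end - start >= FRAGMENT_LEN:
        return start, end  # already long enough, leave unchanged
    if strand == "+":
        return start, start + FRAGMENT_LEN
    # Minus strand: the 5' end is the rightmost coordinate, so extend leftwards.
    return max(0, end - FRAGMENT_LEN), end


def site_coverage(site: int, reads: Iterable[Tuple[int, int, str]]) -> List[int]:
    """Per-base coverage of extended reads from site - FLANK to site + FLANK."""
    window_start = site - FLANK
    window_end = site + FLANK + 1  # half-open, so the window holds 2 * FLANK + 1 bases
    coverage = [0] * (window_end - window_start)
    for start, end, strand in reads:
        ext_start, ext_end = extend_read(start, end, strand)
        # Clip the extended read to the window before counting.
        for pos in range(max(ext_start, window_start), min(ext_end, window_end)):
            coverage[pos - window_start] += 1
    return coverage
```

Summing the vectors returned by `site_coverage` across all sites would give an aggregate profile centered on index 400, which corresponds to the m6A site itself.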



