Dataset Construction

Lian Liu
Xiujuan Lei
Zengqiang Fang
Yujiao Tang
Jia Meng
Zhen Wei

To predict m6A methylation sites in lncRNA, we used the ground-truth data from the WHISTLE project (Zhang et al., 2019), which comprises six single-base resolution m6A datasets obtained from five cell types (see Table 1): HEK293T (two samples), MOLM13, A549, CD8T, and HeLa. The lncRNA annotation was obtained from Bioconductor via the TxDb.Hsapiens.UCSC.hg19.lincRNAsTranscripts R package. Positive m6A sites were defined as sites within DRACH consensus motifs that were reported in at least two of the six datasets. Negative sites were randomly selected from the non-positive DRACH adenosines on the full transcripts containing the positive sites. Each training set contained equal numbers of positive and negative sites, and all sites were restricted to DRACH motifs. In addition, sites in regions that can be mapped to multiple genes were excluded.

Table 1. Single-base resolution m6A datasets used in lncRNA m6A prediction.
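The negative-site selection described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline (which was built on R/Bioconductor annotations); the function names `drach_adenosines` and `sample_negatives` are hypothetical, and the sketch assumes a single transcript sequence given as a string with 0-based positions. It relies only on the standard IUPAC meaning of DRACH (D = A/G/U, R = A/G, H = A/C/U).

```python
import random
import re

# DRACH written in the DNA alphabet (U is converted to T before matching):
# D = A/G/T, R = A/G, A, C, H = A/C/T.
# The lookahead allows overlapping motifs to be reported.
DRACH = re.compile(r"(?=([AGT][AG]AC[ACT]))")


def drach_adenosines(seq):
    """Return 0-based positions of the central A of every DRACH motif."""
    normalized = seq.upper().replace("U", "T")
    return [m.start() + 2 for m in DRACH.finditer(normalized)]


def sample_negatives(seq, positive_positions, n, seed=0):
    """Randomly pick n DRACH adenosines that are not positive sites.

    Assumption: one transcript at a time; the caller chooses n so that
    the negatives match the number of positives.
    """
    positives = set(positive_positions)
    candidates = [p for p in drach_adenosines(seq) if p not in positives]
    if len(candidates) < n:
        raise ValueError("not enough non-positive DRACH adenosines")
    return sorted(random.Random(seed).sample(candidates, n))
```

For example, with the toy sequence `GGACTTTAGACAGGACA` and position 9 treated as the positive site, the two remaining DRACH adenosines (positions 2 and 14) form the negative pool.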

In total, 2,582 m6A sites in lncRNA were collected in full transcript mode (1,291 positive and 1,291 negative), and 2,214 sites were obtained in mature lncRNA mode (1,107 positive and 1,107 negative). In both modes, four-fifths of the sites were randomly selected for training and the remaining one-fifth was retained for testing. For comparison, we also generated matched data for mRNAs: 57,105 positive sites and the same number of negative sites in full transcript mode, and 54,476 positive and 54,476 negative sites in mature RNA mode. There were far more mRNA methylation sites than lncRNA sites, suggesting that mRNA methylation sites usually dominate epitranscriptome profiling results.
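The four-fifths/one-fifth random split can be sketched as follows. This is an illustrative assumption rather than the authors' code: the helper name `split_sites`, the fixed seed, and the rounding rule (integer floor of four-fifths) are all hypothetical, and the text does not state whether the split was stratified by class.

```python
import random


def split_sites(sites, seed=0):
    """Shuffle the sites and return (train, test) as a 4/5 : 1/5 split.

    Assumption: the training size is floor(4 * n / 5); the source does
    not specify how fractional counts were rounded.
    """
    shuffled = list(sites)
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) * 4 // 5
    return shuffled[:cut], shuffled[cut:]


# Site indices stand in for the 2,582 full transcript lncRNA sites.
train, test = split_sites(range(2582))
print(len(train), len(test))
```

Under this rounding rule, the 2,582 full transcript sites would yield 2,065 training and 517 test sites; the exact counts used by the authors are not given in the text.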
