LncRNA identification pipeline

Yuqian Wu, Tingcai Cheng, Chun Liu, Duolian Liu, Quan Zhang, Renwen Long, Ping Zhao, Qingyou Xia

We developed an analysis pipeline to identify bona fide lncRNAs from the newly generated silkworm transcriptome (Fig 1). (1) Transcripts that overlapped any protein-coding exon in the sense orientation were removed; (2) transcripts shorter than 200 bp, single-exon transcripts, transcripts with read coverage < 0.8, and transcripts with FPKM < 0.1 were eliminated; (3) transcripts with predicted large ORFs (> 100 aa) were filtered out; (4) transcripts with predicted protein-coding potential were removed (protein-coding potential criteria: CPC score > 0, CPAT score > 0.345, and CNCI score > 0) [48–50]; (5) transcripts with similarity to known protein sequences in the Swiss-Prot database (E-value < 1e-6) [51] or to known protein-coding domains in the Pfam (A and B) database (E-value < 1e-6) [52] were discarded; (6) transcripts located within 2 kb of a scaffold end were excluded; (7) finally, transcripts with class code ‘i’, ‘u’, or ‘x’ were retained as bona fide silkworm lncRNAs.

FPKM, fragments per kilobase of transcript per million mapped reads; ORF, open reading frame; CPC, Coding Potential Calculator; CPAT, Coding-Potential Assessment Tool; CNCI, Coding-Non-Coding Index.
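The seven filters can be read as a simple cascade applied to each transcript in turn. The Python sketch below is only an illustration of that logic, not the authors' implementation. The `Transcript` record, its field names, and the `first_failed_step` helper are hypothetical, and it assumes that each tool's output has already been summarised per transcript. It also makes two interpretive assumptions that the text does not settle: a transcript is dropped if it fails any one of the criteria in step 2, and it is treated as coding if any one of CPC, CPAT, or CNCI exceeds its cutoff.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass
class Transcript:
    """Hypothetical per-transcript summary; all field names are illustrative."""
    name: str
    class_code: str                    # class code used in step 7, e.g. 'i', 'u', 'x'
    sense_exon_overlap: bool           # overlaps a protein-coding exon on the same strand
    length_bp: int
    exon_count: int
    read_coverage: float
    fpkm: float
    longest_orf_aa: int
    cpc_score: float
    cpat_score: float
    cnci_score: float
    swissprot_evalue: Optional[float]  # best-hit E-value, None if no hit
    pfam_evalue: Optional[float]       # best-hit E-value, None if no hit
    dist_to_scaffold_end_bp: int


def _has_hit(evalue: Optional[float], cutoff: float = 1e-6) -> bool:
    """True if a database hit exists with an E-value below the cutoff."""
    return evalue is not None and evalue < cutoff


def first_failed_step(t: Transcript) -> int:
    """Return the number (1-7) of the first filter that rejects the
    transcript, or 0 if it is retained as a candidate lncRNA."""
    if t.sense_exon_overlap:
        return 1
    # Assumption: failing any single criterion in step 2 removes the transcript.
    if (t.length_bp < 200 or t.exon_count == 1
            or t.read_coverage < 0.8 or t.fpkm < 0.1):
        return 2
    if t.longest_orf_aa > 100:
        return 3
    # Assumption: any one tool calling the transcript coding removes it.
    if t.cpc_score > 0 or t.cpat_score > 0.345 or t.cnci_score > 0:
        return 4
    if _has_hit(t.swissprot_evalue) or _has_hit(t.pfam_evalue):
        return 5
    if t.dist_to_scaffold_end_bp < 2000:
        return 6
    if t.class_code not in {"i", "u", "x"}:
        return 7
    return 0


# Made-up example values, chosen only to exercise the filters.
example = Transcript(
    name="TCONS_example", class_code="u", sense_exon_overlap=False,
    length_bp=850, exon_count=2, read_coverage=0.95, fpkm=1.2,
    longest_orf_aa=45, cpc_score=-1.1, cpat_score=0.02, cnci_score=-0.3,
    swissprot_evalue=None, pfam_evalue=None, dist_to_scaffold_end_bp=15000,
)
print(first_failed_step(example))                        # retained
print(first_failed_step(replace(example, exon_count=1)))  # rejected at step 2
```

Returning the step number rather than a plain boolean makes it easy to tally how many transcripts each filter removes, which is the kind of per-step count a pipeline figure typically reports.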
