Following PacBio's protocol, the raw polymerase reads were first processed using SMRT Link 5.0 software. Briefly, after removal of the SMRTbell™ adapters and low-quality data, post-filter polymerase reads were obtained. Circular consensus sequences (CCSs), also known as reads of insert (ROIs), were generated from the subreads BAM files. All ROIs with more than one full pass were further classified into full-length (FL) and non-full-length (nFL) transcript sequences based on whether the 5′ primer, 3′ primer, and poly(A) tail could all be observed in the same read. We employed a three-step error-correction strategy to improve the accuracy of the full-length transcripts produced by the PacBio Iso-Seq platform. First, circular sequencing with more than one pass allowed each ROI to be self-corrected. Second, full-length non-chimeric (FLNC) reads were clustered and made non-redundant with the ICE Quiver algorithm and then polished with Arrow using the nFL sequences, producing high-quality, polished full-length consensus sequences. Finally, these polished consensus sequences were corrected with Illumina short reads using Proovread, and remaining redundancy was removed using CD-HIT with a sequence identity cutoff of 0.99 (-c 0.99). Together, these three corrections yielded non-redundant, non-chimeric, full-length transcripts (isoform level) with high accuracy for subsequent analyses.
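The FL/nFL classification rule described above can be illustrated with a minimal Python sketch. This is not the SMRT Link implementation: the function name `classify_roi`, the primer sequences, the `min_polya` length, and the `window` size are all hypothetical values chosen for illustration. It also assumes the read is already in sense orientation and that primers match exactly. Only the rule itself (more than one full pass, and simultaneous presence of the 5′ primer, 3′ primer, and poly(A) tail) comes from the protocol.

```python
# Hypothetical primer sequences, for illustration only (not the kit's real primers).
FIVE_PRIMER = "AAGCAGTGG"
THREE_PRIMER = "GTACTCTGCG"


def classify_roi(seq, full_passes, five_primer=FIVE_PRIMER,
                 three_primer=THREE_PRIMER, min_polya=20, window=100):
    """Classify one read of insert as 'FL', 'nFL', or None.

    None means the read is dropped because it has too few full passes.
    Assumes sense orientation and exact primer matches (a simplification).
    """
    # The protocol keeps only ROIs with more than one full pass.
    if full_passes <= 1:
        return None

    head = seq[:window]    # region where the 5' primer is expected
    tail = seq[-window:]   # region where poly(A) and the 3' primer are expected

    has_5p = five_primer in head
    has_3p = three_primer in tail
    has_polya = ("A" * min_polya) in tail

    # Full-length only if all three signals are seen in the same read.
    return "FL" if (has_5p and has_3p and has_polya) else "nFL"


if __name__ == "__main__":
    insert = "ACGT" * 10
    fl_read = FIVE_PRIMER + insert + "A" * 25 + THREE_PRIMER
    nfl_read = FIVE_PRIMER + insert  # missing poly(A) tail and 3' primer
    print(classify_roi(fl_read, full_passes=3))
    print(classify_roi(nfl_read, full_passes=3))
    print(classify_roi(fl_read, full_passes=1))
```

A production classifier would also need to handle antisense reads, tolerate sequencing errors in the primers, and detect chimeras, none of which this sketch attempts.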
