To ensure the quality of downstream analyses, raw reads were filtered using in-house Perl scripts (Annoroad Gene Technology Co., Ltd, Beijing, China) to obtain high-quality clean reads. Reads were discarded if they had low quality (Phred quality score <5%), contained adapter contamination, matched rRNA sequences, or had a rate of ambiguous bases higher than 5% (32). The Phred quality score Q expresses the estimated probability P of a sequencing error at a given base, Q = -10 × log10(P); for example, Q30 corresponds to a base error rate of 0.1%. We aligned the paired-end clean reads to the Oar_v4.0 sheep reference genome (https://www.ncbi.nlm.nih.gov/assembly/GCF_000298735.2) using HISAT2 (v.2.0.5) (33) with the parameters “--rna-strandness RF” and “--dta -t -p 4”. Only uniquely mapped reads were assembled, and expression levels were estimated using StringTie (v.1.3.2d) (33) with the parameters “-G ref.gtf -rf -1”. We then calculated the number and proportion of uniquely mapped reads falling within three classes of genomic elements: exons, introns, and intergenic regions. Homogeneity analysis was subsequently performed to confirm that the sequencing results would not bias further transcriptome analysis. In this way, mRNA and known lncRNA transcripts were identified.
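The relationship between a Phred score and its error rate, and the 5% ambiguous-base criterion, can be illustrated with a short Python sketch. This is not the in-house Perl filtering code, which is not described in detail here; the function names and the choice to count only "N" as an ambiguous base are assumptions made for illustration.

```python
def phred_to_error_prob(q: float) -> float:
    """Convert a Phred score Q to an error probability via P = 10^(-Q/10)."""
    return 10 ** (-q / 10)


def ambiguous_fraction(seq: str) -> float:
    """Return the fraction of bases in a read that are 'N' (assumed ambiguity code)."""
    if not seq:
        # Treat an empty read as fully ambiguous so that it is discarded.
        return 1.0
    return seq.upper().count("N") / len(seq)


def passes_n_filter(seq: str, max_n_rate: float = 0.05) -> bool:
    """Keep a read unless its ambiguous-base rate is higher than max_n_rate."""
    return ambiguous_fraction(seq) <= max_n_rate


if __name__ == "__main__":
    # Q30 corresponds to an error probability of about 0.001, i.e. 0.1%.
    print(f"Q30 error rate: {phred_to_error_prob(30):.4%}")
    # Hypothetical reads: 1 N in 100 bases is kept, 10 N in 100 bases is dropped.
    print(passes_n_filter("N" + "A" * 99))
    print(passes_n_filter("N" * 10 + "A" * 90))
```

A real filter would also need the adapter, rRNA, and low-quality checks listed above, which depend on reference sequences and thresholds not fully specified in the text.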