Illumina sequencing generated 117,478,076 paired-end reads from the pooled cDNA libraries (Table 1). Reads were filtered in CLC Genomics Workbench 12 (Qiagen, Denmark): adapter sequences, low-quality reads (Phred score ≤ 30), and reads shorter than 50 bp were removed. The filtered reads were assembled with the de Bruijn graph-based de novo assembler of CLC Genomics [2]. Two assembly parameters, k-mer size and bubble size, were varied to optimize the assembled contigs. The final assembly (minimum contig length = 200 bp) used a k-mer size of 35 and a bubble size of 300; these settings were chosen on the basis of four output metrics: high N50, low total number of contigs, high average contig length, and high percentage of reads mapping back to the transcripts. Redundancy in the assembly was reduced with the clustering tool cd-hit-est at a sequence identity threshold of 0.95 [3]. To estimate transcript abundance, the numbers of reads mapping back to the contigs were converted to transcripts per million (TPM) expression values [4]. In the initial data exploration, we also performed a principal component analysis and a global Pearson correlation analysis to assess the clustering of the samples and the correlation between them.
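
Two of the quantities named above, N50 and TPM, can be illustrated with a minimal Python sketch. This is not the CLC Genomics implementation. It assumes that TPM is normalized by the full contig length (the method cited in [4] may use an effective length instead), and the function names and toy values are hypothetical.

```python
def n50(contig_lengths):
    """Return the N50: the length of the contig at which the running sum of
    lengths, taken from longest to shortest, first reaches half the total."""
    if not contig_lengths:
        raise ValueError("no contig lengths given")
    ordered = sorted(contig_lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


def tpm(read_counts, contig_lengths):
    """Convert mapped-read counts to transcripts per million (TPM).

    Assumption: each count is divided by the full contig length in bp,
    not by an effective length.
    """
    if len(read_counts) != len(contig_lengths):
        raise ValueError("counts and lengths must have the same size")
    # Reads per base for each contig.
    rates = [count / length for count, length in zip(read_counts, contig_lengths)]
    total_rate = sum(rates)
    if total_rate == 0:
        raise ValueError("no reads mapped to any contig")
    # Rescale so that the values sum to one million.
    return [rate / total_rate * 1e6 for rate in rates]


# Hypothetical toy assembly: four contigs and their mapped-read counts.
toy_lengths = [800, 600, 400, 200]
toy_counts = [80, 60, 20, 0]

toy_n50 = n50(toy_lengths)
toy_tpm = tpm(toy_counts, toy_lengths)
print("N50:", toy_n50)
print("TPM:", [round(value, 1) for value in toy_tpm])
```

Because TPM values always sum to one million within a sample, they describe the relative abundance of each transcript in that sample, which is why they are commonly used as input to between-sample comparisons such as the correlation analysis mentioned above.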
