The reads were co-assembled: we used the reads from all lines (parental and hybrid) that passed the quality filter to construct a de novo reference transcriptome. We ran Trinity (ref. 55), version r2013_08_14, with the default parameters and a group_pairs_distance of 600. The resulting transcripts are therefore consensus transcripts across lines. The procedure is described in Fig. S1. Trinity can output several sequences for the same gene, for example because of alternative splicing, and groups them into a component (for example, component1 contains sequence 1 and sequence 2). Trinity may also place another sequence from the same gene in a separate component (for example, component2, containing sequence 1). Aligning the sequences against a reference genome allows components that belong to the same gene to be merged.
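As an illustration only, the merging of components that belong to the same gene could be sketched as follows. The sketch assumes that each assembled sequence has already been assigned to at most one gene by the genome alignment, and that sequence identifiers follow a `comp<N>_c<N>_seq<N>` layout; the identifier pattern, the function names and the gene labels are hypothetical and are not taken from the protocol.

```python
import re
from collections import defaultdict

# Assumed identifier layout "comp<N>_c<N>_seq<N>"; adapt the pattern to the real headers.
SEQ_ID = re.compile(r"^(?P<component>comp\d+_c\d+)_seq\d+$")


def component_of(seq_id):
    """Return the component part of a sequence identifier."""
    match = SEQ_ID.match(seq_id)
    if match is None:
        raise ValueError(f"unexpected sequence identifier: {seq_id!r}")
    return match.group("component")


def merge_components_by_gene(seq_to_gene):
    """Group components whose sequences were assigned to the same gene.

    seq_to_gene maps a sequence identifier to a gene label, or to None
    when the sequence could not be assigned to any gene.
    """
    parent = {}

    def find(item):
        # Union-find lookup with path halving.
        parent.setdefault(item, item)
        while parent[item] != item:
            parent[item] = parent[parent[item]]
            item = parent[item]
        return item

    def union(left, right):
        root_left, root_right = find(left), find(right)
        if root_left != root_right:
            parent[root_right] = root_left

    first_component_for_gene = {}
    for seq_id, gene in seq_to_gene.items():
        component = component_of(seq_id)
        find(component)  # register the component even if it has no gene
        if gene is None:
            continue
        if gene in first_component_for_gene:
            union(first_component_for_gene[gene], component)
        else:
            first_component_for_gene[gene] = component

    groups = defaultdict(set)
    for component in list(parent):
        groups[find(component)].add(component)
    return sorted(sorted(group) for group in groups.values())


if __name__ == "__main__":
    example = {
        "comp1_c0_seq1": "geneA",
        "comp1_c0_seq2": "geneA",
        "comp2_c0_seq1": "geneA",
        "comp3_c0_seq1": "geneB",
    }
    for group in merge_components_by_gene(example):
        print(group)
```

A union-find structure is used here so that a component whose sequences touch several genes links all of the corresponding groups; a simple group-by-gene would be enough if every component mapped to a single gene.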
This approach is possible because the two parental lines diverged recently, so we assumed that the transcripts of the parental species and the hybrids are similar enough to be assembled together. Pooling the reads increases the effective sequencing depth, which helps assemble transcripts whose expression in one or more species is too low for them to be assembled otherwise. This can be the case for TEs, which can be lowly expressed in the parental lines. Additionally, unlike the mapping method, this approach is not biased in favor of D. mojavensis.
To validate this approach, we compared the results obtained from the co-assembly with those obtained from the single assemblies of each species and of the hybrids. We aligned the components from the single assemblies to those from the co-assembly and considered them associated if they aligned with at least 80% identity over at least 80% of the query (the query being the component from the single assembly). For the single assemblies and the co-assembly, we estimated the number of chimeric components as the number of components that did not align to the D. mojavensis reference genome with at least 80% identity and 80% query coverage.
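The 80% identity and 80% query coverage criterion can be sketched as a simple filter over alignment hits. This is a minimal sketch under stated assumptions: the `Hit` record, its field names and the function names are hypothetical, coverage is computed from a single hit per query rather than from combined partial alignments, and the aligner that produced the hits is not specified here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One alignment of a query component (hypothetical record layout)."""
    query: str        # component identifier
    identity: float   # percent identity of the alignment
    q_start: int      # first aligned query position, 1-based inclusive
    q_end: int        # last aligned query position, 1-based inclusive
    q_len: int        # full length of the query component


def query_coverage(hit):
    """Percentage of the query length spanned by the alignment."""
    aligned = abs(hit.q_end - hit.q_start) + 1
    return 100.0 * aligned / hit.q_len


def passes(hit, min_identity=80.0, min_coverage=80.0):
    """True if the hit meets both the identity and the coverage thresholds."""
    return hit.identity >= min_identity and query_coverage(hit) >= min_coverage


def components_without_valid_hit(components, hits,
                                 min_identity=80.0, min_coverage=80.0):
    """Components with no hit that passes both thresholds."""
    aligned = {h.query for h in hits if passes(h, min_identity, min_coverage)}
    return set(components) - aligned


if __name__ == "__main__":
    components = ["c1", "c2", "c3"]
    hits = [
        Hit("c1", 95.0, 1, 900, 1000),   # 90% coverage, passes
        Hit("c2", 85.0, 1, 500, 1000),   # 50% coverage, fails
        Hit("c3", 70.0, 1, 1000, 1000),  # identity too low, fails
    ]
    print(sorted(components_without_valid_hit(components, hits)))
```

Applied to alignments against the co-assembly, the passing hits give the associated components; applied to alignments against the reference genome, the size of the returned set corresponds to the estimate of chimeric components described above.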