To select the most suitable reference from the six poplar cpDNAs, two steps were performed. First, to estimate the sequence similarity among these cpDNAs, pairwise genome alignments were conducted using LASTZ (v1.03.28, http://www.bx.psu.edu/~rsharris/lastz/) with a minimum identity of 95%. Second, to assess how well each poplar cpDNA performs as a reference genome, the number of identified cpDNA reads was counted from the alignment of simulated short reads against each poplar cpDNA. The simulated short reads were generated from these poplar cpDNAs with the split-reads method, which splits the reference sequence into 100-bp fragments using a sliding window with a 1-bp step. The short reads were mapped against the cpDNA of each of the six Populus species using BWA (v0.7.12) and Bowtie (v1.1.1) with default parameters [23, 24].
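The split-reads step can be illustrated with a minimal Python sketch. It assumes the reference is handled as a linear string (windows that wrap around the circular chloroplast genome are not generated), and the function names `split_reads` and `to_fasta` as well as the read-naming scheme are hypothetical, not taken from the original pipeline.

```python
def split_reads(reference, read_len=100, step=1):
    """Yield (start, fragment) for every full-length window.

    Assumption: the sequence is treated as linear, so no window
    spans the end/start junction of a circular genome.
    """
    for start in range(0, len(reference) - read_len + 1, step):
        yield start, reference[start:start + read_len]


def to_fasta(name, reference, read_len=100, step=1):
    """Format the fragments as FASTA text (hypothetical naming scheme)."""
    lines = []
    for start, fragment in split_reads(reference, read_len, step):
        # 1-based, inclusive coordinates in the read name
        lines.append(f">{name}_{start + 1}_{start + read_len}")
        lines.append(fragment)
    return "\n".join(lines)


# Toy example: a 120-bp sequence gives 120 - 100 + 1 = 21 fragments.
toy = "ACGT" * 30
fragments = list(split_reads(toy))
print(len(fragments))
```

A reference of length L therefore yields L - 99 fragments of 100 bp, each of which can then be passed to the read mapper.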
To evaluate the feasibility of the reference-assisted strategy for poplar chloroplast genome assembly, we used the read simulator wgsim (v0.3.0, https://github.com/lh3/wgsim) to generate Illumina-like paired-end reads and then attempted to assemble the poplar cpDNA from them. Based on the above findings, the P. trichocarpa chloroplast genome (NC_009143) was chosen as the reference sequence for the wgsim simulator. Duplicates among the simulated read pairs were identified and discarded with FastUniq (v1.1) [25]. The genome size was predicted from the wgsim-generated Illumina-like reads using KmerGenie (v1.6982) [26]. The simulated reads were de novo assembled into the cpDNA with Minia (v2.0.3) [27]. The Minia assemblies were assessed using the assembly evaluation tool QUAST (v3.1) with the P. trichocarpa cpDNA as the reference genome [28].
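As an illustration of the duplicate-removal idea only, the following Python sketch drops read pairs whose two mates are both identical to a pair already seen. FastUniq's own implementation is not reproduced here; the function name `drop_duplicate_pairs` and the in-memory representation of pairs as tuples of sequence strings are assumptions made for this example.

```python
def drop_duplicate_pairs(pairs):
    """Keep the first occurrence of each (read1, read2) sequence pair.

    Assumption: a duplicate is a pair whose two sequences both match
    an earlier pair exactly; quality values are ignored.
    """
    seen = set()
    unique = []
    for read1, read2 in pairs:
        key = (read1, read2)
        if key not in seen:
            seen.add(key)
            unique.append(key)
    return unique


# Toy example: the second pair repeats the first and is discarded.
pairs = [("ACGT", "TTGA"), ("ACGT", "TTGA"), ("ACGT", "CCGA")]
print(drop_duplicate_pairs(pairs))
```

Matching on both mates, rather than on either mate alone, keeps pairs that share one read but differ in the other.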