Genomes of LM72, LM04 and LM60 were assembled from the Nanopore reads with CANU v1.8 [49] using default settings and an estimated genome size of 35 Mb. The resulting assemblies were corrected in two rounds: first with Nanopolish v0.13.2 [50], and then with Pilon v1.23 [51], which polished the Nanopolish-corrected CANU assemblies using Illumina reads mapped with BWA v0.7.17 [52]. Assemblies were annotated with the Funannotate v1.8.8 pipeline [53] under default settings, with predicted proteins from C. purpurea isolate 20.1 supplied as protein evidence to assist gene modeling. Libraries of repetitive elements were generated with RepeatModeler2 [54], and the elements were identified, where possible, using the 2018 Repbase database of annotated transposons [55]. Repetitive elements in the intergenic regions between lpsA1 and lpsA2 were annotated by searching for BLAST hits against the RepeatModeler2-generated repetitive element libraries. Illumina-based genome assemblies for LM04, LM30, LM60, LM207, LM232, LM233, LM464 and LM479 were previously published and are publicly available in the NCBI database [30].
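The BLAST-based repeat annotation of the lpsA1-lpsA2 intergenic regions could be post-processed along the following lines. This is a minimal sketch, not the authors' script: it assumes the intergenic sequence was used as the BLAST query against the repeat library and that hits were written in the standard 12-column tabular format (`-outfmt 6`). The e-value cutoff, the greedy rule for resolving overlapping hits, the sequence and family names, and the example hits are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass

# Default column order of BLAST+ tabular output (-outfmt 6).
FIELDS = ("qseqid sseqid pident length mismatch gapopen "
          "qstart qend sstart send evalue bitscore").split()


@dataclass(frozen=True)
class Hit:
    query: str       # intergenic region (assumed to be the BLAST query)
    repeat: str      # matching repeat-library entry (assumed to be the subject)
    start: int       # 1-based start on the query
    end: int         # 1-based inclusive end on the query
    evalue: float
    bitscore: float


def parse_hits(text, max_evalue=1e-5):
    """Parse tabular BLAST output, keeping hits at or below max_evalue.

    The 1e-5 default is an illustrative assumption, not a value from the paper.
    """
    hits = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        row = dict(zip(FIELDS, line.split("\t")))
        evalue = float(row["evalue"])
        if evalue > max_evalue:
            continue
        qs, qe = int(row["qstart"]), int(row["qend"])
        # min/max is only a guard; BLASTN normally reports ascending
        # query coordinates and reverses the subject for minus-strand hits.
        hits.append(Hit(row["qseqid"], row["sseqid"], min(qs, qe), max(qs, qe),
                        evalue, float(row["bitscore"])))
    return hits


def annotate(hits):
    """Greedy selection: take hits by descending bitscore and skip any hit
    that overlaps an already accepted hit on the same query."""
    kept = []
    for hit in sorted(hits, key=lambda h: h.bitscore, reverse=True):
        if all(hit.query != k.query or hit.end < k.start or hit.start > k.end
               for k in kept):
            kept.append(hit)
    return sorted(kept, key=lambda h: (h.query, h.start))


# Hypothetical hits, for illustration only (not data from the study).
EXAMPLE = "\n".join("\t".join(fields) for fields in [
    ("intergenic_1", "rnd-1_family-12", "92.5", "400", "30", "0",
     "101", "500", "1", "400", "1e-150", "540"),
    ("intergenic_1", "rnd-1_family-7", "88.0", "200", "24", "0",
     "350", "549", "10", "209", "1e-60", "220"),
    ("intergenic_1", "rnd-2_family-3", "95.0", "300", "15", "0",
     "901", "1200", "300", "1", "1e-120", "480"),
    ("intergenic_1", "rnd-3_family-1", "80.0", "50", "10", "0",
     "2000", "2049", "5", "54", "0.5", "30"),
])

if __name__ == "__main__":
    for h in annotate(parse_hits(EXAMPLE)):
        print(f"{h.query}\t{h.start}\t{h.end}\t{h.repeat}")
```

In this sketch the weaker hit that overlaps a stronger one and the hit above the e-value cutoff are both dropped, leaving one repeat call per stretch of the intergenic region. A different overlap policy (for example, keeping all hits or trimming boundaries) may be more appropriate depending on how nested elements should be reported.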