Genome Assembly, Polishing, and Assessment

Xiao Xiong; Yogeshwar D Kelkar; Chris J Geden; Chao Zhang; Yidong Wang; Evelien Jongepier; Ellen O. Martinson; Eveline C Verhulst; Jürgen Gadau; John H Werren; Xu Wang

This method is extracted from research article: Front Genet, Nov 2021

Long-Read Assembly and Annotation of the Parasitoid Wasp Muscidifurax raptorellus, a Biological Control Agent for Filth Flies

DOI: 10.3389/fgene.2021.748135

Raw sequencing reads from the Aub sample (both the PacBio and the 10× Genomics libraries) were checked for quality with FastQC (Andrews et al., 2010) before genome assembly. A de novo assembly of the M. raptorellus Aub sample was generated with Supernova v2.1.1 (Weisenfeld et al., 2017) from 400 million reads subsampled from the 10× Genomics library. Filtered PacBio HiFi reads were assembled with hifiasm v0.13 (Cheng et al., 2021) and HiCanu v2.1.1 (Nurk et al., 2020), both dedicated long-read assemblers. The Kop PacBio CLR data were assembled with Canu v2.1 (Koren et al., 2017).
The Kop Canu assembly was polished with Pilon v1.22 (parameter setting: fix = all) (Walker et al., 2014) to correct small errors, using high-quality 150 bp paired-end Illumina short reads (Table 1). A final round of polishing with Arrow (VariantCaller v2.1.0) was performed to correct large structural errors, using the raw PacBio reads aligned with Minimap2 (Li, 2018).
The Aub and Kop cultures have nearly identical mitochondrial genomes, with 100% sequence identity apart from a single 11 bp indel. To compare the nuclear genomes, the Aub 10× Genomics reads were aligned to the repeat-masked Kop assembly with the ALIGN pipeline of the Longranger v2.1.6 software suite (Zheng et al., 2016). A total of 58,350 SNPs were called with UnifiedGenotyper in the Genome Analysis Toolkit (GATK) (McKenna et al., 2010; DePristo et al., 2011). SNPs in repetitive regions and variants outside the coverage depth range (120–500×) were filtered out with BEDTools v2.30.0 (Quinlan, 2014). In total, 11,523 homozygous SNPs between Aub and Kop were identified, and the percentage of fixed differences in the nuclear genome was estimated at 0.0038%.
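The SNP filtering step can be illustrated with a minimal Python sketch. It is a plain-Python stand-in for the interval logic only, not the BEDTools commands the authors ran. The tuple layouts, the function names, the toy data, and the choice to treat the 120–500 depth window as inclusive are all assumptions made for illustration; repeat intervals are taken to be 0-based and half-open, as in BED files.

```python
from bisect import bisect_right
from collections import defaultdict

# Depth window quoted in the text; treating both bounds as inclusive is an assumption.
MIN_DEPTH, MAX_DEPTH = 120, 500


def merge_intervals(intervals):
    """Merge overlapping or touching half-open (start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged


def index_repeats(repeats):
    """Map contig -> (sorted starts, matching ends) of merged repeat intervals."""
    by_contig = defaultdict(list)
    for contig, start, end in repeats:
        by_contig[contig].append((start, end))
    index = {}
    for contig, intervals in by_contig.items():
        merged = merge_intervals(intervals)
        index[contig] = ([s for s, _ in merged], [e for _, e in merged])
    return index


def in_repeat(index, contig, pos):
    """Return True if the 0-based position lies inside a repeat on this contig."""
    if contig not in index:
        return False
    starts, ends = index[contig]
    i = bisect_right(starts, pos) - 1  # last interval starting at or before pos
    return i >= 0 and pos < ends[i]


def filter_snps(snps, repeats):
    """Keep SNPs (contig, pos, depth) inside the depth window and outside repeats."""
    index = index_repeats(repeats)
    return [
        (contig, pos, depth)
        for contig, pos, depth in snps
        if MIN_DEPTH <= depth <= MAX_DEPTH and not in_repeat(index, contig, pos)
    ]


# Hypothetical toy data: two overlapping repeats on ctg1, none on ctg2.
repeats = [("ctg1", 100, 200), ("ctg1", 150, 250)]
snps = [
    ("ctg1", 50, 300),   # kept
    ("ctg1", 120, 300),  # dropped: inside a repeat
    ("ctg1", 249, 300),  # dropped: last base of the merged repeat
    ("ctg1", 250, 300),  # kept: first base after the repeat
    ("ctg1", 400, 90),   # dropped: depth below the window
    ("ctg2", 10, 500),   # kept: depth on the upper bound
    ("ctg2", 11, 501),   # dropped: depth above the window
]
kept = filter_snps(snps, repeats)
print(kept)
```

Merging the repeat intervals first lets each SNP be tested with a single binary search per contig, which matters when tens of thousands of calls are screened against a genome-wide repeat annotation.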
To obtain the best possible assembly, the draft assemblies produced by the different assemblers from the Aub and Kop samples were merged into a single draft assembly with the assembly merging tool quickmerge v0.3.0 (Chakraborty et al., 2016). Potential bacterial contamination was screened using a pipeline described in our previous research (Wang et al., 2020), and no bacterial contigs were detected. The merged draft was then polished with Pilon v1.23.0 (Walker et al., 2014), using the 10× Genomics Illumina short reads for indel correction, to yield the final high-quality assembly. The final assembly was evaluated by contig N50 and RNA-seq read mapping percentages, and genome completeness was assessed with BUSCO v4.0.6 (Seppey et al., 2019). BUSCO scores were calculated against arthropoda_odb10, which contains 1,013 orthologs.
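Contig N50, one of the evaluation metrics named above, is the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the total assembly length. The sketch below shows one way to compute it from FASTA text; the function names and toy inputs are hypothetical and it is not the tool the authors used to produce their statistics.

```python
def contig_lengths(fasta_lines):
    """Return sequence lengths, in file order, from an iterable of FASTA lines."""
    lengths = []
    current = None  # length of the record being read, None before the first header
    for line in fasta_lines:
        line = line.strip()
        if line.startswith(">"):
            if current is not None:
                lengths.append(current)
            current = 0
        elif line and current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def n50(lengths):
    """Return the N50 of a non-empty collection of contig lengths."""
    if not lengths:
        raise ValueError("N50 is undefined for an empty assembly")
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length


# Hypothetical toy assembly: total 300, so N50 is reached once 150 bases are covered.
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_lengths))  # 70

# The same calculation starting from FASTA text.
toy_fasta = [">a", "ACGT", "AC", ">b", "GG"]
print(n50(contig_lengths(toy_fasta)))  # 6
```

For a real assembly, the lines of an opened FASTA file can be passed to `contig_lengths` directly, since it only iterates over its input once.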

Summary statistics of the Muscidifurax raptorellus genome assemblies.

This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
