2.6. Whole Genome Sequencing and Genome Assembly and Analysis

Abasiofiok Ibekwe; Lisa Durso; Thomas F. Ducey; Adelumola Oladeinde; Charlene R. Jackson; Jonathan G. Frye; Robert Dungan; Tom Moorman; John P. Brooks; Amarachukwu Obayiuwana; Hiren Karathia; Brian Fanelli; Nur Hasan


This method is extracted from research article: Microorganisms, May 2021

Diversity of Plasmids and Genes Encoding Resistance to Extended-Spectrum β-Lactamase in Escherichia coli from Different Animal Sources

DOI: 10.3390/microorganisms9051057


Genomic DNA was extracted with the QIAamp DNA Mini Kit, and plasmid DNA with the Qiagen Plasmid Mini Kit (Qiagen, Valencia, CA, USA). Samples were quantified using a Qubit 3.0 fluorometer, and each sample was normalized in 3–18 µL of nuclease-free water to a final concentration of 0.5 ng µL⁻¹ using the Biomek FX liquid handler (Beckman Coulter Life Sciences, Brea, CA, USA). Libraries were then constructed using the modified Nextera XT protocol (Illumina, San Diego, CA, USA) as previously described [42]. PCR products were purified using 1.0× speed beads, eluted in 15 µL of nuclease-free water, and quantified by PicoGreen fluorometric assay (100× final dilution). The libraries were pooled in equimolar ratios based on the concentrations determined by PicoGreen and loaded onto a high sensitivity (HS) chip run on the Caliper LabChipGX (Perkin Elmer, Waltham, MA, USA) for size estimation, followed by 150 bp paired-end sequencing using Illumina HiSeq v3 chemistry (Illumina, San Diego, CA, USA). Sequencing reads were directly analyzed using the CosmosID bioinformatics software package (CosmosID Inc., Rockville, MD, USA) as described previously [43,44,45,46].
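The normalization to 0.5 ng µL⁻¹ described above is a standard C1·V1 = C2·V2 dilution. The following is a minimal sketch of that arithmetic only, not the Biomek FX worklist used by the authors. The function name, the fixed 1 µL sample aliquot, and the example stock concentrations are assumptions made for illustration; the source states only the target concentration and the 3–18 µL range of water volumes.

```python
def water_to_add_ul(stock_ng_per_ul: float,
                    sample_ul: float = 1.0,        # assumed aliquot, not from the source
                    target_ng_per_ul: float = 0.5  # target concentration from the text
                    ) -> float:
    """Volume of nuclease-free water (µL) that dilutes `sample_ul` of stock
    DNA to `target_ng_per_ul`, using C1 * V1 = C2 * V2."""
    if stock_ng_per_ul <= 0 or sample_ul <= 0 or target_ng_per_ul <= 0:
        raise ValueError("concentrations and volumes must be positive")
    if stock_ng_per_ul < target_ng_per_ul:
        raise ValueError("stock is already more dilute than the target")
    # Final volume V2 = C1 * V1 / C2; the water is whatever is not sample.
    final_volume_ul = stock_ng_per_ul * sample_ul / target_ng_per_ul
    return final_volume_ul - sample_ul


if __name__ == "__main__":
    # Hypothetical Qubit readings in ng/µL, for illustration only.
    for reading in (2.0, 5.0, 9.5):
        print(f"{reading} ng/µL -> add {water_to_add_ul(reading):.1f} µL water")
```

Under these assumed inputs, a more concentrated stock simply needs proportionally more water; a stock below the target cannot be normalized by dilution, so the sketch raises an error instead.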

Raw sequencing data were trimmed and assembled de novo using the SPAdes assembler (http://bioinf.spbau.ru/spades, accessed on 18 November 2018) [47] and plasmidSPAdes (accessed on 18 November 2018) [48] with default parameters to construct each genome. Contigs shorter than 200 nucleotides were excluded from the analysis. Assembled contigs were submitted to the Center for Genomic Epidemiology's ResFinder [49] and to CARD to identify resistance genes carried on plasmids or the chromosome [50], and to determine the incompatibility (inc) group of the plasmid carrying an antibiotic resistance gene (ARG) of interest. Contigs were also submitted to PlasmidFinder [51] to determine existing plasmid replicon types, following steps previously described [52]. A phylogenetic tree of the sequenced E. coli genomes, along with additional reference E. coli genomes, was constructed using the parsnp program (Harvest software) [53], which identifies core genomes across isolates and builds a phylogeny using maximum likelihood and core single nucleotide polymorphisms (SNPs). Sequence typing of each genome was performed using MLSTcheck, developed by the Sanger Institute, with the pubMLST database (https://pubmlst.org/, accessed on 18 November 2018) as described elsewhere [54]. Draft genomes were submitted to the NCBI Short Read Archive under BioProject PRJNA492317 (http://www.ncbi.nlm.nih.gov/bioproject/492317, accessed on 18 November 2018). A limitation of Illumina short-read assembly is that it is difficult to resolve an entire plasmid into a single contig [55,56,57,58]. Consequently, a plasmid is broken into multiple contigs, including the region used for determining the plasmid incompatibility group (incRNAi).
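The 200-nucleotide contig cutoff mentioned above can be illustrated with a short, dependency-free sketch. This is not the authors' pipeline; the function names and the file name in the usage note are hypothetical, and the only value taken from the text is the 200-nucleotide threshold.

```python
from typing import Iterable, Iterator, List, Tuple

Record = Tuple[str, str]  # (header without ">", sequence)


def read_fasta(lines: Iterable[str]) -> Iterator[Record]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header, chunks = None, []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def filter_contigs(records: Iterable[Record], min_length: int = 200) -> List[Record]:
    """Keep contigs of at least `min_length` nucleotides, i.e. exclude
    contigs shorter than 200 nt as described in the text."""
    return [(h, s) for h, s in records if len(s) >= min_length]
```

A possible use, assuming an assembly written to a hypothetical file `contigs.fasta`, is `with open("contigs.fasta") as fh: kept = filter_contigs(read_fasta(fh))`.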

Draft assemblies were interrogated against the CosmosID acquired antibiotic resistance gene and virulence gene databases using BLASTN (v.2.7). The best-matching genes were identified using thresholds of >90% identity and >60% alignment coverage of the reference gene. When the incRNAi-rep region was absent from a contig carrying an ARG, the plasmid inc group could not be determined. Protein annotation of contigs was performed using Prokka [59] and a PSI-BLAST search against the National Center for Biotechnology Information (NCBI) database. The genetic context of bla_TEM genes was determined using linear maps of contigs drawn with SnapGene®.
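The identity and coverage thresholds above can be applied to parsed BLASTN hits roughly as follows. This is a sketch under stated assumptions rather than the CosmosID implementation: the `Hit` record and its field names are hypothetical, coverage is approximated as alignment length divided by reference gene length, and "best-matching" is taken to mean the highest identity (ties broken by coverage) per reference gene, which the source does not specify.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class Hit:
    """One BLASTN alignment; field names are illustrative assumptions."""
    contig: str
    gene: str
    percent_identity: float
    alignment_length: int  # includes gaps, so coverage is approximate
    gene_length: int       # length of the reference gene


def coverage_percent(hit: Hit) -> float:
    """Approximate percentage of the reference gene covered by the alignment."""
    return 100.0 * hit.alignment_length / hit.gene_length


def best_hits(hits: Iterable[Hit],
              min_identity: float = 90.0,
              min_coverage: float = 60.0) -> Dict[str, Hit]:
    """Return the best hit per reference gene among hits with
    identity > min_identity and coverage > min_coverage (both strict)."""
    best: Dict[str, Hit] = {}
    for hit in hits:
        if hit.percent_identity <= min_identity:
            continue
        if coverage_percent(hit) <= min_coverage:
            continue
        current = best.get(hit.gene)
        rank = (hit.percent_identity, coverage_percent(hit))
        if current is None or rank > (current.percent_identity,
                                      coverage_percent(current)):
            best[hit.gene] = hit
    return best
```

Ranking by identity first is one reasonable choice; ranking by bit score or by coverage first would be equally defensible and would only change which hit wins when several pass both thresholds.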

MAFFT v. 1.4.0^ref and RAxML v. 4.0 [60], as implemented in Geneious Prime® v 2020.0.1, were used to align bla_CMY-2 plasmid contigs and to reconstruct their maximum likelihood (ML) tree under the GTR + GAMMA model. Lastly, to determine the consensus sequence for the incA/C2 (i.e., incC) plasmid present in ARS-isolate-13, we aligned its assembled whole genome against the closest IncC reference genome found on NCBI (GenBank accession: CP051316; query cover = 98%; identity = 99.99%) using the Geneious Prime® mapper (settings: high sensitivity). Contigs matching the incC reference genome (n = 13) were ordered and annotated with Rapid Annotation using Subsystem Technology (RAST) [61,62,63]. Virulence genes encoded on incC were determined using VirulenceFinder^ref. A linear map of IncC was built using the SnapGene® Viewer v. 5.2.3.

Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
