The library for genomic-DNA sequencing was prepared according to the TruSeq DNA sample preparation protocol (Illumina). Briefly, 1 μg of genomic DNA was sonicated into fragments with a median length of 400 bp; after end repair, indexed adapters were ligated to the ends of the DNA fragments, and the libraries were quantified by quantitative real-time PCR (qPCR) using Kapa Library Quant kits (Kapa Biosystems). After a short amplification step, the library was sequenced on an Illumina GAIIx sequencer to generate 85-bp paired-end reads. The raw reads were individually mapped to the E. coli MC4100 genome (RefSeq accession number HG738867) using the BWA-MEM alignment algorithm (81), allowing 1% error. Duplicate reads were removed with SAMtools (82); only high-quality reads with mapping quality (MAPQ) scores of >30 were used for variant detection. A VCF file containing all the variants for each sample relative to E. coli MC4100 was obtained using SAMtools and BCFtools (82) and filtered to remove low-quality variants. SNVs covered by fewer than five high-quality reads were discarded, as were predicted indel mutations covered by fewer than six high-quality reads. The VCF files were then analyzed using SnpEff version 4.0 (83), and the high-quality SNVs and indels were annotated to determine their effects and impacts on coding sequences.
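The coverage thresholds described above (at least five high-quality reads for SNVs, at least six for indels) can be illustrated with a minimal Python sketch. This is not the authors' script. It assumes that the number of high-quality reads is taken from the `DP4` INFO field written by SAMtools/BCFtools (falling back to `DP` when `DP4` is absent), and that indels are recognized by the `INDEL` flag or by a length difference between REF and ALT. The function and constant names are hypothetical.

```python
# Sketch of the coverage filter: keep SNVs with >= 5 and indels with >= 6
# high-quality reads. Assumption: high-quality depth = sum of DP4 values.

MIN_SNV_READS = 5    # SNVs with fewer high-quality reads are discarded
MIN_INDEL_READS = 6  # indels with fewer high-quality reads are discarded


def parse_info(info):
    """Turn a VCF INFO string into a dict; flags map to an empty string."""
    fields = {}
    for item in info.split(";"):
        key, _, value = item.partition("=")
        fields[key] = value
    return fields


def high_quality_depth(info_fields):
    """Sum DP4 (hq ref fwd/rev, alt fwd/rev); fall back to DP (assumption)."""
    if "DP4" in info_fields:
        return sum(int(x) for x in info_fields["DP4"].split(","))
    return int(info_fields.get("DP", 0))


def is_indel(ref, alt, info_fields):
    """An INDEL flag or any ALT allele of different length marks an indel."""
    if "INDEL" in info_fields:
        return True
    return any(len(allele) != len(ref) for allele in alt.split(","))


def filter_vcf_lines(lines):
    """Return header lines plus the variant lines that pass the thresholds."""
    kept = []
    for line in lines:
        if line.startswith("#"):
            kept.append(line)  # header lines pass through unchanged
            continue
        cols = line.rstrip("\n").split("\t")
        ref, alt, info = cols[3], cols[4], parse_info(cols[7])
        threshold = MIN_INDEL_READS if is_indel(ref, alt, info) else MIN_SNV_READS
        if high_quality_depth(info) >= threshold:
            kept.append(line)
    return kept
```

Applied to the lines of an uncompressed VCF file, `filter_vcf_lines` returns the records that would be passed on to annotation. Filtering by variant quality and by mapping quality would happen upstream and is not covered by this sketch.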