ddRADseq library preparation and SNP discovery

Thomas L. Schmidt
Igor Filipović
Ary A. Hoffmann
Gordana Rašić

We extracted genomic DNA using the Roche DNA Isolation Kit for Cells and Tissues (Roche, Pleasanton, CA, USA), with an additional RNase treatment step. Seventy-four of the 110 ovitraps produced Ae. aegypti imagoes, and we selected 161 individuals for sequencing. Because we expected ovitraps to contain many full siblings from the same oviposition (Hoffmann et al. 2014), which can bias analyses of population structure (Goldberg and Waits 2010), we sequenced no more than three individuals per ovitrap.
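The per-ovitrap cap described above can be sketched in a few lines of Python. This is only an illustration: the text does not say how individuals were chosen within a trap, so the random draw, the fixed seed, the function name cap_per_trap and the (individual_id, trap_id) input layout are all assumptions.

```python
import random
from collections import defaultdict


def cap_per_trap(individuals, max_per_trap=3, seed=0):
    """Keep at most `max_per_trap` individuals from each ovitrap.

    `individuals` is a list of (individual_id, trap_id) tuples; this layout
    is hypothetical and chosen only for the example.
    """
    rng = random.Random(seed)  # fixed seed so the selection is reproducible
    by_trap = defaultdict(list)
    for ind_id, trap_id in individuals:
        by_trap[trap_id].append(ind_id)

    selected = []
    for trap_id in sorted(by_trap):
        members = by_trap[trap_id]
        k = min(max_per_trap, len(members))
        # Assumption: individuals are drawn at random within a trap.
        selected.extend((ind_id, trap_id) for ind_id in rng.sample(members, k))
    return selected


# Toy input: trap "A" yields five adults, trap "B" yields two.
toy = [(f"A{i}", "A") for i in range(5)] + [(f"B{i}", "B") for i in range(2)]
capped = cap_per_trap(toy)
```

With the toy input, trap A contributes three individuals and trap B contributes both of its two, which limits the number of potential full siblings entering the data set from any one trap.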

We applied the method of Rašić et al. (2014) for ddRADseq library preparation, but selected a smaller size range (350–450 bp) of genomic fragments to accommodate more individuals per library. We generated three libraries, each consisting of 60–61 individuals, which were sequenced in three Illumina HiSeq 2500 lanes using 100 bp paired-end chemistry. We processed raw fastq sequences within a customised pipeline (Rašić et al. 2014), retaining reads with phred scores ≥ 13 and trimming them to 90 bp. High-quality reads were aligned to the Ae. aegypti nuclear genome assembly AaegL1 (Nene et al. 2007) using Bowtie (Langmead et al. 2009), allowing up to three mismatches in the alignment seed. Uniquely aligned reads were analysed using Stacks (Catchen et al. 2013), which we used to call genotypes at RAD stacks with a minimum depth of five reads.
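The read-filtering step can be illustrated with a minimal Python sketch. It is not the customised pipeline of Rašić et al. (2014). The text does not specify whether the phred threshold applies to every base or to a read average, nor whether trimming precedes filtering, so this sketch assumes trimming first and a per-base minimum. The function name trim_and_filter and the Phred+33 quality encoding are also assumptions.

```python
def trim_and_filter(seq, qual, min_phred=13, trim_len=90, offset=33):
    """Trim a read to `trim_len` bases and keep it only if every retained
    base has a phred score >= `min_phred`.

    Assumptions (not stated in the text): Phred+33 encoded qualities,
    trimming before filtering, and a per-base rather than mean threshold.
    Returns (seq, qual) for a retained read, or None for a discarded one.
    """
    if len(seq) != len(qual) or len(seq) < trim_len:
        return None
    seq, qual = seq[:trim_len], qual[:trim_len]
    lowest = min(ord(ch) - offset for ch in qual)
    if lowest < min_phred:
        return None
    return seq, qual


# A 100 bp read with uniformly high quality ('I' encodes phred 40).
good = trim_and_filter("A" * 100, "I" * 100)

# The same read with one low-quality base ('#' encodes phred 2) at index 10.
bad = trim_and_filter("A" * 100, "I" * 10 + "#" + "I" * 89)
```

Here the first read is retained at 90 bp and the second is discarded because one base within the first 90 positions falls below the threshold.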

We used the Stacks programme populations to export VCF files, which we filtered using VCFtools (Danecek et al. 2011). We kept 134 individuals with < 20% missing data, and retained loci in Hardy–Weinberg equilibrium (81% of loci) and with minor allele frequencies > 0.05 (average frequency in retained loci = 0.21). We applied thinning to ensure no SNP was within 250 kbp of another. As the Ae. aegypti genome contains approximately 2.1 megabases per cM (Brown et al. 2001), a minimum spacing of 250 kbp corresponds to at most roughly eight SNPs per map unit, a sampling density shown to largely eliminate linkage effects in SNPs (Cho and Dupuis 2009). We retained 3784 unlinked and informative SNPs for analyses of relatedness and genetic structure.
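The thinning step and the density arithmetic can be sketched as follows. The code is a simple greedy thinning by position, written for illustration only. It is not the VCFtools implementation, and the function name thin_snps and the (contig, position) input layout are assumptions.

```python
def thin_snps(snps, min_dist=250_000):
    """Greedy thinning: walk SNPs in position order on each contig and keep
    a SNP only if it lies at least `min_dist` bp from the last kept SNP.

    `snps` is an iterable of (contig, position) tuples (hypothetical layout).
    """
    kept = []
    last_kept = {}  # contig -> position of the most recently kept SNP
    for contig, pos in sorted(snps):
        prev = last_kept.get(contig)
        if prev is None or pos - prev >= min_dist:
            kept.append((contig, pos))
            last_kept[contig] = pos
    return kept


# Toy example with two contigs.
toy = [("c1", 100), ("c1", 200_000), ("c1", 250_100), ("c1", 600_000), ("c2", 50)]
thinned = thin_snps(toy)

# Density implied by the spacing: about 2.1 Mb per cM divided by 250 kbp.
BP_PER_CM = 2.1e6
max_snps_per_cm = BP_PER_CM / 250_000
```

In the toy example the SNP at position 200,000 on c1 is dropped because it lies within 250 kbp of the kept SNP at position 100. The final division gives 8.4, the "roughly eight SNPs per map unit" quoted in the text.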
