Detection of candidate adaptive loci

Mariana Vargas Cruz
Gustavo Maruyama Mori
Caroline Signori-Müller
Carla Cristina da Silva
Dong-Ha Oh
Maheshi Dassanayake
Maria Imaculada Zucchi
Rafael Silva Oliveira
Anete Pereira de Souza

We sampled leaves from 79 adult plants at ten locations spanning most of the distribution of A. schaueriana Stapf & Leechman ex Moldenke (Fig. 1, Supplementary Information Table S2). We isolated DNA using the DNeasy Plant Mini Kit (QIAGEN) and NucleoSpin Plant II (Macherey Nagel) following the manufacturers’ instructions. DNA quality and quantity were assessed using 1% agarose gel electrophoresis and the QuantiFluor dsDNA System with a Quantus fluorometer (Promega). Nextera-tagmented reductively-amplified DNA (nextRAD) libraries [45] were prepared and sequenced by SNPsaurus on a HiSeq 2500 (Illumina, Inc.) with 100-bp single-end chemistry. Genomic DNA fragmentation and short adapter ligation were performed with the Nextera reagent (Illumina, Inc.), followed by amplification with one of the primers matching the adapter and extending nine arbitrary nucleotides into the genomic DNA. Assembly, mapping and single-nucleotide polymorphism (SNP) identification were performed using custom scripts (SNPsaurus), which created a reference catalogue of abundant reads across the combined sample set and mapped reads to this reference, allowing two mismatches and retaining biallelic loci present in at least 10% of the samples. We further filtered markers using vcftools v.0.1.12b [46], allowing no more than 65% missing data and requiring a Phred score > 30, a minimum coverage of 8x, only one SNP per locus and a minor allele frequency ≥ 0.05. To reduce the inclusion of paralogous loci and low-quality genotype calls, we used a maximum read coverage of 56 (the product of the average read depth and 1.5 standard deviations).
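The missing-data and minor-allele-frequency criteria above can be illustrated with a minimal sketch. It is not the vcftools implementation: it assumes diploid genotypes encoded as 0, 1 or 2 copies of the alternate allele, with None for a missing call, and the function names and example sites are hypothetical.

```python
from typing import List, Optional

# Assumed encoding: 0, 1 or 2 copies of the alternate allele; None = missing call.
Genotype = Optional[int]


def missing_fraction(calls: List[Genotype]) -> float:
    """Fraction of samples without a genotype call at this site."""
    return sum(g is None for g in calls) / len(calls)


def minor_allele_frequency(calls: List[Genotype]) -> float:
    """Minor allele frequency computed over called diploid genotypes only."""
    called = [g for g in calls if g is not None]
    if not called:
        return 0.0
    alt_freq = sum(called) / (2 * len(called))
    return min(alt_freq, 1.0 - alt_freq)


def keep_site(calls: List[Genotype],
              max_missing: float = 0.65,
              min_maf: float = 0.05) -> bool:
    """Apply the two thresholds quoted in the text (65% missing data, MAF >= 0.05)."""
    return (missing_fraction(calls) <= max_missing
            and minor_allele_frequency(calls) >= min_maf)


# Hypothetical sites scored in ten samples.
site_a = [0, 1, 2, None, 0, 0, 1, None, 0, 0]   # polymorphic, 20% missing
site_b = [0] * 10                               # monomorphic
site_c = [1, 0, 1] + [None] * 7                 # 70% missing

for name, site in [("site_a", site_a), ("site_b", site_b), ("site_c", site_c)]:
    print(name, "kept" if keep_site(site) else "dropped")
```

In vcftools itself, the `--max-missing` option is expressed as the minimum fraction of called genotypes, so a ceiling of 65% missing data would correspond to a value of 0.35 there; the exact command line used is not given in the text.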

After excluding plants morphologically identified as A. schaueriana with genomic signs of hybridization with A. germinans (L.) L., we assessed the genetic structure considering all SNPs, using discriminant analysis of principal components (DAPC) [47] and ADMIXTURE v.1.3.0 [48]. For the DAPC analyses, we varied the number of groups (K) from 1 to 50 and used the Bayesian information criterion to infer K. Additionally, we used the optim.a.score function to avoid overfitting during the discrimination steps. For the ADMIXTURE analyses, we performed three separate runs for K varying from 1 to 15 using the block-relaxation method for point estimation; computing was terminated when estimates increased by < 0.0001, and the most likely K was determined by cross-validation.
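Choosing K by cross-validation amounts to picking the value with the lowest cross-validation error. The sketch below assumes each run's log contains lines of the form "CV error (K=3): 0.512", and it averages the error over runs, which is an illustrative choice rather than something stated in the text. The log contents shown are made-up numbers.

```python
import re
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable, Tuple

# Assumed log line format: "CV error (K=3): 0.512"
CV_LINE = re.compile(r"CV error \(K=(\d+)\): ([0-9.]+)")


def best_k(log_texts: Iterable[str]) -> Tuple[int, Dict[int, float]]:
    """Return the K with the lowest mean cross-validation error across runs."""
    errors = defaultdict(list)
    for text in log_texts:
        for k, err in CV_LINE.findall(text):
            errors[int(k)].append(float(err))
    if not errors:
        raise ValueError("no CV error lines found")
    mean_error = {k: mean(values) for k, values in errors.items()}
    return min(mean_error, key=mean_error.get), mean_error


# Made-up log excerpts from two hypothetical runs.
run1 = "CV error (K=1): 0.60\nCV error (K=2): 0.45\nCV error (K=3): 0.50\n"
run2 = "CV error (K=1): 0.62\nCV error (K=2): 0.47\nCV error (K=3): 0.46\n"

k, table = best_k([run1, run2])
print("selected K:", k)
```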

We used two programs to minimize false-positive signals of natural selection: LOSITAN [49], assuming an infinite alleles model of mutation, with a confidence interval of 0.99, a false-discovery rate (FDR) of 0.1 and the "neutral mean FST" and "force mean FST" options enabled; and pcadapt 3.0.4 [50], which simultaneously identifies the population structure and the loci excessively related to this structure, using an FDR < 0.1.
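One conservative way to combine the output of two outlier-detection programs is to keep only the loci that every run flags. The sketch below expresses this as a set intersection. It assumes the outlier locus identifiers have already been exported from each program as plain strings; the function name and the locus IDs are hypothetical.

```python
from typing import Iterable, Set


def consensus_outliers(pcadapt_hits: Iterable[str],
                       lositan_runs: Iterable[Iterable[str]]) -> Set[str]:
    """Keep only loci flagged by pcadapt and by every LOSITAN run."""
    consensus = set(pcadapt_hits)
    for run in lositan_runs:
        consensus &= set(run)
    return consensus


# Hypothetical locus identifiers.
pcadapt_hits = ["loc01", "loc02", "loc05", "loc09"]
lositan_runs = [
    ["loc01", "loc02", "loc05"],
    ["loc01", "loc05", "loc07"],
    ["loc01", "loc02", "loc05"],
    ["loc01", "loc05"],
    ["loc01", "loc05", "loc09"],
]

candidates = consensus_outliers(pcadapt_hits, lositan_runs)
print(sorted(candidates))
```

Intersecting the sets trades sensitivity for specificity: a locus missed by a single run is discarded, which fits the stated aim of limiting false positives.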

Putative evidence of selection was considered only for SNP loci that were conservatively identified by both pcadapt and five independent runs of LOSITAN, to avoid false positives [51]. As selection is presumably stronger in coding regions of the genome and there is no reference genome for the species, we used the de novo assembled transcriptome, characterized herein, as a reference to identify candidate loci within putative coding regions. We performed a reciprocal alignment between nextRAD sequences (75 bp) and longer expressed sequences (≈300–11,600 bp) using blastn v.2.2.31 [36], with a threshold of at least 50 aligned nucleotides, a maximum of one mismatch and no gaps.
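The alignment thresholds (at least 50 aligned nucleotides, at most one mismatch, no gaps) can be applied to blastn results as a simple filter. The sketch below assumes the hits were written in BLAST tabular format (`-outfmt 6`) with the default column order, which the text does not state. It covers only the threshold step, not the reciprocal alignment, and the sequence names are hypothetical.

```python
from typing import Iterable, List, Tuple


def filter_hits(lines: Iterable[str],
                min_length: int = 50,
                max_mismatch: int = 1) -> List[Tuple[str, str]]:
    """Return (query, subject) pairs that pass the alignment thresholds.

    Assumed default outfmt 6 columns: qseqid, sseqid, pident, length,
    mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore.
    """
    kept = []
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.rstrip("\n").split("\t")
        qseqid, sseqid = fields[0], fields[1]
        length, mismatch, gapopen = int(fields[3]), int(fields[4]), int(fields[5])
        if length >= min_length and mismatch <= max_mismatch and gapopen == 0:
            kept.append((qseqid, sseqid))
    return kept


# Hypothetical tabular hits.
hits = [
    "rad001\ttx0001\t100.00\t75\t0\t0\t1\t75\t210\t284\t1e-35\t139",
    "rad002\ttx0042\t98.67\t75\t1\t0\t1\t75\t55\t129\t4e-33\t134",
    "rad003\ttx0042\t97.33\t75\t2\t0\t1\t75\t900\t974\t2e-31\t128",  # 2 mismatches
    "rad004\ttx0107\t100.00\t40\t0\t0\t1\t40\t12\t51\t3e-16\t75",    # too short
    "rad005\ttx0230\t98.36\t61\t0\t1\t1\t60\t300\t360\t1e-24\t104",  # gapped
]

passing = filter_hits(hits)
print(passing)
```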
