Population genomic analyses

Leonardo Campagna; Márcio Repenning; Luís Fábio Silveira; Carla Suertegaray Fontana; Pablo L. Tubaro; Irby J. Lovette

This method is extracted from research article: Sci Adv, May 2017

Repeated divergent selection on pigmentation genes in a rapid finch radiation

DOI: 10.1126/sciadv.1602404

We searched for divergent regions of the genome by calculating F_ST with VCFtools version 0.1.14 (50), using the five southern capuchino species with sample sizes of 12 individuals (hypox, mel, nig, pal, and pil). For each of the 10 possible pairwise comparisons among these five species (for example, nig versus mel, hypox versus pal), we calculated F_ST in three ways: (i) as average values for nonoverlapping 25-kb windows, (ii) as average values for nonoverlapping 5-kb windows on scaffolds of interest, and (iii) as values for individual SNPs. We built Manhattan plots and conducted PCA in R version 3.3.0 (51) with the packages “qqman” and SNPRelate version 3.3 (52), respectively. The PCA, derived from 11.5 million SNPs, was run both with and without four outlier individuals (two S. melanogaster and two S. pileata; see details in fig. S1). Downstream analyses were conducted with and without these four individuals and produced similar results.
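The three F_ST strategies could be driven by a small script such as the sketch below, which only builds the VCFtools argument lists and does not run them. The VCF file name and the per-species sample-list files (`hypox.txt` and so on) are hypothetical placeholders, not names from the original study. The sketch relies on the VCFtools options `--weir-fst-pop` and `--fst-window-size`; leaving out a window step is assumed to give nonoverlapping windows.

```python
from itertools import combinations

# Species codes as used in the text.
SPECIES = ["hypox", "mel", "nig", "pal", "pil"]


def fst_commands(vcf_path, window_size=None):
    """Build one VCFtools argument list per species pair.

    window_size=None requests per-SNP F_ST; an integer requests
    windowed F_ST with windows of that many base pairs.
    Sample-list file names (<species>.txt) are hypothetical.
    """
    commands = []
    for a, b in combinations(SPECIES, 2):
        suffix = f"{window_size}bp" if window_size else "per_snp"
        cmd = [
            "vcftools", "--vcf", vcf_path,
            "--weir-fst-pop", f"{a}.txt",
            "--weir-fst-pop", f"{b}.txt",
            "--out", f"{a}_vs_{b}_{suffix}",
        ]
        if window_size is not None:
            cmd += ["--fst-window-size", str(window_size)]
        commands.append(cmd)
    return commands


# (i) 25-kb windows, (ii) 5-kb windows, (iii) individual SNPs.
windows_25kb = fst_commands("capuchinos.vcf", 25_000)
windows_5kb = fst_commands("capuchinos.vcf", 5_000)
per_snp = fst_commands("capuchinos.vcf")

print(len(windows_25kb), "pairwise comparisons")
print(" ".join(windows_25kb[0]))
```

Each argument list can be passed to `subprocess.run` once the input files exist. For strategy (ii), the input would first be restricted to the scaffolds of interest.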

We identified divergence peaks in the 10 pairwise comparisons using the average F_ST value calculated for the nonoverlapping 25-kb windows, discarding regions with fewer than two windows and windows with fewer than 10 SNPs. We took a conservative approach and selected only regions with an average F_ST above 0.2. Because the average F_ST across all comparisons was 0.008, this threshold selected only regions that fall between 12 and 13 SDs above the F_ST mean (which implies an SD of roughly 0.015 to 0.016). We subsequently narrowed our selection of candidate regions by retaining only those that had at least one individual SNP with an F_ST of 0.85 or higher. We thus filtered out regions with an elevated average F_ST that did not contain individual outlier sites that could be putative targets of selection. In total, we identified 25 divergent regions across the 10 possible pairwise F_ST comparisons.
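The filtering steps above can be sketched as follows. The text does not say how windows were grouped into regions, so this sketch assumes a region is a run of directly adjacent 25-kb windows that all pass the SNP-count and F_ST thresholds. The `Window` record and the function name are illustrative, not taken from the authors' pipeline.

```python
from dataclasses import dataclass


@dataclass
class Window:
    scaffold: str
    start: int       # 1-based start coordinate of the window
    n_snps: int      # number of SNPs in the window
    mean_fst: float  # average F_ST for the window


def find_peaks(windows, snp_fst, size=25_000, min_snps=10,
               min_fst=0.2, min_windows=2, snp_cutoff=0.85):
    """Return candidate regions as dicts with scaffold, start, end, n_windows.

    snp_fst maps scaffold -> list of (position, per-SNP F_ST).
    Assumption: a region is a run of adjacent passing windows.
    """
    passing = sorted(
        (w for w in windows if w.n_snps >= min_snps and w.mean_fst > min_fst),
        key=lambda w: (w.scaffold, w.start),
    )
    regions = []
    for w in passing:
        last = regions[-1] if regions else None
        if last and last["scaffold"] == w.scaffold and w.start == last["end"] + 1:
            # Extend the current region by one adjacent window.
            last["end"] = w.start + size - 1
            last["n_windows"] += 1
        else:
            regions.append({"scaffold": w.scaffold, "start": w.start,
                            "end": w.start + size - 1, "n_windows": 1})
    peaks = []
    for r in regions:
        if r["n_windows"] < min_windows:
            continue  # discard regions with fewer than two windows
        sites = snp_fst.get(r["scaffold"], [])
        # Keep the region only if it holds at least one outlier SNP.
        if any(r["start"] <= pos <= r["end"] and fst >= snp_cutoff
               for pos, fst in sites):
            peaks.append(r)
    return peaks


# Toy data, invented for illustration only.
toy_windows = [
    Window("s1", 1, 40, 0.30),
    Window("s1", 25_001, 35, 0.25),
    Window("s1", 50_001, 50, 0.05),
    Window("s2", 1, 20, 0.40),
]
toy_snps = {"s1": [(30_000, 0.90)], "s2": [(500, 0.95)]}
toy_peaks = find_peaks(toy_windows, toy_snps)
print(toy_peaks)
```

In the toy data, the two adjacent windows on `s1` form one region containing an outlier SNP, whereas the single elevated window on `s2` is discarded despite its outlier SNP.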

We estimated absolute sequence divergence by calculating the summary statistic Dxy for each site and averaging it over nonoverlapping 5-kb windows with a custom Perl script. Dxy was calculated as the minor allele frequency in species A times the major allele frequency in species B plus the product of the major allele frequency in species A and the minor allele frequency in species B; that is, Dxy = p_A(1 − p_B) + (1 − p_A)p_B, where p is the frequency of the minor allele in each species. The per-site minor allele frequency was obtained using ANGSD version 0.911 (53).
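The authors' Perl script is not reproduced here. The following Python sketch shows one way to implement the per-site definition above and the 5-kb window average. It assumes that both frequencies at a site refer to the same allele and that each window is averaged over the sites supplied to the function; the text does not specify either detail.

```python
from collections import defaultdict


def dxy_site(p_a, p_b):
    """Per-site Dxy from the frequency of the same allele in species A and B."""
    return p_a * (1 - p_b) + (1 - p_a) * p_b


def dxy_windows(sites, size=5_000):
    """Average per-site Dxy over nonoverlapping windows on one scaffold.

    sites: iterable of (position, freq_in_A, freq_in_B), 1-based positions.
    Returns {window_start: mean Dxy}, averaging over the supplied sites
    (an assumption; the original averaging denominator is not stated).
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for pos, p_a, p_b in sites:
        start = (pos - 1) // size * size + 1  # 1, 5001, 10001, ...
        sums[start] += dxy_site(p_a, p_b)
        counts[start] += 1
    return {start: sums[start] / counts[start] for start in sorted(sums)}


# Toy frequencies, invented for illustration only.
toy_sites = [(100, 1.0, 0.0), (4_000, 0.0, 0.0), (5_001, 0.5, 0.5)]
toy_dxy = dxy_windows(toy_sites)
print(toy_dxy)
```

A fixed difference between the two species gives a per-site Dxy of 1, and a site that is monomorphic for the same allele in both gives 0.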

We estimated linkage disequilibrium (LD) by calculating the r² statistic in VCFtools. The calculations were carried out with the 99 SNPs that showed fixed differences (F_ST = 1) in at least one pairwise comparison between species. When more than one pair of sites could be compared between two peaks, we recorded both the average and the highest r² value. Calculations were conducted for each species separately and for all taxa pooled together. For the per-species calculations, we included one outgroup individual from each of the remaining species because, in many cases, a position was not variable within a species and otherwise could not be used to calculate LD.
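To illustrate the r² summary between two peaks, the sketch below computes r² as the squared Pearson correlation between genotype counts at two sites. This is an assumption: the text does not state whether the VCFtools calculation used unphased genotypes or phased haplotypes. The function names are illustrative.

```python
def genotype_r2(g1, g2):
    """Squared correlation between genotype counts at two sites.

    g1, g2: per-individual counts of one allele (0, 1, or 2), None if missing.
    Returns None when r² is undefined, for example at a monomorphic site.
    """
    pairs = [(a, b) for a, b in zip(g1, g2) if a is not None and b is not None]
    n = len(pairs)
    if n < 2:
        return None
    mean_a = sum(a for a, _ in pairs) / n
    mean_b = sum(b for _, b in pairs) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in pairs)
    var_a = sum((a - mean_a) ** 2 for a, _ in pairs)
    var_b = sum((b - mean_b) ** 2 for _, b in pairs)
    if var_a == 0 or var_b == 0:
        return None  # no variation at one site, so r² cannot be computed
    return cov * cov / (var_a * var_b)


def summarize_between_peaks(peak1_sites, peak2_sites):
    """Average and highest r² over all site pairs between two peaks."""
    values = [
        r2
        for s1 in peak1_sites
        for s2 in peak2_sites
        if (r2 := genotype_r2(s1, s2)) is not None
    ]
    if not values:
        return None
    return sum(values) / len(values), max(values)


# Toy genotypes for four individuals, invented for illustration only.
peak1 = [[0, 0, 2, 2]]
peak2 = [[0, 0, 2, 2], [0, 2, 0, 2]]
toy_summary = summarize_between_peaks(peak1, peak2)
print(toy_summary)
```

The `None` returned for a monomorphic site shows why a position that does not vary within a species cannot contribute to LD unless individuals carrying the other allele are included.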

This is an open-access article distributed under the terms of the Creative Commons Attribution-NonCommercial license, which permits use, distribution, and reproduction in any medium, so long as the resultant use is not for commercial advantage and provided the original work is properly cited.
