Population Differentiation

Pamela Wiener
Christelle Robert
Abulgasim Ahbara
Mazdak Salavati
Ayele Abebe
Adebabay Kebede
David Wragg
Juliane Friedrich
Deepali Vasoya
David A Hume
Appolinaire Djikeng
Mick Watson
James G D Prendergast
Olivier Hanotte
Joram M Mwacharo
Emily L Clark

The population-branch statistic (PBS) (Yi et al. 2010) is designed to identify population-specific allele frequency changes; here it was used to identify alleles associated with adaptation to high altitude. A subset of five closely related populations sampled at high and low altitude (high, 2,610–2,783 m: AKD, AKR, MZ; low, 740–859 m: FKD, FSG) was selected for analysis in order to limit overall between-population differences, as suggested by Yi et al. (2010). The three high-altitude and two low-altitude populations were pooled into two groups to improve the power of the analysis. PBS, a function of pairwise FST values, was calculated for these two groups, with the population from Libya (LBR) as an outgroup.

First, pairwise FST (Weir and Cockerham 1984) was calculated for all markers between the pooled high-altitude, pooled low-altitude and LBR populations using vcftools (option: --weir-fst-pop) (https://vcftools.github.io/index.html) (Danecek et al. 2011). Negative FST values were set to 0. PBS (PBS raw) was calculated from these FST values, as described in Yi et al. (2010). To control for random variation at individual sites, means and medians of PBS (PBS mean, PBS median) were also calculated for 9-SNP windows across the genome. Compared with 11- and 13-SNP windows or windows based on physical size, this window definition was found to be the most suitable, or equally suitable, for estimating local genomic diversity, as it balanced capture of extreme signals against removal of stochastic effects (results not shown).

The number of markers for which FST was calculated (3,068,678) was reduced from the initial data set because some markers were fixed for the same allele across the five populations (but not across all 12 populations, a case already filtered in QC procedures). The number of markers was further reduced for the PBS analysis, since markers missing FST values for the high-altitude versus LBR or low-altitude versus LBR comparisons were removed.
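As an illustration of the calculation described above, the following minimal Python sketch computes PBS for the high-altitude group from three pairwise FST values, using the transformation T = -log(1 - FST) and PBS = (T_high,low + T_high,out - T_low,out) / 2 from Yi et al. (2010). The function names are hypothetical and this is not the authors' code. The window scheme (sliding windows of 9 consecutive SNPs, stepping one SNP at a time) and the treatment of FST = 1 as an infinite branch length are assumptions, since the text does not specify either.

```python
import math
from statistics import mean, median


def fst_to_branch_length(fst):
    """Return T = -log(1 - FST), with negative FST values set to 0 first."""
    fst = max(fst, 0.0)
    if fst >= 1.0:
        # Assumption: complete differentiation maps to an infinite branch length.
        return math.inf
    return -math.log(1.0 - fst)


def pbs(fst_high_low, fst_high_out, fst_low_out):
    """PBS for the high-altitude group, with LBR playing the outgroup role."""
    t_hl = fst_to_branch_length(fst_high_low)
    t_ho = fst_to_branch_length(fst_high_out)
    t_lo = fst_to_branch_length(fst_low_out)
    return (t_hl + t_ho - t_lo) / 2.0


def windowed(values, size=9):
    """Return (mean, median) for each run of `size` consecutive values.

    Assumption: windows slide by one SNP; the source does not say whether
    windows overlap.
    """
    summaries = []
    for start in range(len(values) - size + 1):
        window = values[start:start + size]
        summaries.append((mean(window), median(window)))
    return summaries
```

In practice the three FST inputs would come from the per-site output of the three vcftools comparisons, matched by chromosome and position, and `windowed` would be applied separately to each chromosome so that no window spans two chromosomes.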
The final numbers of markers were 3,007,909 (PBS raw) and 2,421,841 (PBS mean and PBS median). Genes located within 100 kb of SNPs in the top 0.00001 proportion (0.001%) of PBS statistics were catalogued.

Genome-wide average pairwise FST values (Weir and Cockerham 1984) were also calculated for all pairs of populations using vcftools, as described above. Negative FST values were again set to 0. An unrooted neighbor-joining tree (Saitou and Nei 1987) was reconstructed based on these FST values using Phylip (Felsenstein 2005).
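The cataloguing step can be sketched as follows. This is a hypothetical illustration rather than the authors' pipeline: the function names, the tuple layout for gene annotations, the rounding rule for the number of top SNPs (floor, with at least one SNP kept) and the choice to measure distance from the SNP to the nearest gene boundary are all assumptions.

```python
import math


def top_fraction(scores, proportion=0.00001):
    """Return indices of the highest-scoring entries.

    Assumption: the count is floor(n * proportion), with a minimum of 1.
    """
    n_top = max(1, math.floor(len(scores) * proportion))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:n_top]


def genes_near(chrom, pos, genes, max_dist=100_000):
    """Return names of genes lying less than `max_dist` bp from a SNP.

    `genes` is a list of (name, chrom, start, end) tuples (hypothetical layout).
    A SNP inside a gene has distance 0.
    """
    hits = []
    for name, gene_chrom, start, end in genes:
        if gene_chrom != chrom:
            continue
        if pos < start:
            dist = start - pos
        elif pos > end:
            dist = pos - end
        else:
            dist = 0
        if dist < max_dist:
            hits.append(name)
    return hits
```

With 3,007,909 PBS raw values, this rounding rule would select 30 SNPs; a different rounding or tie-handling convention could change that count slightly.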
