Admixture proportions for each population were inferred from GL using NGSadmix [32]. We took the Beagle file produced with the same filters used to investigate population structure with PCAngsd and randomly thinned it to one million sites for computational practicality. For each value of K from 2 to 9, we ran NGSadmix repeatedly until the model converged, defined as the three highest-likelihood runs falling within 10 log-likelihood units of each other, or until a limit of 4000 independent runs was reached without convergence. K = 9 did not converge after 4000 independent runs, likely because of the limited number of samples per population. Model-based analyses of population structure make a set of assumptions about the data (e.g., that individuals are unrelated, that populations are in HWE, that sites exhibit no LD, and that each ancestral population is represented by multiple unadmixed individuals with no subsequent drift). Therefore, for each K, we calculated the correlation of residuals for each pair of individuals using evalAdmix [33] to evaluate model fit and to test whether the data violated any of these assumptions for K ancestral clusters.
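The stopping rule described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the `run_once` callback (a hypothetical wrapper that would launch one NGSadmix run with a given seed and return its final log-likelihood), and the choice to treat "within 10 units" as inclusive are all assumptions made for illustration.

```python
from typing import Callable, List, Sequence, Tuple


def has_converged(log_likelihoods: Sequence[float],
                  top_n: int = 3,
                  tolerance: float = 10.0) -> bool:
    """Return True if the top_n best runs lie within `tolerance` log-likelihood units.

    Assumption: the range is inclusive (a spread of exactly 10 counts as converged).
    """
    if len(log_likelihoods) < top_n:
        return False
    # Higher log-likelihood is better, so sort in descending order.
    best = sorted(log_likelihoods, reverse=True)[:top_n]
    return best[0] - best[-1] <= tolerance


def run_until_converged(run_once: Callable[[int], float],
                        max_runs: int = 4000,
                        top_n: int = 3,
                        tolerance: float = 10.0) -> Tuple[bool, List[float]]:
    """Repeat independent runs until convergence or until max_runs is reached.

    `run_once` is a hypothetical callback: given a seed, it performs one
    independent run and returns that run's final log-likelihood.
    """
    log_likelihoods: List[float] = []
    for seed in range(1, max_runs + 1):
        log_likelihoods.append(run_once(seed))
        if has_converged(log_likelihoods, top_n, tolerance):
            return True, log_likelihoods
    return False, log_likelihoods
```

In such a setup, the loop would be executed once per value of K, and a return value of `False` would correspond to the non-convergence reported for K = 9.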