Population Structure and Admixture Analysis Using CHROMOPAINTERv2, fineSTRUCTUREv4, and GLOBETROTTER

Brian Martin Babigumira
Johann Sölkner
Gábor Mészáros
Christina Pfeiffer
Craig R. G. Lewis
Emily Ouma
Maria Wurzinger
Karen Marshall

To support the ADMIXTURE and MDS analyses, we analyzed the data using the CHROMOPAINTERv2/fineSTRUCTUREv4 pipeline, supported by the Perl scripts provided with the programs (Lawson et al., 2012). The data were phased using SHAPEIT2 (Delaneau et al., 2013). First, a custom R script (R Core Team, 2020) was run to prepare the genetic map for each chromosome, as required by SHAPEIT2, based on the Sus scrofa recombination map (Tortereau et al., 2012). We ran quality control (--geno 0.2) and split the data by autosomal chromosome using PLINK1.9 (Chang et al., 2015). To achieve a successful run with these QC measures (considering the size of individual populations and the number of variants), we included the --force flag in the SHAPEIT2 command line. We ran the impute2chromopainter.pl script to convert the SHAPEIT2 output into the phase format used by CHROMOPAINTERv2. Next, we ran the convertrecfile.pl script to generate recombination files, using as inputs the phase files from the previous step and the genetic maps based on the Sus scrofa recombination map (Tortereau et al., 2012). We used the default settings for both scripts and specified the HapMap format when using the latter. We then ran CHROMOPAINTERv2 (Lawson et al., 2012) on the phase and recombination files twice: the first run estimated the nuisance parameters, and the second generated the co-ancestry matrix using the linked model. The Expectation-Maximization (E-M) iteration was run in automatic mode (“fs”) with the entire dataset for all autosomal chromosomes. Each animal was conditioned on the others in 10 E-M iterations using a sample of ten animals. The main outputs were two inferred nuisance parameters: Ne, which is somewhat similar to the effective population size, and mu, the mutation/switch rate (Hellenthal, 2012). These parameters (Ne = 34.7106 and mu = 0.00500584) were fixed in the CHROMOPAINTERv2 algorithm in the second run.
The main outputs of the second run were an estimate of the c-factor (effective number of chunks; c = 0.17931) and the copying vectors. These outputs were fed into the Bayesian clustering algorithm of fineSTRUCTUREv4 for all autosomes.
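
The QC, per-chromosome split, and phasing steps described above could be scripted roughly as follows. This is only a sketch of how the command lines might be assembled, not the authors' actual script. The file prefixes (`pigs`, `qc`, `phased`), the genetic map file names, the executable names `plink` and `shapeit`, and the use of `--chr-set 18` for the 18 pig autosomes are all assumptions made for illustration. Applying `--geno 0.2` in the same PLINK call as the split is also an assumption about how the two steps were combined. Only `--geno 0.2` and `--force` come from the text.

```python
from typing import List

# Sus scrofa has 18 autosomes; the sex chromosomes are excluded here.
AUTOSOMES = range(1, 19)


def plink_split_cmd(bfile: str, chrom: int, out_prefix: str) -> List[str]:
    """Build a PLINK 1.9 command that applies --geno 0.2 and keeps one autosome.

    `bfile` and `out_prefix` are hypothetical file prefixes.
    """
    return [
        "plink",
        "--bfile", bfile,
        "--chr-set", "18",      # assumption: declare 18 autosomes for pig data
        "--chr", str(chrom),
        "--geno", "0.2",        # per-variant missingness filter from the text
        "--make-bed",
        "--out", f"{out_prefix}.chr{chrom}",
    ]


def shapeit_cmd(bed_prefix: str, genetic_map: str, out_prefix: str) -> List[str]:
    """Build a SHAPEIT2 phasing command for one chromosome, including --force."""
    return [
        "shapeit",
        "--input-bed", f"{bed_prefix}.bed", f"{bed_prefix}.bim", f"{bed_prefix}.fam",
        "--input-map", genetic_map,
        "--output-max", f"{out_prefix}.haps", f"{out_prefix}.sample",
        "--force",              # flag named in the text
    ]


def build_pipeline(bfile: str = "pigs") -> List[List[str]]:
    """Return the PLINK and SHAPEIT2 commands for every autosome, in order."""
    commands = []
    for chrom in AUTOSOMES:
        commands.append(plink_split_cmd(bfile, chrom, "qc"))
        commands.append(
            shapeit_cmd(
                f"qc.chr{chrom}",
                f"genetic_map_chr{chrom}.txt",  # hypothetical map file name
                f"phased.chr{chrom}",
            )
        )
    return commands


if __name__ == "__main__":
    # Print the commands instead of executing them, so they can be reviewed first.
    for cmd in build_pipeline():
        print(" ".join(cmd))
```

Printing the commands rather than running them keeps the sketch safe to try without the tools installed. Each list can be passed to `subprocess.run` once the paths have been checked.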

To further investigate admixture in the Ugandan pig population used in this study, we exploited the analytical capabilities of GLOBETROTTER (Hellenthal et al., 2014). The Bayesian clustering algorithm of fineSTRUCTUREv4 identified 40 clusters, which, when grouped, generally did not differ from our labeled data or the output from ADMIXTURE1.3. Therefore, we ran GLOBETROTTER to identify, date, and describe admixture in the Ugandan pigs, using as surrogates MS, DRC, IB, Modern European (CMB, LR, and LW), Old British (SB and LB) and LOC, with KAM or HOI as target (recipient) populations (Hellenthal et al., 2014; Hellenthal, 2020). We ran GLOBETROTTER with default settings for all parameters except “prop.ind,” “bootstrap.date.ind,” and “null.ind.” For the first run, we set “bootstrap.date.ind” to 0 and the other two to 1. In the second run, we set “prop.ind” to 0 and the other two to 1. For the third run, we set “null.ind” to 0 and the other two to 1 (Hellenthal, 2020). Here, we report the results from the last run.
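
The three GLOBETROTTER runs differ only in the three switches named above, so they can be summarized compactly. The sketch below records the settings exactly as stated in the text and renders them as `key: value` lines. That line layout is an assumption about the parameter file format, and the run labels (`run1` to `run3`) and the helper `render_param_lines` are hypothetical names introduced for illustration. All other parameters are left at their defaults and are not shown.

```python
from typing import Dict, List

# Settings of the three non-default switches, as described in the text.
RUNS: Dict[str, Dict[str, int]] = {
    "run1": {"prop.ind": 1, "bootstrap.date.ind": 0, "null.ind": 1},
    "run2": {"prop.ind": 0, "bootstrap.date.ind": 1, "null.ind": 1},
    "run3": {"prop.ind": 1, "bootstrap.date.ind": 1, "null.ind": 0},  # reported run
}


def render_param_lines(settings: Dict[str, int]) -> List[str]:
    """Render switches as 'key: value' lines (assumed parameter file layout)."""
    return [f"{key}: {value}" for key, value in settings.items()]


if __name__ == "__main__":
    for label, settings in RUNS.items():
        print(f"# {label}")
        for line in render_param_lines(settings):
            print(line)
```

Keeping the three configurations in one table makes it easy to confirm that each run sets exactly one switch to 0.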
