Detection of Genetic Introgression between Breeds

Laura Buggiotti
Andrey A Yurchenko
Nikolay S Yudin
Christy J Vander Jagt
Nadezhda V Vorobieva
Mariya A Kusliy
Sergei K Vasiliev
Andrey N Rodionov
Oksana I Boronetskaya
Natalia A Zinovieva
Alexander S Graphodatsky
Hans D Daetwyler
Denis M Larkin

We applied a robust forward–backward algorithm implemented in RFMix (Maples et al. 2013) to screen the autosomes of Yakut cattle for putative taurine or indicine haplotypes. This algorithm uses designated reference haplotypes to infer local ancestry in designated admixed haplotypes; we therefore selected five genetic groups as a reference panel: European taurine (Holstein), Russian taurine (Kholmogory), African indicine (Ogaden), Chinese indicine (Wannan, Ji’an, and Leiquiong), and Indian indicine (Brahman). Window size was set to three (-w 3), and the option “--reanalyze-reference” was used with three iterations to analyze the reference haplotypes as if they were query haplotypes (Maples et al. 2013).
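
To illustrate what a forward–backward pass computes in local ancestry inference, the following is a minimal sketch on a toy hidden Markov model. It is not RFMix's implementation: the two ancestry states, the per-window likelihoods in `window_likelihoods`, and the `stay` probability are all hypothetical values chosen for illustration.

```python
def forward_backward(emissions, stay=0.9):
    """Posterior probability of each ancestry state in each window.

    emissions[t][k] is a hypothetical likelihood of the data in window t
    given ancestry k; `stay` is the assumed probability that adjacent
    windows share the same ancestry.
    """
    n_states = len(emissions[0])
    switch = (1.0 - stay) / (n_states - 1)
    trans = [[stay if i == j else switch for j in range(n_states)]
             for i in range(n_states)]

    def normalise(values):
        total = sum(values)
        return [v / total for v in values]

    # Forward pass: evidence from the first window up to window t.
    fwd = [normalise([e / n_states for e in emissions[0]])]
    for obs in emissions[1:]:
        prev = fwd[-1]
        fwd.append(normalise([
            obs[j] * sum(prev[i] * trans[i][j] for i in range(n_states))
            for j in range(n_states)]))

    # Backward pass: evidence from window t+1 to the last window.
    bwd = [[1.0] * n_states]
    for obs in reversed(emissions[1:]):
        nxt = bwd[0]
        bwd.insert(0, normalise([
            sum(trans[i][j] * obs[j] * nxt[j] for j in range(n_states))
            for i in range(n_states)]))

    # Combine both passes into per-window posteriors.
    return [normalise([f * b for f, b in zip(f_row, b_row)])
            for f_row, b_row in zip(fwd, bwd)]


# Toy example: state 0 = "taurine", state 1 = "indicine" (labels are illustrative).
window_likelihoods = [[0.9, 0.1], [0.8, 0.2], [0.5, 0.5], [0.2, 0.8], [0.1, 0.9]]
posteriors = forward_backward(window_likelihoods)
for index, row in enumerate(posteriors):
    print(index, [round(p, 3) for p in row])
```

Because each window's posterior draws on evidence from both sides, an uninformative window (the middle one here) is assigned by its neighbours rather than left at its own flat likelihood.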

We conducted TreeMix v. 1.12 (Pickrell and Pritchard 2012) analyses to infer the relationships, divergence, and major mixtures among 18 cattle breeds and five Bovinae species (supplementary information 1, Supplementary Material online). We applied the option “-root YAK”, which sets Yak as the position of the root; the option “-k 1000”, which builds the tree using blocks of 1000 SNPs to account for linkage disequilibrium; and the option “-se”, which calculates standard errors of migration proportions. We allowed up to 15 migration edges on the tree (m ranging from 0 to 15) and generated a residual heatmap after adding each migration edge to identify populations that were not well modeled. The percentages of variation explained by the maximum likelihood trees were also calculated. Migration edges were added until the model explained 99.9% of the variance in ancestry between populations. We also ensured that the incorporated migration edges were statistically significant. Residuals from the model were visualized using the R script implemented in TreeMix.

Finally, to provide further support for past admixture between populations, we calculated f3 and D statistics using ADMIXTOOLS (v. 5.1) with default parameters (Patterson et al. 2012). We calculated f3 statistics of the form (X; A, B) using qp3Pop, where a negative value implies that X is admixed with populations close to A and B. We considered negative statistics with Z-score values below −2 as significant signals of admixture.
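
The logic of the f3 test can be sketched on simulated allele frequencies. This is a simplified illustration, not qp3Pop itself: it uses the uncorrected estimator, the mean of (x − a)(x − b) over SNPs, and a plain delete-one-block jackknife with equal-sized blocks, whereas ADMIXTOOLS applies additional corrections. All function names, frequencies, and block counts below are assumptions.

```python
import math
import random


def f3(x, a, b):
    """Uncorrected f3(X; A, B): mean of (x - a) * (x - b) over SNPs."""
    return sum((xi - ai) * (xi - bi) for xi, ai, bi in zip(x, a, b)) / len(x)


def f3_with_z(x, a, b, n_blocks=20):
    """Return (f3, Z) using a delete-one-block jackknife over contiguous SNPs."""
    size = len(x) // n_blocks
    leave_one_out = []
    for k in range(n_blocks):
        lo, hi = k * size, (k + 1) * size
        # Recompute the statistic with block k removed.
        leave_one_out.append(
            f3(x[:lo] + x[hi:], a[:lo] + a[hi:], b[:lo] + b[hi:]))
    mean = sum(leave_one_out) / n_blocks
    variance = (n_blocks - 1) / n_blocks * sum(
        (estimate - mean) ** 2 for estimate in leave_one_out)
    estimate = f3(x, a, b)
    return estimate, estimate / math.sqrt(variance)


random.seed(1)
n_snps = 2000
freq_a = [random.uniform(0.05, 0.95) for _ in range(n_snps)]
freq_b = [random.uniform(0.05, 0.95) for _ in range(n_snps)]
# Hypothetical target population: an even mixture of A and B with no later drift.
freq_x = [0.5 * ai + 0.5 * bi for ai, bi in zip(freq_a, freq_b)]

f3_value, z_score = f3_with_z(freq_x, freq_a, freq_b)
print(f"f3 = {f3_value:.4f}, Z = {z_score:.1f}")
```

In this toy case X lies exactly between A and B at every SNP, so each term equals −0.25(a − b)² and the statistic is negative, which is the signature the test looks for.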

We computed D statistics using qpDstat. D statistics of the form D(A, B, X, Y) were used to test the null hypothesis that the unrooted tree topology is ((A, B), (X, Y)). A positive value indicates that either A and X, or B and Y, share more drift than expected under the null hypothesis. We report D statistics as Z-scores computed using the default block jackknife parameters.
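
The sign convention can be checked with a small sketch that computes D from allele frequencies, using the ratio form commonly given for this statistic: the sum of (a − b)(x − y) divided by the sum of (a + b − 2ab)(x + y − 2xy). This is an illustration on simulated data rather than qpDstat's code, and the populations, drift magnitudes, and variable names are hypothetical.

```python
import random


def d_statistic(a, b, x, y):
    """D(A, B, X, Y) from per-SNP allele frequencies (ratio-of-sums form)."""
    numerator = sum((ai - bi) * (xi - yi)
                    for ai, bi, xi, yi in zip(a, b, x, y))
    denominator = sum((ai + bi - 2 * ai * bi) * (xi + yi - 2 * xi * yi)
                      for ai, bi, xi, yi in zip(a, b, x, y))
    return numerator / denominator


random.seed(7)
n_snps = 5000
ancestral = [random.uniform(0.2, 0.8) for _ in range(n_snps)]
# Hypothetical drift shared only by populations A and X.
shared = [random.uniform(-0.1, 0.1) for _ in range(n_snps)]


def noise():
    """Small independent drift specific to one population."""
    return random.uniform(-0.02, 0.02)


freq_a = [p + s + noise() for p, s in zip(ancestral, shared)]
freq_x = [p + s + noise() for p, s in zip(ancestral, shared)]
freq_b = [p + noise() for p in ancestral]
freq_y = [p + noise() for p in ancestral]

d_ax = d_statistic(freq_a, freq_b, freq_x, freq_y)
d_swapped = d_statistic(freq_a, freq_b, freq_y, freq_x)
print(f"D(A, B, X, Y) = {d_ax:.3f}; D(A, B, Y, X) = {d_swapped:.3f}")
```

Here A and X share drift, so D(A, B, X, Y) comes out positive, and swapping X and Y flips the sign. A significance test would additionally need a standard error, for example from a block jackknife over the genome.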
