Genetic diversity was investigated by comparing the Fulani and mixed-breed animals from Cameroon (n = 212) with 144 reference animals that passed quality control checks (Supplementary Table 1).
Principal components analysis (PCA) was performed using the pca function of PLINK v1.90 (Chang et al., 2015; Purcell and Chang, 2018) to provide insight into the population structure of the cattle breeds.
Next, hierarchical clustering was performed on the genome-wide identity-by-state (IBS) pairwise distances between individuals using the SNPRelate package in R version 3.5.0 (Zheng et al., 2012; R Core Team, 2018). Subgroups of individuals in the hierarchical cluster analysis were defined using a Z-score threshold of 15 based upon individual dissimilarities. An outlier threshold of 5 was also set, meaning that groups with ≤ 5 animals were considered outliers. For comparison, the dendrogram was redrawn using breed and population type to determine the groups.
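As an illustration of the distance that feeds the clustering step, the following minimal Python sketch computes a pairwise IBS distance matrix. It is not the SNPRelate implementation: the 0/1/2 genotype coding, the use of None for missing calls, the choice of 1 minus the IBS proportion as the distance, and the function names are all assumptions made for this example.

```python
def ibs_distance(g1, g2):
    """Return 1 - proportion of alleles shared identical-by-state.

    Genotypes are assumed to be coded as 0, 1 or 2 copies of one allele,
    with None marking a missing call (an assumption of this sketch).
    """
    shared = 0
    compared = 0
    for a, b in zip(g1, g2):
        if a is None or b is None:
            continue  # skip SNPs not called in both individuals
        shared += 2 - abs(a - b)  # 2, 1 or 0 alleles shared at this SNP
        compared += 2
    if compared == 0:
        raise ValueError("no SNPs genotyped in both individuals")
    return 1.0 - shared / compared


def ibs_distance_matrix(genotypes):
    """Symmetric matrix of pairwise IBS distances for a list of individuals."""
    n = len(genotypes)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = ibs_distance(genotypes[i], genotypes[j])
            dist[i][j] = dist[j][i] = d
    return dist
```

A matrix of this kind is what a hierarchical clustering routine would take as input; the Z-score and outlier thresholds described above are applied afterwards, when the tree is cut into groups.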
Population genetic structure was also evaluated using the ADMIXTURE software tool (Alexander et al., 2009) to determine the European taurine, Asian zebu (Bos indicus) and African taurine ancestries at the genome-wide level. Variants in high linkage disequilibrium (LD) with each other were removed prior to analysis. The LD pruning criterion was to remove any SNP with an r² > 0.2 with another SNP within a 200-SNP window, with the window sliding 10 SNPs at a time; this left 55,361 of the 500,929 markers for analysis. A 5-step expectation–maximization (EM) algorithm was used. In addition, 10-fold cross-validation was performed with 200 bootstrap resampling runs to estimate the standard errors for each cluster level (K = 2–12). The output was plotted using the pophelper package for R (Francis, 2017). The optimal number of clusters was determined from the cross-validation plot.
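The pruning rule can be sketched in Python as follows. This is a simplified greedy version written for illustration, not the tool used in the study: r² is taken here as the squared Pearson correlation of 0/1/2 genotype counts, the later SNP of a correlated pair is always the one dropped, missing genotypes are not handled, and the function names are hypothetical.

```python
def genotype_r2(x, y):
    """Squared Pearson correlation between two SNPs' 0/1/2 genotype counts."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a monomorphic SNP carries no LD information
    return (sxy * sxy) / (sxx * syy)


def ld_prune(snps, window=200, step=10, r2_max=0.2):
    """Return indices of SNPs kept after window-based LD pruning.

    snps is a list of per-SNP genotype lists, ordered along the chromosome.
    Within each window, the later SNP of any pair with r2 > r2_max is dropped
    (a simplification assumed for this sketch).
    """
    keep = [True] * len(snps)
    start = 0
    while start < len(snps):
        end = min(start + window, len(snps))
        for i in range(start, end):
            if not keep[i]:
                continue
            for j in range(i + 1, end):
                if keep[j] and genotype_r2(snps[i], snps[j]) > r2_max:
                    keep[j] = False
        start += step  # slide the window forward by `step` SNPs
    return [i for i, kept in enumerate(keep) if kept]
```

The defaults mirror the parameters stated above (200-SNP window, step of 10 SNPs, r² threshold of 0.2). The sketch recomputes pairs in overlapping windows, which is wasteful but keeps the logic easy to follow.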
To provide further insight into the composition of the Fulani and Mixed groups relative to the other groups, the ADMIXTURE analysis was also run in supervised mode, with the European taurine, African taurine, Asian zebu (Bos indicus) and Admixed groups pre-specified.
To estimate the degree of genetic differentiation in the cattle populations, fixation indices (Fst) were calculated using the Weir and Hill relative beta estimator as implemented by the snpgdsFst function in the SNPRelate package (Weir and Hill, 2002; Zheng et al., 2012; Buckleton et al., 2016). In addition, the het function of PLINK v1.90 (Purcell and Chang, 2018) was used to calculate the inbreeding coefficient estimate, F, for each individual.
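For orientation, a method-of-moments inbreeding estimate of the kind reported by a heterozygosity check compares an individual's observed count of homozygous genotypes with the count expected from allele frequencies. The Python sketch below shows that calculation under stated assumptions: genotypes are coded 0/1/2 with None for missing calls, allele frequencies are supplied by the caller, no small-sample correction is applied, and the function name is hypothetical.

```python
def inbreeding_f(genotypes, allele_freqs):
    """Method-of-moments F = (O_hom - E_hom) / (N - E_hom) for one individual.

    genotypes: 0/1/2 allele counts per SNP, None for a missing call.
    allele_freqs: frequency p of the counted allele at each SNP.
    Expected homozygosity per SNP is taken as 1 - 2p(1 - p), i.e.
    Hardy-Weinberg proportions (an assumption of this sketch).
    """
    observed_hom = 0
    expected_hom = 0.0
    n_called = 0
    for g, p in zip(genotypes, allele_freqs):
        if g is None:
            continue  # missing genotypes do not contribute
        n_called += 1
        if g != 1:
            observed_hom += 1  # 0 and 2 are the homozygous classes
        expected_hom += 1.0 - 2.0 * p * (1.0 - p)
    denominator = n_called - expected_hom
    if denominator == 0:
        raise ValueError("no polymorphic SNPs with called genotypes")
    return (observed_hom - expected_hom) / denominator
```

Positive values indicate more homozygous genotypes than expected, and negative values indicate fewer.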
Differences in the inbreeding coefficient may also be due to the ascertainment bias of the SNP array. To assess this ascertainment bias, linkage disequilibrium (r² and D′) was calculated between pairs of SNPs within a 50-SNP window using PLINK v1.90 (Purcell and Chang, 2018) for each breed. The r² was then plotted against the window size (bp distance between the first and last SNP within the window) and a second-degree polynomial was fitted to show the trend in r² for each breed. The average r² and D′ were also calculated for each breed.
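The trend line can be illustrated with an ordinary least-squares fit of a second-degree polynomial. The Python sketch below solves the normal equations directly; it is an assumption-laden example rather than the plotting code used in the study, and the function name and the suggestion to rescale distances are not from the source.

```python
def fit_quadratic(xs, ys):
    """Least-squares fit of y = c0 + c1*x + c2*x**2; returns [c0, c1, c2].

    xs would hold the distances and ys the matching r2 values. Rescaling
    distances (for example from bp to Mb) before fitting is advisable,
    because the normal equations involve sums of x**4.
    """
    # Normal equations: sum(x**(i+j)) * c_j = sum(y * x**i), for i, j in 0..2
    power_sums = [sum(x ** k for x in xs) for k in range(5)]
    rows = [
        [power_sums[i + j] for j in range(3)]
        + [sum(y * x ** i for x, y in zip(xs, ys))]
        for i in range(3)
    ]
    # Gaussian elimination with partial pivoting on the 3x4 augmented matrix
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(rows[r][col]))
        if abs(rows[pivot][col]) < 1e-12:
            raise ValueError("need at least three distinct x values")
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(col + 1, 3):
            factor = rows[r][col] / rows[col][col]
            for k in range(col, 4):
                rows[r][k] -= factor * rows[col][k]
    # Back substitution
    coeffs = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):
        rest = sum(rows[i][k] * coeffs[k] for k in range(i + 1, 3))
        coeffs[i] = (rows[i][3] - rest) / rows[i][i]
    return coeffs
```

Fitting one curve per breed and overlaying them gives the per-breed comparison of LD decay described above.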