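To identify genomic regions affected by selection, we applied three selective sweep analyses: the fixation index (FST), pooled heterozygosity (Hp) and the cross-population composite likelihood ratio test (XP-CLR). FST between the two breeds was calculated in 25-kb non-overlapping windows using the “fst-sliding.pl” script in PoPoolation2 (Kofler et al., 2011), according to the Weir and Cockerham method (Weir and Cockerham, 1984). For two populations with different pooling strategies, the FST estimator is:

$$F_{ST} = 1 - \frac{2\,\dfrac{n_1 n_2}{n_1 + n_2}\,\dfrac{1}{n_1 + n_2 - 2}\left[n_1 \tilde{p}_1 (1 - \tilde{p}_1) + n_2 \tilde{p}_2 (1 - \tilde{p}_2)\right]}{\dfrac{n_1 n_2}{n_1 + n_2}\,(\tilde{p}_1 - \tilde{p}_2)^2 + \left(\dfrac{2 n_1 n_2}{n_1 + n_2} - 1\right)\dfrac{1}{n_1 + n_2 - 2}\left[n_1 \tilde{p}_1 (1 - \tilde{p}_1) + n_2 \tilde{p}_2 (1 - \tilde{p}_2)\right]}$$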
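where $n_i$ is the sample size and $\tilde{p}_i$ is the sample allele frequency of population $i$ (Bhatia et al., 2013). Hp and negative Z-transformed Hp (−ZHp) of the Mame Shiba Inu were calculated with a custom Python 3 script in 25-kb non-overlapping windows using the following formulas:

$$H_p = \frac{2 \sum n_{MAJ} \sum n_{MIN}}{\left(\sum n_{MAJ} + \sum n_{MIN}\right)^2}$$

$$ZH_p = \frac{H_p - \mu_{H_p}}{\sigma_{H_p}}$$

where $\sum n_{MAJ}$ and $\sum n_{MIN}$ are the sums of the major and minor allele counts over all SNPs in a window, and $\mu_{H_p}$ and $\sigma_{H_p}$ are the mean and standard deviation of Hp across all windows.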
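The Hp and −ZHp calculation can be illustrated with a minimal Python sketch. This is not the custom script used in the study; it assumes the standard pooled heterozygosity definition of Rubin et al. (2010), per-SNP major and minor allele counts as input, and the population (not sample) standard deviation for the Z transformation. The function names `pooled_heterozygosity` and `negative_zhp` are hypothetical.

```python
from statistics import fmean, pstdev


def pooled_heterozygosity(major_counts, minor_counts):
    """Hp of one window from per-SNP major and minor allele counts."""
    n_maj = sum(major_counts)
    n_min = sum(minor_counts)
    total = n_maj + n_min
    if total == 0:
        raise ValueError("window has no allele counts")
    return 2 * n_maj * n_min / total ** 2


def negative_zhp(hp_values):
    """Return -ZHp for every window, given the Hp values of all windows."""
    mu = fmean(hp_values)
    # Assumption: population standard deviation over all windows.
    sigma = pstdev(hp_values, mu=mu)
    if sigma == 0:
        raise ValueError("Hp is constant across windows; Z score undefined")
    return [-(hp - mu) / sigma for hp in hp_values]


# Toy input: three windows, each with (major counts, minor counts) per SNP.
toy_windows = [([18, 20], [2, 0]), ([10, 12], [10, 8]), ([15, 15], [5, 5])]
toy_hp = [pooled_heterozygosity(maj, mn) for maj, mn in toy_windows]
toy_neg_zhp = negative_zhp(toy_hp)
```

Windows with low Hp (reduced heterozygosity) receive high −ZHp values, which is why the upper tail of the −ZHp distribution is the one inspected for sweep signals.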
The Z transformation allowed us to compare the two breeds on the same scale, since ZHp values indicate the number of standard deviations by which a window's Hp value deviates from the mean (Rubin et al., 2010). XP-CLR statistics (Chen et al., 2010) between Mame Shiba Inu and Shiba Inu were calculated using xpclr in 25-kb non-overlapping windows. Windows containing fewer than 10 SNPs were discarded to prevent spurious signals. Windows that fell within the top 1% of all three distributions (FST, −ZHp and normalized XP-CLR score) were considered candidate selective sweep regions. Genes overlapping these regions were identified using the Ensembl dog genome gene annotations (CanFam3.1).
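The window filtering and the three-way top 1% intersection can be sketched as follows. This is an illustration under stated assumptions, not the pipeline used in the study: the per-window records, their dictionary keys and the function names are hypothetical, the SNP-count filter is assumed to be applied before the thresholds are computed, and ties at a cutoff are kept. Normalization of the XP-CLR scores is assumed to have been done beforehand.

```python
def top_fraction_cutoff(values, fraction=0.01):
    """Score of the k-th largest value, with k = round(fraction * n), at least 1."""
    ranked = sorted(values, reverse=True)
    k = max(1, round(fraction * len(ranked)))
    return ranked[k - 1]


def candidate_windows(windows, min_snps=10, fraction=0.01):
    """Windows in the top `fraction` of FST, -ZHp and XP-CLR simultaneously."""
    # Assumption: windows with too few SNPs are dropped before thresholding.
    kept = [w for w in windows if w["n_snps"] >= min_snps]
    if not kept:
        return []
    stats = ("fst", "neg_zhp", "xpclr")
    cutoffs = {s: top_fraction_cutoff([w[s] for w in kept], fraction) for s in stats}
    return [w for w in kept if all(w[s] >= cutoffs[s] for s in stats)]


# Toy input: 200 windows whose three scores all increase with the window index.
toy = [
    {"id": i, "n_snps": 25, "fst": i / 200, "neg_zhp": i / 50, "xpclr": float(i)}
    for i in range(200)
]
toy_candidates = [w["id"] for w in candidate_windows(toy)]
```

Requiring a window to pass all three cutoffs is more conservative than taking the union of the three outlier sets, at the cost of missing sweeps that only one statistic is sensitive to.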