To divide the genome into haplotype blocks (haploblocks) based on linkage disequilibrium, Haploview tools [16] were applied to the set of 2579 SNPs. Chromosomal regions with strong linkage were identified using default Haploview parameters (confidence interval for LD [0.7, 0.98]). Each haploblock was considered as the set of SNPs located within a given haploblock. We analysed haploblock enrichment for SNPs associated with trait or variable by applying the logic of gene-set enrichment analysis implemented in the FGSEA method [17], which takes as input data the list of all SNPs ranked by increasing GWAS p-values and the list of haploblocks. The method returns an enrichment score and FDR corrected p-value [41] for each haploblock. We performed FGSEA analysis for each trait (phenotype and bioclimatic variable), and haploblocks significantly enriched for associated SNPs were defined as those having positive enrichment scores and significantly low FDR corrected p-values (<0.05). The outcome of this analysis was that each phenotype or bioclimatic variable was characterized by a set of haploblocks significantly enriched with associated SNPs. To obtain groups of phenotypes and variables sharing sets of enriched haploblocks, we applied bi-clustering on the matrix of pairwise similarities between traits. To estimate the degree of overlap between haploblocks enriched for SNPs associated with different traits, we calculated the haploblock simalarity score as a sum of common haploblocks (i.e., haploblocks enriched for SNPs associated with both traits) divided by the sum of all haploblocks significantly enriched for SNPs associated with these two traits.
Do you have any questions about this protocol?
Post your question to gather feedback from the community. We will also invite the authors of this article to respond.