Identification of eQTL hotspots in each of the six eQTL mapping datasets entailed two initial steps. First, we used a chi-square goodness-of-fit test, assessing each dataset separately, to determine whether trans eQTL were overrepresented in any of the 2 cM intervals distributed along each chromosome arm. Second, because the chi-square test only indicates whether any interval on the chromosome arm had an observed number of eQTL that deviated from the expected number, we calculated the percent contribution of each interval to the overall chi-square value. Percent contribution was calculated for each interval as
$$\text{percent contribution}_i = \frac{(O_i - E_i)^2 / E_i}{\chi^2} \times 100,$$
where $O_i$ and $E_i$ are the observed and expected numbers of trans eQTL in interval $i$, and $\chi^2$ is the overall chi-square value for the chromosome arm.
Intervals with a percent contribution value greater than 10% were considered potential hotspots. We chose 10% as a permissive threshold for detecting hotspots based on a visual assessment of percent contribution values as they varied across the chromosome arm. In most cases, our approach highlighted one interval per chromosome arm with enrichment for trans eQTL.
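The screening described above can be sketched in a few lines of Python. This is a minimal illustration rather than the code used in the study. The function names and the example counts are hypothetical. The uniform expected count per interval is an assumption, because the text does not state how expected counts were derived. The requirement that the observed count exceed the expected count is also an assumption, added here because a hotspot implies enrichment rather than depletion.

```python
def percent_contribution(observed, expected):
    """Return (chi_square, percent contribution of each interval).

    Each interval contributes (O - E)^2 / E to the chi-square sum; the
    percent contribution is that term divided by the total, times 100.
    """
    terms = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
    chi_square = sum(terms)
    if chi_square == 0:
        # Observed counts match expectation exactly: nothing to apportion.
        return 0.0, [0.0] * len(terms)
    return chi_square, [100.0 * t / chi_square for t in terms]


def potential_hotspots(observed, expected=None, threshold=10.0):
    """Indices of intervals whose percent contribution exceeds `threshold`."""
    k = len(observed)
    if expected is None:
        # ASSUMPTION: trans eQTL are expected to be spread evenly across
        # the intervals of the chromosome arm.
        expected = [sum(observed) / k] * k
    _, percents = percent_contribution(observed, expected)
    return [
        i
        for i, (o, e, p) in enumerate(zip(observed, expected, percents))
        # ASSUMPTION: only enriched intervals (O > E) count as hotspots.
        if p > threshold and o > e
    ]


# Hypothetical counts of trans eQTL in six 2 cM intervals on one arm.
counts = [2, 3, 1, 30, 2, 2]
chi_square, percents = percent_contribution(counts, [sum(counts) / 6] * 6)
hotspots = potential_hotspots(counts)
```

With these made-up counts, only the fourth interval (index 3) exceeds the 10% threshold. A p-value for the goodness-of-fit test itself would come from the chi-square distribution with one fewer degree of freedom than the number of intervals, which is not computed here.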
We took a three-step approach to further characterize trans eQTL hotspots and to identify candidate regulatory genes potentially responsible for them. Each step was carried out independently for each hotspot. First, we used principal component analysis (PCA) to create a composite variable, the first principal component (PC1), summarizing expression variation for all genes with a trans eQTL in that hotspot. We then mapped QTL for this composite variable in the same manner as described above. If a peak was detected that overlapped the location of the hotspot, the hotspot was retained in subsequent steps.
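As an illustration of how a PC1 composite variable can be derived, the sketch below standardizes each gene and extracts the leading principal component by power iteration, using only the Python standard library. It is a hedged example, not the authors' pipeline: the function names, the choice to standardize genes, and the toy expression matrix are all assumptions. In practice a PCA routine from a statistics package would normally be used instead.

```python
import math


def standardize(values):
    """Center a gene's expression values and scale them to unit variance."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    # Assumes the gene is not constant (sd > 0) across samples.
    return [(v - mean) / sd for v in values]


def first_pc_scores(expression, n_iter=500):
    """Return the PC1 score of each sample.

    `expression` is a list of rows, one per sample (e.g. strain), with one
    column per gene. The sign of PC1 is arbitrary.
    """
    genes = [standardize(col) for col in zip(*expression)]  # gene-major
    rows = list(zip(*genes))                                # sample-major
    p = len(genes)
    # Uneven starting vector, so it is unlikely to be orthogonal to PC1.
    v = [1.0 / (j + 1) for j in range(p)]
    for _ in range(n_iter):
        # One power-iteration step on Z^T Z, where Z is the standardized matrix.
        scores = [sum(r[j] * v[j] for j in range(p)) for r in rows]
        w = [sum(g[i] * scores[i] for i in range(len(rows))) for g in genes]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return [sum(r[j] * v[j] for j in range(p)) for r in rows]


# Hypothetical data: three genes driven by one shared latent factor.
latent = [-2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
expr = [
    [l + 0.1 * ((i * 7) % 3 - 1), 2 * l + 0.2 * ((i * 5) % 3 - 1), -l + 0.1 * ((i * 3) % 2)]
    for i, l in enumerate(latent)
]
pc1 = first_pc_scores(expr)
```

The resulting `pc1` vector plays the role of the composite phenotype that is then passed to QTL mapping in place of a single gene's expression.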
Second, we repeated QTL mapping of the composite variable, but now iteratively tested whether each gene that fell within the hotspot interval had a mediating effect on the composite QTL peak by including the expression level of each gene as a covariate in the QTL analysis. If adding a gene expression covariate eliminated the composite variable QTL (or dramatically reduced the LOD score at the peak), the gene was considered a candidate mediating gene of the trans eQTL hotspot. The degree to which the LOD score was reduced by potential mediating genes varied among hotspots, making it difficult to establish a consistent cutoff value. We therefore assessed sets of mediating genes separately for each hotspot and identified the candidates with the most pronounced mediating effect at the composite peak position. If several genes had similar mediating effects on the composite peak, we included all of them as candidates. This permissive approach produced a list of genes that should be considered in follow-up studies to distinguish true mediating genes from genes whose expression is merely correlated with that of mediating genes. Because the 2 cM intervals used to initially identify hotspots were created arbitrarily, we defined the genomic region associated with each hotspot from the trans eQTL peak intervals, taking the minimum lower boundary and the maximum upper boundary across all trans eQTL in the hotspot. All genes that fell within these expanded regions were included in this second step, except those absent from our gene expression dataset; some genes with very low expression were removed during filtering (see above) and so were not included in this analysis. Covariate data used in each analysis corresponded to the dataset in which the hotspot was identified. For example, gene expression in gut tissue under copper conditions was used as covariate data for a composite variable derived from a hotspot detected in the Gut-Copper dataset.
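The logic of the covariate-based mediation test can be illustrated with a deliberately simplified sketch. It assumes a single numeric genotype at the peak marker rather than founder haplotype probabilities, uses ordinary least squares, and computes the LOD score as (n/2) log10(RSS_null / RSS_full). All names and data below are hypothetical, and this is not the mapping software used in the study.

```python
import math


def residuals(y, x=None):
    """Residuals of y after regressing on an intercept and, optionally, x."""
    n = len(y)
    my = sum(y) / n
    if x is None:
        return [v - my for v in y]
    mx = sum(x) / n
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    return [(b - my) - slope * (a - mx) for a, b in zip(x, y)]


def rss(values):
    """Residual sum of squares."""
    return sum(v * v for v in values)


def lod(phenotype, genotype, covariate=None):
    """LOD score for the genotype effect, optionally adjusting for a covariate.

    The covariate is first regressed out of both phenotype and genotype, so
    the full-model RSS matches a joint fit of covariate and genotype.
    """
    n = len(phenotype)
    y_null = residuals(phenotype, covariate)
    g_adj = residuals(genotype, covariate)
    y_full = residuals(y_null, g_adj)
    return (n / 2) * math.log10(rss(y_null) / rss(y_full))


def lod_drop(phenotype, genotype, candidate_expression):
    """Reduction in LOD at the peak when a candidate gene is a covariate."""
    return lod(phenotype, genotype) - lod(phenotype, genotype, candidate_expression)


# Hypothetical data: the genotype acts on PC1 through the mediator gene.
g = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
e1 = [0.3, -0.2, 0.1, -0.4, 0.2, 0.0, -0.1, 0.4, -0.3, 0.2, -0.2, 0.0]
e2 = [0.05, -0.03, 0.02, 0.04, -0.06, 0.01, -0.02, 0.03, 0.05, -0.04, 0.0, -0.05]
mediator = [a + b for a, b in zip(g, e1)]
pc1 = [2 * m + e for m, e in zip(mediator, e2)]
unrelated = [0.1, 0.9, 0.4, 0.7, 0.2, 0.6, 0.8, 0.3, 0.5, 0.0, 0.95, 0.15]

lod_plain = lod(pc1, g)
lod_mediated = lod(pc1, g, mediator)
lod_unrelated = lod(pc1, g, unrelated)
```

In this toy example the LOD score collapses when the mediator's expression is included as a covariate and stays high when an unrelated gene is used. That contrast is what marks a candidate mediating gene. Ranking genes within a hotspot by `lod_drop` mirrors the per-hotspot assessment described above, without imposing a fixed cutoff.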
Third, we further examined potential mediating genes by testing the correlation between estimated founder haplotype effects at any cis eQTL associated with a potential mediating gene that fell within the hotspot interval. As in the second step, we used the cis eQTL from the dataset in which the trans eQTL hotspot was detected.
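A correlation of founder haplotype effects can be computed as in the sketch below. The text does not name the second set of effects being compared. As a labeled assumption, the example correlates effects at a candidate gene's cis eQTL with effects at the composite-variable (PC1) QTL. The effect values and variable names are invented for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical estimated effects, one value per founder haplotype.
cis_effects = [0.8, -0.3, 0.1, 1.2, -0.9, 0.4, -0.6, 0.2]
# ASSUMPTION: compared against founder effects at the PC1 (composite) QTL.
composite_effects = [1.5, -0.7, 0.3, 2.1, -1.6, 0.9, -1.0, 0.2]

r = pearson(cis_effects, composite_effects)
```

A strong correlation (positive or negative, since the sign of PC1 is arbitrary) would indicate that the same founder haplotypes shift both the candidate gene's expression and the composite variable.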