Both the GWAS (.mlma file) and the cis-eQTL mapping summary data (.txt file) were used to perform the SMR analysis with the SMR 1.03 software [38]. Following Zhu et al. [38], cis-eQTLs with nominal P-values greater than 5 × 10⁻⁸, and/or with allele frequency differences larger than 0.2 between the populations of the GWAS and eQTL summary data, were excluded. SMR requires a threshold for removing SNPs with discrepant allele frequencies between data sets: SNPs whose allele frequency differs between any pair of data sets (here, the cis-eQTL and the GWAS summary data) by more than the specified threshold (default value = 0.2) are excluded, to filter out possible false positives caused by allele frequency differences between the two study populations. MR uses a cis-eQTL as an instrumental variable to estimate and test for the causal effect of an exposure variable (gene expression level) on an outcome (presence of a specific type of PTB-associated lesion). The effect of gene expression on a specific disease outcome is then expressed in terms of the effects of the cis-eQTL on both the disease outcome and gene expression:

$$\hat{b}_{xy} = \frac{\hat{b}_{zy}}{\hat{b}_{zx}}$$

where z is the SNP, x is a gene's expression level, y is the phenotype, $\hat{b}_{zy}$ and $\hat{b}_{zx}$ are the least-squares estimates of y and x on z, respectively, and $\hat{b}_{xy}$ is interpreted as the effect size of x on y free of confounding from non-genetic factors [56]. The statistic that tests for pleiotropic association ($T_{SMR}$) is then calculated as follows:

$$T_{SMR} = \frac{z_{zy}^{2}\, z_{zx}^{2}}{z_{zy}^{2} + z_{zx}^{2}}$$
where $z_{zy}$ and $z_{zx}$ are the z-statistics from the GWAS and eQTL analyses, respectively. P-values were corrected with the Benjamini-Hochberg (BH) method and filtered at FDR ≤ 0.05 using R [53]. Finally, to account for linkage disequilibrium (LD), SNPs in very strong LD with the top associated cis-eQTLs were removed using an r² threshold (default value r² > 0.9).
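The arithmetic behind the steps described above can be illustrated with a minimal Python sketch. It is not the SMR software and not the authors' R code. It assumes the standard SMR formulation of Zhu et al. [38], in which the effect estimate is the ratio of the SNP effects on phenotype and on expression, and the test statistic is compared against a chi-squared distribution with one degree of freedom. All function names, parameter names and input values are hypothetical and chosen for illustration only.

```python
import math


def passes_filters(p_eqtl, freq_gwas, freq_eqtl, p_max=5e-8, max_freq_diff=0.2):
    """Keep a cis-eQTL only if its nominal P-value is at most p_max and the
    allele frequency difference between data sets is at most max_freq_diff."""
    return p_eqtl <= p_max and abs(freq_gwas - freq_eqtl) <= max_freq_diff


def smr_estimate(b_zy, b_zx):
    """Ratio estimate of the effect of expression (x) on phenotype (y)."""
    return b_zy / b_zx


def t_smr(z_zy, z_zx):
    """Approximate SMR test statistic from the GWAS and eQTL z-statistics."""
    return (z_zy ** 2) * (z_zx ** 2) / (z_zy ** 2 + z_zx ** 2)


def p_from_chi2_1df(t):
    """Upper-tail P-value of a chi-squared statistic with 1 degree of freedom.
    For 1 df this equals erfc(sqrt(t / 2))."""
    return math.erfc(math.sqrt(t / 2.0))


def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted P-values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    # Made-up summary statistics for three genes (illustration only):
    # (z_zy from the GWAS, z_zx from the eQTL analysis)
    z_pairs = [(4.1, 7.5), (1.2, 6.3), (3.4, 9.0)]
    pvals = [p_from_chi2_1df(t_smr(z_zy, z_zx)) for z_zy, z_zx in z_pairs]
    for p, q in zip(pvals, bh_adjust(pvals)):
        print(f"P = {p:.3g}, BH-adjusted = {q:.3g}, kept = {q <= 0.05}")
```

The BH helper follows the usual step-up procedure, the same adjustment that R's `p.adjust(p, method = "BH")` performs. It is written out here only so that the sketch needs nothing beyond the Python standard library.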