A rare variant burden analysis (RVBA) was performed on the log-transformed cytokine levels by using the Sequence Kernel Association Test (SKAT) [14, 30] in R version 3.5.2. The SKAT is a kernel-based test method that aggregates weighted individual variant-score test statistics while allowing variant-variant interactions and is extremely powerful when a genetic region has both protective and deleterious variants or many non-causal variants [14, 30, 31]. The SKAT was performed over three levels of grouping: (I) gene level, where all variants in a gene region are combined into a set (Fig. 1e.I), (II) subpathway level, where all variants in genes that belong to the corresponding subpathway are combined into a set (Fig. 1e.II), and (III) inflammatory level, where based on gene-encoded protein function genes are classified with either a pro- or anti-inflammatory phenotype and all variants from genes in either groups are combined into a set (Fig. 1e.III). All variant sets were pruned for linkage disequilibrium (LD) based on within-cohort metrics and the commonly used R2 cut-off of > 0.8, using the snpStats package in R [32]. For each region, we used the SKAT_CommonRare function with default weights to determine the effect of only common (I.SKAToC) and combined common and rare variants (II.SKATjoint), and the SKAT-O algorithm with default weights (III.SKATO) to determine the effect of only rare variants, where common and rare variant classification was based on a cohort MAF of 5% (Fig. 1f,g). The SKAT-O algorithm uses a linear combination of the SKAT and Burden Test, making it slightly more powerful than the “normal” SKAT when rare variants in a set are truly causal or influencing the phenotype in the same direction [31]. SKATO accompanying rho-values can be used to assess the contribution of SKAT versus Burden Test for significant sets, reflecting the proportion of bi- and unidirectionality of an association. In the case of rare and joint tests, output based on > 1 variant was considered, and in the case of joint tests, the presence of both rare and common variants in the set was an additional requirement. P values were Bonferroni-adjusted for each previously defined test separately, based on the number of groups tested within one level of grouping for each cytokine. For data wrangling and visualizations, we used a variety of R packages, e.g., dplyr, reshape2, ggplot2, scales, ggpubr, ggrepel, hash, ggpmisc, and devtools, all of which are freely available online [33, 34].
Do you have any questions about this protocol?
Post your question to gather feedback from the community. We will also invite the authors of this article to respond.