For all DNA methylation analyses, we used the matrix of M-values (the logit transformation of beta-values), which represent methylation levels. Surrogate variable analysis (SVA) was performed to add surrogate variables and account for potential batch effects. A linear model was fitted for the binary variable of interest, with age and history of TBI included as covariates. The comparative analysis was performed in limma (33), implemented in R, and yielded t-statistics and associated p-values for each CpG site. The point-wise p-values were then used to identify differentially methylated regions (DMRs) with the combined-p-value tool (34); all DMRs reported by the combined-p-value tool are significant. For the baseline low vs. high cumulative blast symptom DNA methylation analyses, we used the same matrix of M-values as above; for each symptom, a vector of symptom scores at day 1 was added to the design matrix as a new variable for linear regression in limma, and point-wise p-values and significant CpG sites were obtained in the same manner. For the pre-post blast symptom analysis, we used the matrix of change in M-values, as in the pre-post methylation analyses, and dichotomized the symptom variable: for each selected/filtered symptom, we compared the pre- and post-blast symptom scores; if the score increased, the variable was set to one, otherwise it was set to zero.
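As an illustration of two of the transformations described above, the following minimal sketch converts beta-values to M-values and dichotomizes a symptom score. It assumes the conventional base-2 logit definition of the M-value, M = log2(beta / (1 - beta)); the text only states "logit transformation", so the base is an assumption. The function names, the clamping constant `eps`, and the example values are hypothetical and are not taken from the authors' R code.

```python
import math


def beta_to_m(beta, eps=1e-6):
    """Convert a beta-value in [0, 1] to an M-value (base-2 logit).

    The clamp to [eps, 1 - eps] is an assumption added here to avoid
    taking the log of zero; it is not described in the text.
    """
    b = min(max(beta, eps), 1.0 - eps)
    return math.log2(b / (1.0 - b))


def dichotomize_symptom(pre_score, post_score):
    """Return 1 if the symptom score increased from pre to post, else 0."""
    return 1 if post_score > pre_score else 0


# Hypothetical example values, for illustration only.
betas = [0.5, 0.8, 0.2]
m_values = [beta_to_m(b) for b in betas]  # roughly 0, 2 and -2

# One (pre, post) pair per subject for a single symptom.
symptom_pairs = [(3, 5), (3, 3), (4, 2)]
increased = [dichotomize_symptom(pre, post) for pre, post in symptom_pairs]
```

Under this rule, an unchanged or decreased score maps to zero, which matches the "otherwise" branch described in the text.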
Furthermore, because the cellular heterogeneity of blood specimens can affect DNA methylation patterns, we examined whether variability in cell proportions might confound the relationship between methylation and our factors of interest. We used Horvath's DNA methylation age calculator (35), a tool that estimates DNA methylation age and cell proportions for samples on the Illumina Infinium platform (36), to estimate the proportions of six cell types (CD4 T-cells, CD8 T-cells, natural killer cells, B-cells, monocytes, and granulocytes), and compared each cell type's proportions between groups in both studies (low vs. high cumulative blast exposure and acute pre- vs. post-blast exposure). None of the cell types showed differences in proportion between groups in the low-high study, whereas CD4 T-cells, natural killer cells, and B-cells showed significant differences between the pre- and post-exposure samples; their proportion differences (post- minus pre-) were therefore added as covariates in subsequent DNA methylation analyses. Specifically, for the pre- and post-blast exposure analysis, a matrix of difference M-values (post- minus pre-blast) was used in the linear model, together with the additional covariates from the cell proportion estimates. Lastly, for replication of findings across independent datasets, DMRs were compared: a finding was considered replicated if the direction of gain or loss of DNA methylation within the differentially methylated region(s) and associated CpG sites was consistent across both datasets.
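The difference matrix and the replication criterion described above can be sketched as follows. This is an illustrative outline only: the function names, the CpG identifiers, and the effect values are hypothetical, and the rule that every shared CpG site must agree in direction is one possible reading of "consistent across both datasets", not a confirmed detail of the authors' procedure.

```python
def delta_m(pre, post):
    """Per-CpG change in M-value for one subject: post minus pre."""
    return [b - a for a, b in zip(pre, post)]


def sign(x):
    """Return 1 for gain, -1 for loss, 0 for no change."""
    return (x > 0) - (x < 0)


def is_replicated(effects_a, effects_b):
    """Check direction consistency for one DMR across two datasets.

    effects_a and effects_b map CpG identifiers to effect estimates
    (positive = gain of methylation, negative = loss). Assumption:
    replication requires every shared CpG to have the same non-zero
    direction in both datasets.
    """
    shared = effects_a.keys() & effects_b.keys()
    if not shared:
        return False
    return all(
        sign(effects_a[cpg]) != 0 and sign(effects_a[cpg]) == sign(effects_b[cpg])
        for cpg in shared
    )


# Hypothetical M-values for one subject at two CpG sites.
change = delta_m(pre=[1.0, -0.5], post=[1.5, -1.0])

# Hypothetical effect estimates for one DMR in two datasets.
dataset_a = {"cpg_1": 0.40, "cpg_2": 0.25}
dataset_b = {"cpg_1": 0.10, "cpg_2": 0.30}
replicated = is_replicated(dataset_a, dataset_b)
```

A less strict variant could compare only the sign of a region-level summary, such as the mean effect across CpG sites; the text does not specify which variant was used.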