Statistical Analyses

Anne E. Sanders, Joel D. Greenspan, Roger B. Fillingim, Nuvan Rathnayaka, Richard Ohrbach, Gary D. Slade

Raw values of each health measure were used to generate descriptive statistics for cases and controls of each COPC and according to the number of COPCs. All other analyses of continuous variables used z-transformed values of health measures, and the data were weighted during analysis. The goal of data transformation was to produce measures of association (eg, odds ratios [ORs], regression estimates) that could be readily compared between health measures that use different scales of measurement. The goal of weighting was to adjust for the way in which study participants were selected in OPPERA-2, taking into consideration the original sampling design of the OPPERA-1 case-control study (in which TMD cases were oversampled relative to their prevalence in the population) and adjusting for differential loss to follow-up of subjects between enrollment in OPPERA-1 and participation in OPPERA-2. Such weighting is important for this analysis in order to make valid estimates of association between any two variables (eg, health measures and headache) in a sample that was originally stratified according to a third variable (ie, presence or absence of chronic TMD in OPPERA-1).34 The analytic weights for OPPERA-2 were computed as the inverse of the sampling probability for OPPERA-1, multiplied by the inverse of the probability of retention (ie, of not being lost to follow-up) between OPPERA-1 and OPPERA-2. With the exception of univariate statistics describing the distribution of explanatory variables, all means, percentages, and measures of association were calculated using generalized estimating equations with the GENMOD procedure in SAS version 9.4 (SAS Institute), with analytic weights and robust error variance calculation.35
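The two preprocessing steps described above can be sketched in a few lines. This is a minimal, illustrative Python sketch, not the authors' SAS code: function names are invented, and the weight formula interprets the "loss to follow-up" factor as the probability of being retained at follow-up.

```python
import math

def z_transform(values):
    """Standardize raw health-measure values to mean 0, SD 1 (sample SD)."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]

def analytic_weight(p_sampled_oppera1, p_retained_oppera2):
    """Analytic weight: inverse OPPERA-1 sampling probability times
    inverse probability of retention between OPPERA-1 and OPPERA-2."""
    return (1.0 / p_sampled_oppera1) * (1.0 / p_retained_oppera2)

def weighted_mean(values, weights):
    """Weighted mean, the building block of the weighted estimates."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)
```

For example, a participant sampled with probability 0.5 and retained with probability 0.8 would receive a weight of 2.5, so that they stand in for 2.5 members of the source population.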

The analysis first assessed the associations between each health measure and the presence or absence of each COPC using statistical methods for case-control analysis of cross-sectional data. Univariate ORs of associations between each COPC and each health measure were estimated in separate binary logistic regression models: the dependent variable was presence vs absence of the COPC, and the main explanatory variable was the standardized (z-score transformed) value of a single health measure. The models also adjusted for study site (four categories) and subjects’ demographic characteristics: age (in years), gender (two categories), and race/ethnicity (five categories: white, black/African American, Asian, Hispanic, or other).
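Because the health measures are z-scored, each logistic regression coefficient β corresponds to the log-odds change per 1-SD increase in the measure, so OR = exp(β) with a Wald confidence interval exp(β ± 1.96·SE). A small sketch of that conversion, with hypothetical coefficient values (not results from the study):

```python
import math

def odds_ratio_per_sd(beta, se, z_crit=1.96):
    """Convert a logistic coefficient for a z-scored predictor into an
    odds ratio per 1-SD increase, with a 95% Wald confidence interval."""
    return (math.exp(beta),
            math.exp(beta - z_crit * se),
            math.exp(beta + z_crit * se))

# Hypothetical fitted coefficient: beta = 0.405, SE = 0.10
or_est, or_lo, or_hi = odds_ratio_per_sd(0.405, 0.10)
```

An OR of about 1.5 here would mean a 50% increase in the odds of the COPC per standard deviation of the health measure, directly comparable across measures with different native scales.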

A second set of analyses examined the associations between the number of COPCs and each health measure. Separate linear regression models, one for each health measure, used the health measure z-score as the dependent variable. The number of COPCs was modeled in three ways to test for different effects of the combined number of COPCs: (1) as a categorical variable, to evaluate potential nonlinear relationships with the dependent variable, with pairwise comparisons used to test for differences between subjects with no COPCs (reference group) and each of the other five possibilities (1, 2, 3, 4, or 5 COPCs); (2) as a continuous variable, to reveal a potential linear relationship with the dependent variable, with a test of the null hypothesis of no linear relationship (β = 0); and (3) with all five COPCs modeled as binary predictor variables in a multivariable model, with parameter estimates tested for the independent contribution of each COPC to the health status measure.
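The three codings of the COPC count correspond to three different design-matrix rows for the same participant. A minimal sketch, assuming five COPCs with placeholder labels (the labels are illustrative, not the study's variable names):

```python
# Placeholder COPC labels for illustration only
COPC_NAMES = ("copc_a", "copc_b", "copc_c", "copc_d", "copc_e")

def coding_categorical(n_copcs, max_copcs=5):
    """Coding 1: dummy indicators for 1..5 COPCs; 0 COPCs is the
    reference group and contributes an all-zero row."""
    return [1 if n_copcs == k else 0 for k in range(1, max_copcs + 1)]

def coding_continuous(n_copcs):
    """Coding 2: a single linear term, the count itself."""
    return [n_copcs]

def coding_all_binary(has_copc):
    """Coding 3: five separate 0/1 indicators, one per COPC, entered
    together so each parameter reflects that COPC's independent
    contribution."""
    return [int(has_copc.get(name, False)) for name in COPC_NAMES]
```

Coding 1 lets each count have its own mean shift (capturing nonlinearity), coding 2 forces a single slope per additional COPC, and coding 3 asks whether specific conditions, rather than the sheer count, drive the association.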

Multivariable contributions of all health measures to single COPCs were investigated using random forest models, one for each COPC. As described in detail in a previous OPPERA paper,36 random forest methodology uses all potential explanatory variables (in this paper, all health status measures) to create decision trees predicting the dependent variable (each pain condition). The goals were to identify individual measures germane to TMD that make the greatest statistical contributions to the occurrence of each pain condition and to quantify the collective accuracy of all explanatory variables in predicting each pain condition. Random forests are nonparametric statistical models that can handle interactions and nonlinear associations without the need to pre-specify the interactions or the form of the nonlinear relationships; owing to this flexibility, they demonstrate excellent classification performance across a broad range of tasks. Five steps are used to create each model36: (1) a random sample of study participants is selected with replacement; (2) a random sample of predictor variables is selected and used to partition the data and create a decision tree; (3 and 4) steps 1 and 2 are repeated 1,000 times each; and (5) the estimated probability of the dependent variable is calculated as the average of the probabilities from all 1,000 trees.
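The five steps above can be sketched with a deliberately simplified forest: one-split "stump" trees stand in for full decision trees, and one random predictor per tree stands in for the random predictor subset. This is an illustration of the bagging-plus-random-subspace idea, not the study's implementation (which used full trees36).

```python
import random

def fit_stump(X, y, feature):
    """One-split 'tree': threshold at the feature's mean and predict the
    observed case proportion on each side of the split."""
    vals = [row[feature] for row in X]
    thr = sum(vals) / len(vals)
    left = [yi for row, yi in zip(X, y) if row[feature] <= thr]
    right = [yi for row, yi in zip(X, y) if row[feature] > thr]
    p_left = sum(left) / len(left) if left else 0.5
    p_right = sum(right) / len(right) if right else 0.5
    return lambda row: p_left if row[feature] <= thr else p_right

def forest_probability(X, y, new_row, n_trees=1000, seed=0):
    """Average the per-tree probabilities over n_trees bootstrap trees."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    probs = []
    for _ in range(n_trees):
        # Step 1: bootstrap sample of participants, drawn WITH replacement
        idx = [rng.randrange(n) for _ in range(n)]
        Xb, yb = [X[i] for i in idx], [y[i] for i in idx]
        # Step 2: a randomly chosen predictor builds this tree
        tree = fit_stump(Xb, yb, rng.randrange(p))
        probs.append(tree(new_row))
    # Steps 3-4 are the loop above; step 5 is the average
    return sum(probs) / len(probs)
```

On a toy dataset where a single predictor separates cases from controls, the averaged probability is high for a case-like profile and low for a control-like one, while no single tree had to specify the form of the relationship in advance.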

Missing values of explanatory variables were imputed using on-the-fly imputation, which is the decision tree analog of multiple imputation.37 Through the combination of the bootstrap aggregating and random subspace methods used in their construction, random forests achieve this classification performance without overfitting to the training dataset, thus maintaining good out-of-sample performance.38 Contributions of individual variables in the random forest models were quantified using variable importance scores, which estimate the relative contribution of each predictor to the model’s classification of true positives and true negatives. Overall classification performance of the models was quantified with the area under the receiver operating characteristic curve (AUROC) and the area under the precision-recall curve (AUPR). In datasets with unequal numbers of cases and controls, AUPR is a better measure of classification performance than AUROC, though no single metric can adequately capture classification performance.39 However, both measures accord equal weight to false positives and false negatives, whereas the relative importance of those errors may vary according to COPC. Therefore, the Brier score,40 an analog of mean squared error for binary prediction models that also yields a proportion of variance explained, was computed as well. In addition, mutual information was computed because it provides sensible rankings of classifiers in scenarios (such as class imbalance) that undermine more commonly used measures such as precision, recall, and AUROC.41
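Two of the metrics above have simple closed forms worth making explicit: AUROC equals the probability that a randomly chosen case receives a higher predicted probability than a randomly chosen control (the Mann-Whitney rank formulation), and the Brier score is the mean squared error between predicted probabilities and 0/1 outcomes. A stdlib sketch of both (AUPR and mutual information are omitted for brevity):

```python
def auroc(y_true, y_prob):
    """Rank-based AUROC: the probability that a random case is scored
    above a random control, counting ties as one half."""
    pos = [p for p, y in zip(y_prob, y_true) if y == 1]
    neg = [p for p, y in zip(y_prob, y_true) if y == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0
               for a in pos for b in neg)
    return wins / (len(pos) * len(neg))

def brier_score(y_true, y_prob):
    """Mean squared error between predicted probabilities and the
    binary outcomes; 0 is perfect, lower is better."""
    return sum((p - y) ** 2 for p, y in zip(y_prob, y_true)) / len(y_true)
```

For instance, predictions [0.1, 0.4, 0.35, 0.8] for outcomes [0, 0, 1, 1] yield an AUROC of 0.75: of the four case-control pairs, three are ranked correctly.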
