Analytic Strategy

Ronald Chambers
Jordan Greenbaum
Jennifer Cox
Terri Galvan

Data analysis was conducted in 3 stages using the statistical software R [38]. First, we compared descriptive statistics (eg, means, proportions) between participants who completed the CASH program checklist and those who did not, and we tested the statistical significance of the observed differences. Fisher’s exact test was used to test differences in proportions because the Pearson chi-square test of independence has been shown to be less accurate in small samples such as this one, where the expected number of cases in a given cell may be less than 5 [39]. For numerical variables, the Shapiro-Wilk test was used to assess the normality assumption, and the results suggested that the assumption was violated for most measures. Therefore, the Wilcoxon rank-sum test was used in place of the standard 2-sample t-test because it does not rely on the normality assumption [40]. Because the Wilcoxon test compares the ranks of the 2 groups (commonly interpreted as a difference in medians) rather than their means, supplemental t-tests on the means were also conducted, and the results were consistent (analysis not shown).
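As an illustration of the small-sample reasoning above, the following is a minimal sketch of a two-sided Fisher’s exact test for a 2×2 table, together with the expected-cell-count check that motivates it. It is written in standard-library Python rather than R and is not the authors' code; the function names and the example counts are hypothetical.

```python
from math import comb


def min_expected_count(a, b, c, d):
    """Smallest expected cell count under independence for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    rows = (a + b, c + d)
    cols = (a + c, b + d)
    return min(row * col / n for row in rows for col in cols)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell with margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Sum the probabilities of every table that is no more likely than the observed one.
    total = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-7)
    )
    return min(1.0, total)


# Hypothetical counts (not study data): rows are MSH / non-MSH,
# columns are checklist completed / not completed.
table = (3, 1, 1, 3)
print("smallest expected count:", min_expected_count(*table))
print("two-sided p-value:", round(fisher_exact_two_sided(*table), 4))
```

In this hypothetical table every expected count is below 5, which is the situation in which the chi-square approximation becomes unreliable and the exact test is preferred.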

Second, we ran a series of logistic and linear regressions to test the association between MSH care and CASH program outcomes. Specifically, we used logistic regression to examine the relationship between being an MSH patient and completing the program checklist (coded “1” if the checklist was completed and “0” if the participant stopped participating for 6 weeks or more and/or returned to a trafficking situation). The initial, or base, model included only MSH status; additional models with select control variables were then constructed to address potential concerns about omitted variable bias. Given the relatively small sample size and the associated limits on statistical power, the potentially confounding variables were not all included in a single model. Instead, they were categorized into 3 thematic groups, and each group was included in a separate model. The 3 thematic groups were based on (1) healthcare status and needs, (2) measures of vulnerability and risk, and (3) demographic characteristics. Table 4 outlines the variables included in each. To further test the association between MSH care and program outcomes, we then examined how the duration and degree of MSH care were related to completion of the checklist. As with the base model, we explored whether these associations held while controlling for key healthcare, vulnerability/risk, and demographic factors.
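For the base model, which contains only a binary MSH indicator, the maximum likelihood estimates have a closed form: the intercept is the log odds of completion among non-MSH participants, and the MSH coefficient is the log odds ratio from the 2×2 table. The sketch below illustrates this in standard-library Python; it is not the authors' R code, the function name and counts are hypothetical, and the models with control variables would require an iterative fitting routine instead.

```python
from math import exp, log, sqrt


def base_logit_from_counts(a, b, c, d):
    """Closed-form logistic regression of completion on a binary MSH indicator.

    a: MSH, completed        b: MSH, not completed
    c: non-MSH, completed    d: non-MSH, not completed
    Returns (intercept, msh_coefficient, wald_standard_error).
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for finite estimates")
    intercept = log(c / d)                    # log odds of completion, non-MSH
    coefficient = log((a * d) / (b * c))      # log odds ratio, MSH vs non-MSH
    std_error = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return intercept, coefficient, std_error


# Hypothetical counts (not study data).
intercept, coef, se = base_logit_from_counts(12, 4, 6, 12)
odds_ratio = exp(coef)
ci_low, ci_high = exp(coef - 1.96 * se), exp(coef + 1.96 * se)
print(f"odds ratio = {odds_ratio:.2f}, 95% Wald CI = ({ci_low:.2f}, {ci_high:.2f})")
```

Exponentiating the coefficient gives the odds ratio, which is how the association between MSH status and checklist completion would typically be reported.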

Table 4. Control Variables.

Because MSH patients participated in the CASH program longer, on average, than other participants (MSH ≈ 9 months, non-MSH ≈ 4 months; P < .001), we examined whether program length was a mediating factor in the relationship between MSH care and checklist completion. We then ran a series of linear regression models to further investigate the relationship between MSH care and the length of program participation. These included models with each of the 3 thematic groups of control variables, as well as an interaction model examining whether the association varied by a participant’s housing situation.

Third, and finally, survival analysis models were used to examine the time to program drop-out (program incompletion), indicated by a participant’s lack of interaction with the program for a period of at least 6 weeks and/or their known return to trafficking. Using Kaplan-Meier plots, we estimated the probability of survival over time (that is, the probability of not dropping out, or persisting in the program), and we compared the Kaplan-Meier curves for MSH patients and other program participants. Log-rank tests were used to test the statistical significance of differences between MSH patients and other program participants. We also ran Cox proportional hazards models to investigate whether the observed patterns remained when control variables were included and to estimate the hazard ratio for incompletion between MSH patients and other program participants. Hazard ratios are commonly interpreted as relative risks, but relative risk reflects only whether the event had occurred by the end of the study, whereas the hazard ratio also takes the timing of each event into account. Thus, the 2 are not entirely interchangeable.
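To make the Kaplan-Meier step concrete, the sketch below computes the product-limit estimate of the probability of remaining in the program at each observed drop-out time. It is a standard-library Python illustration rather than the authors' R code. The function name and the data are hypothetical, and it assumes that participants who had not dropped out when last observed are treated as censored.

```python
def kaplan_meier(durations, events):
    """Kaplan-Meier survival estimates.

    durations: time in the program for each participant
    events:    1 if the participant dropped out at that time, 0 if censored
    Returns a list of (time, survival_probability) at each drop-out time.
    """
    data = sorted(zip(durations, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        dropouts = 0
        leaving = 0
        # Group everyone whose observed time equals t.
        while i < len(data) and data[i][0] == t:
            dropouts += data[i][1]
            leaving += 1
            i += 1
        if dropouts:
            # Multiply by the conditional probability of persisting past t.
            survival *= 1 - dropouts / at_risk
            curve.append((t, survival))
        # Drop-outs and censored participants both leave the risk set after t.
        at_risk -= leaving
    return curve


# Hypothetical durations in weeks (not study data); 0 marks a censored participant.
weeks = [2, 3, 3, 5, 8]
dropped_out = [1, 1, 0, 1, 0]
for t, s in kaplan_meier(weeks, dropped_out):
    print(f"week {t}: estimated probability of persisting = {s:.2f}")
```

Because each step conditions on who is still at risk at that time, the estimate uses the timing of every drop-out, which is the same information that separates a hazard ratio from a simple end-of-study relative risk.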
