Statistical Analysis

Nicole Kravitz-Wirtz

This method is extracted from research article: J Health Soc Behav, Oct 2016

Cumulative Effects of Growing Up in Separate and Unequal Neighborhoods on Racial Disparities in Self-rated Health in Early Adulthood

DOI: 10.1177/0022146516671568

Initial analyses investigated the basic associations among race, neighborhood disadvantage, and self-rated health using conventional logistic regression models. The final analysis specified a marginal structural logistic regression model in which the parameters were estimated using Inverse Probability of Treatment (IPT) weights. This approach was chosen because conventional regression models fail to properly account for time-varying covariates that may simultaneously confound the effects of future exposures and mediate the effects of past exposures on future outcomes (Robins, Hernan, and Brumback 2000).

For instance, a family’s income directly influences both the type of neighborhood they can afford to live in and the health of its members. If a researcher does not control for family income when modeling the effects of neighborhood characteristics on health, she may overstate (or spuriously induce) the neighborhood–health relationship. Conversely, if she controls for family income, she removes from the final estimate the indirect effects of residential conditions that operate on health through family income (Kain 2004).

In longitudinal studies, this dilemma is compounded across time: family income, measured at any one wave, is a function of past income and past neighborhood conditions, as well as a determinant of future income, future neighborhood conditions, and future health. Controlling for family income and the other time-varying covariates all 18 times they are measured in this study would not only be cumbersome, but would also remove the various indirect pathways through which neighborhood effects can operate. Not controlling for such factors, however, could introduce neighborhood selection bias.

Marginal structural models using the IPT estimators were a means of incorporating the indirect effects of neighborhood disadvantage on health, while still adjusting for possible confounding (i.e., differential selection into and out of certain neighborhoods) by time-varying individual- and household-level covariates. This method proceeded in two steps. First, prior-year exposure to neighborhood disadvantage, time-invariant covariates, and both prior-year and concurrent time-varying covariates were used to predict respondents’ probability of exposure to each quintile of neighborhood disadvantage in each year. Respondents were then assigned a series of treatment weights based on the inverse of the predicted probability corresponding to the quintile of neighborhood disadvantage in which respondents were in fact observed (as opposed to the other four levels of disadvantage) in each year.
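The weight assignment in this first step can be sketched as follows. This is a minimal illustration rather than the study's code: it assumes the predicted probabilities for the five quintiles have already been produced by some exposure model, and the function name `ipt_weight` and all numbers are hypothetical.

```python
# Hypothetical input: predicted probabilities of living in each of the five
# neighborhood-disadvantage quintiles for one respondent in one year.
# In practice these would come from a fitted exposure model; here they are made up.

def ipt_weight(predicted_probs, observed_quintile):
    """Return the inverse of the predicted probability of the observed quintile.

    predicted_probs: dict mapping quintile (1..5) -> predicted probability.
    observed_quintile: quintile in which the respondent was actually observed.
    """
    p = predicted_probs[observed_quintile]
    if not 0.0 < p <= 1.0:
        raise ValueError("predicted probability must lie in (0, 1]")
    return 1.0 / p


# One respondent-year: the model assigns probability 0.25 to quintile 4,
# which is where the respondent was in fact observed.
probs = {1: 0.05, 2: 0.10, 3: 0.20, 4: 0.25, 5: 0.40}
weight = ipt_weight(probs, observed_quintile=4)
print(weight)  # 4.0
```

A respondent observed in a quintile the model considered unlikely receives a large weight, which is why very small predicted probabilities can make unstabilized weights unstable.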

To obtain a more precise neighborhood effect estimate in the final marginal structural model, these weights were stabilized by multiplying each one by the same predicted probability as above, except time-varying covariates were excluded from the initial prediction model and the predicted probability was not inverted. The resulting series of stabilized treatment weights for each respondent at each age were then multiplied together to produce a single, summary weight reflecting the probability of exposure to that respondent’s actual sequence of neighborhood disadvantage quintiles throughout childhood and adolescence.
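As a rough sketch of the stabilization and multiplication just described, the summary weight for one respondent is the product over years of a numerator probability (from the model without time-varying covariates) divided by a denominator probability (from the model with them). The function name `stabilized_weight` and the probabilities below are illustrative assumptions, not values from the study.

```python
from math import prod


def stabilized_weight(numerator_probs, denominator_probs):
    """Multiply year-specific stabilized weights into one summary weight.

    numerator_probs[t]:   predicted probability of the observed quintile in
                          year t from the model WITHOUT time-varying covariates.
    denominator_probs[t]: predicted probability of the observed quintile in
                          year t from the model WITH time-varying covariates.
    """
    if len(numerator_probs) != len(denominator_probs):
        raise ValueError("need one numerator and one denominator per year")
    return prod(n / d for n, d in zip(numerator_probs, denominator_probs))


# Three hypothetical years of follow-up for one respondent.
num = [0.30, 0.40, 0.50]
den = [0.25, 0.50, 0.40]
sw = stabilized_weight(num, den)  # (1.2) * (0.8) * (1.25), about 1.2
```

Because the numerator and denominator tend to move together, the ratio stays near one unless the time-varying covariates strongly change the predicted exposure.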

In the second step, a marginal structural logistic regression model estimating the effects of cumulative exposure to neighborhood disadvantage throughout childhood and adolescence on the probability of self-rated fair/poor health in early adulthood could be fit to a weighted pseudo-population generated using the stabilized treatment weights just described. Controlling for time-varying covariates was no longer necessary since their confounding effects on neighborhood selection had been accounted for through the weighting process. However, because time-invariant covariates were included in the regression models used to calculate both the numerator and denominator of the stabilized weights, they still needed to be controlled for in the final model. Huber-White robust standard errors were used to account for the clustering of respondents within PSID households (Robins, Rotnitzky, and Scharfstein 2000; Wodtke et al. 2011).
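To make the second step concrete, the following toy sketch fits a weighted logistic regression with a cluster-robust (sandwich) variance using only the Python standard library. It is a simplification under stated assumptions: a single binary exposure instead of cumulative quintile exposure, no time-invariant covariates, made-up data, and no small-sample correction to the standard errors. The source does not name the software used; the function names are hypothetical.

```python
from math import exp, log, sqrt


def _pieces(b0, b1, x, y, w):
    """Per-observation weighted scores and the summed information matrix."""
    scores, h00, h01, h11 = [], 0.0, 0.0, 0.0
    for xi, yi, wi in zip(x, y, w):
        p = 1.0 / (1.0 + exp(-(b0 + b1 * xi)))
        r = wi * (yi - p)          # weighted residual
        v = wi * p * (1.0 - p)     # weighted variance term
        scores.append((r, r * xi))
        h00 += v
        h01 += v * xi
        h11 += v * xi * xi
    return scores, (h00, h01, h11)


def fit_weighted_logit(x, y, w, cluster, iters=50):
    """Weighted logit of y on an intercept and x via Newton-Raphson.

    Returns ((b0, b1), (se0, se1)) with cluster-robust standard errors.
    """
    b0 = b1 = 0.0
    for _ in range(iters):
        scores, (h00, h01, h11) = _pieces(b0, b1, x, y, w)
        g0 = sum(s[0] for s in scores)
        g1 = sum(s[1] for s in scores)
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det   # Newton step using the 2x2 inverse
        b1 += (h00 * g1 - h01 * g0) / det
    scores, (h00, h01, h11) = _pieces(b0, b1, x, y, w)
    det = h00 * h11 - h01 * h01
    inv = ((h11 / det, -h01 / det), (-h01 / det, h00 / det))  # "bread"
    totals = {}                                               # scores summed by cluster
    for (s0, s1), c in zip(scores, cluster):
        t = totals.setdefault(c, [0.0, 0.0])
        t[0] += s0
        t[1] += s1
    m00 = sum(t[0] * t[0] for t in totals.values())           # "meat"
    m01 = sum(t[0] * t[1] for t in totals.values())
    m11 = sum(t[1] * t[1] for t in totals.values())
    ses = tuple(sqrt(a * a * m00 + 2 * a * b * m01 + b * b * m11) for a, b in inv)
    return (b0, b1), ses


# Made-up data: x = exposure indicator, y = fair/poor health, w = stabilized
# weight, cluster = household identifier.
x = [0, 0, 0, 0, 1, 1, 1, 1]
y = [0, 0, 0, 1, 0, 1, 1, 1]
w = [1.0, 0.8, 1.2, 1.0, 0.9, 1.1, 1.0, 1.0]
cluster = [1, 1, 2, 2, 3, 3, 4, 4]
(b0, b1), (se0, se1) = fit_weighted_logit(x, y, w, cluster)
```

With a single binary exposure the fitted coefficient equals the log odds ratio in the weighted pseudo-population, which offers a simple check on the sketch. A real analysis would normally rely on an established statistical package rather than hand-written estimation code.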

To be unbiased and consistent, parameters estimated using IPT weights required several key assumptions, including no unmeasured confounding, no model misspecification, and positivity (Cole and Hernan 2008; Robins, Hernan, et al. 2000). Although this study adjusted for a wide array of the most common predictors of both neighborhood selection and self-rated health, there was still the possibility that factors not measured in the PSID could upwardly bias the neighborhood effect estimate. Incorrectly specified models for selection into neighborhoods with varying levels of disadvantage could also bias the neighborhood effect estimate. However, other scholars utilizing the PSID and similar covariates have demonstrated that such estimates are relatively robust to a variety of model specifications (Wodtke 2013).

Finally, and perhaps most challenging in light of persistent racial-spatial inequalities, is the assumption of positivity, meaning that both nonwhites and whites must have a non-zero probability of exposure to every quintile of neighborhood disadvantage across all levels and combinations of measured covariates. Violations of the positivity assumption can result in weights that are unstable and sensitive to the presence of rare combinations of covariates. However, the IPT weights for this study displayed means reasonably close to one and small standard deviations, suggesting that the presence of rare combinations of covariates did not exert undue influence on the results. Although weight truncation at the 1st and 99th or 5th and 95th percentiles could be justified on the basis of better behaved weights (mean = 1, small range), substantive results based on these alternate specifications remained unchanged (see Appendix C in the online supplemental material, available at http://jhsb.sagepub.com/supplemental; Cole and Hernan 2008).
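The weight diagnostics and percentile truncation mentioned above might look like the sketch below. The weights are invented for illustration, `truncate_weights` is a hypothetical helper, and truncation is taken here to mean resetting values beyond the chosen percentiles to the percentile values rather than dropping observations.

```python
from statistics import mean, quantiles, stdev


def truncate_weights(weights, lower_pct=1, upper_pct=99):
    """Reset weights below/above the given percentiles to those percentiles."""
    # quantiles(..., n=100) returns 99 cut points; cuts[k - 1] is the k-th percentile.
    cuts = quantiles(weights, n=100, method="inclusive")
    lo, hi = cuts[lower_pct - 1], cuts[upper_pct - 1]
    return [min(max(wt, lo), hi) for wt in weights]


# Made-up stabilized weights with one extreme value.
weights = [0.6, 0.8, 0.9, 1.0, 1.0, 1.1, 1.2, 1.3, 1.5, 6.0]
trimmed = truncate_weights(weights, lower_pct=5, upper_pct=95)

# Diagnostics of the kind described in the text: mean near one, small spread.
print(round(mean(weights), 4), round(stdev(weights), 4))
print(round(mean(trimmed), 4), round(stdev(trimmed), 4))
```

Comparing the substantive estimates obtained with the original and truncated weights is one way to check whether a few extreme weights are driving the results.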
