Statistical analysis

Chunchun Yuan, Jing Wang, Weiqiang Zhang, Honggang Yi, Bing Shu, Chenguang Li, Qianqian Liang, De Liang, Bolai Chen, Xingwen Xie, Xinchao Lin, Xu Wei, Hui Wang, Peizhan Chen, Chen Huang, Hao Xu, Yueli Sun, Yongjian Zhao, Qi Shi, Dezhi Tang, Yongjun Wang

We used a cross-sectional study design to explore the population-level association between the explanatory covariates and BMD outcomes. Continuous variables were summarized as mean (standard deviation) or median (interquartile range), depending on the normality of their distribution. Categorical variables were reported as counts and percentages. The Box-Cox transformation was used to reduce the skewness of the outcomes. Univariate analysis examined differences across the BMI or genetic groups using the χ² test for categorical variables and one-way ANOVA or the Kruskal-Wallis test for continuous variables. Spearman’s rank correlation was used to assess the correlations among serum and BMD measures.
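As an illustration of two of the steps above, the following minimal sketch implements a Box-Cox transformation with a fixed parameter and Spearman's rank correlation using only the Python standard library. It is not the authors' code (the analyses were run in R and Stata); the function names are hypothetical, and `lam` is supplied by hand here, whereas in practice it would typically be estimated from the data.

```python
import math


def box_cox(values, lam):
    """Box-Cox transform of strictly positive values for a fixed lambda (assumed given)."""
    if any(v <= 0 for v in values):
        raise ValueError("Box-Cox requires strictly positive values")
    if lam == 0:
        return [math.log(v) for v in values]
    return [(v ** lam - 1.0) / lam for v in values]


def average_ranks(values):
    """Ranks starting at 1; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

Because Spearman's coefficient depends only on ranks, it is unchanged by monotone transformations such as Box-Cox, which is one reason it suits skewed serum and BMD measures.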

Generalized varying coefficient models (GVCMs) with penalized cubic regression splines were employed to characterize the nonlinear relationships between the outcomes and the explanatory covariates, as well as the interactions among explanatory covariates, to account for the complexity of these relationships. In a GVCM, the relationship between each covariate and the outcome is allowed to vary as a potentially nonlinear, smooth function of an effect-modifying covariate, and this function is estimated from the data. A forward stepwise method was used to select the variables in the final model for multivariate analysis. Bayesian 95% confidence intervals (CIs) were estimated to present the association of each covariate with the outcome.
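For orientation, a generic varying coefficient model of this kind can be written as follows. The notation is an assumption made for illustration and is not taken from the source: $g$ is a link function, $x_{ij}$ are explanatory covariates, $z_i$ is the effect-modifying covariate, $f_j$ are smooth coefficient functions, $c_{ik}$ are confounders entering as fixed effects, and $\lambda_j$ are smoothing parameters.

```latex
g\bigl(\mathbb{E}[Y_i]\bigr)
  = \beta_0 + \sum_{j=1}^{p} f_j(z_i)\, x_{ij} + \sum_{k=1}^{q} \gamma_k\, c_{ik},
\qquad
(\hat\beta_0, \hat f, \hat\gamma)
  = \arg\max \Bigl\{ \ell(\beta_0, f, \gamma)
      - \tfrac{1}{2} \sum_{j=1}^{p} \lambda_j \int f_j''(z)^2 \, dz \Bigr\}
```

Under this sketch, each $f_j$ is represented by a cubic regression spline basis, and the integrated squared second derivative penalizes wiggliness, so a large $\lambda_j$ shrinks $f_j$ toward a straight line in $z$.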

All results were adjusted for the identified potential confounders, which were included as fixed effects in the GVCMs: age (in years, continuous), sex (men or women), educational status, smoking status, drinking status, dietary types, family income levels, seasons, regions, and mean annual hours of sunshine (in hours, continuous). Missing data patterns were visualized, and missing values were imputed using the multivariate imputation by chained equations (MICE) package in R, with the number of imputations set to 5. The GVCM was fitted to each imputed dataset, and the results were combined across the 5 datasets using Rubin’s method, which computes imputation-adjusted variances and effect estimates.
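The pooling step can be sketched as below for a single coefficient. This is a minimal illustration of Rubin's rules under the assumption that each imputed dataset yields one point estimate and its squared standard error; the function name `pool_rubin` is hypothetical, and the authors' own pooling was done in R.

```python
import math


def pool_rubin(estimates, variances):
    """Pool one coefficient across m imputed datasets using Rubin's rules.

    estimates: point estimates, one per imputed dataset
    variances: squared standard errors, one per imputed dataset
    Returns (pooled estimate, total variance, pooled standard error).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need matching estimates and variances from at least 2 imputations")
    q_bar = sum(estimates) / m                                   # pooled estimate
    within = sum(variances) / m                                  # within-imputation variance
    between = sum((q - q_bar) ** 2 for q in estimates) / (m - 1)  # between-imputation variance
    total = within + (1.0 + 1.0 / m) * between                   # imputation-adjusted variance
    return q_bar, total, math.sqrt(total)
```

With 5 imputations, as used here, the between-imputation variance is inflated by a factor of 1 + 1/5 before being added to the average within-imputation variance.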

All p values presented were two-sided, with statistical significance determined by a family-wise type I error rate of less than 0.05. Bonferroni’s method was used for multiple comparisons following multiple imputation. All analyses were performed using R software (R Foundation for Statistical Computing, version 4.0.5) and Stata 15.0 software (StataCorp, College Station, TX). Additional details regarding the statistical analysis are provided in the Methods section of the Supplementary Materials.
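The Bonferroni adjustment can be sketched as follows. This is a minimal illustration, assuming the input is a list of pooled two-sided p values belonging to one family of comparisons; the function name and the example values in the usage note are hypothetical.

```python
def bonferroni(p_values, alpha=0.05):
    """Bonferroni adjustment for one family of comparisons.

    Each p value is multiplied by the number of comparisons and capped at 1.
    A comparison is flagged as significant when its adjusted p value is
    below alpha, which keeps the family-wise type I error rate under alpha.
    """
    m = len(p_values)
    adjusted = [min(1.0, p * m) for p in p_values]
    significant = [p_adj < alpha for p_adj in adjusted]
    return adjusted, significant
```

For example, with three comparisons and raw p values of 0.01, 0.02, and 0.5, only the first remains significant at the 0.05 level after adjustment.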
