Statistical analyses

Melker S. Johansson
Andreas Holtermann
Jacob L. Marott
Eva Prescott
Peter Schnohr
Mette Korshøj
Karen Søgaard

We described the characteristics of the study population using frequencies with percentages or medians with the first and third quartiles (Q1-Q3). Medians were used because some of the continuous variables had skewed distributions.

We compared the characteristics of the study participants who did not fulfil the inclusion criteria with those of the participants who did, using the Mann-Whitney U test and Pearson’s Chi-squared test (p-values <0.05 were considered to indicate differences between groups), and by assessing 95% confidence intervals (CIs) of proportions and medians. The CIs were calculated with Wilson’s score method [37] for proportions and with the normal approximation method for medians.
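As an illustration of the Wilson score interval for a proportion, the following is a minimal Python sketch. It is not the authors' R code; the function name `wilson_ci` and the example counts are hypothetical, and the default `z` is the standard normal quantile for a two-sided 95% interval.

```python
import math


def wilson_ci(successes, n, z=1.959963984540054):
    """Wilson score interval for a binomial proportion.

    successes: number of participants with the characteristic
    n:         group size
    z:         standard normal quantile (default gives a 95% interval)
    """
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width


# Hypothetical example: 50 of 100 participants have the characteristic.
lower, upper = wilson_ci(50, 100)
print(f"{lower:.3f} to {upper:.3f}")
```

Unlike the simple normal approximation for proportions, this interval stays within 0 and 1 even when the observed proportion is close to either bound.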

The sample space of compositional data (i.e., the simplex) has a geometry that is incompatible with standard statistical methods. To make these methods applicable, we transformed the physical activity composition with the isometric log-ratio (ilr) transformation [25,38]. This resulted in a set of ilr-coordinates that represent the physical activity composition in a sample space (i.e., the real coordinate space) that allows the use of standard statistical methods [26]. Specifically, we constructed pivot ilr-coordinates, in which the first coordinate (ilr1) represents the first part of the composition relative to the geometric mean of the remaining parts [38].
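To make the construction of pivot ilr-coordinates concrete, the sketch below computes them for a single composition in plain Python. It is an illustration only, not the implementation in the R packages used in the study; the function name `pivot_ilr` is hypothetical and the example durations are made up. For a composition with D parts, coordinate j is the scaled log-ratio of part j to the geometric mean of the parts that follow it.

```python
import math


def pivot_ilr(parts):
    """Pivot ilr-coordinates of one composition.

    parts: strictly positive durations (or proportions), ordered so that
           the part of interest comes first.
    Returns D - 1 coordinates for a D-part composition.
    """
    if len(parts) < 2 or any(p <= 0 for p in parts):
        raise ValueError("need at least two strictly positive parts")
    logs = [math.log(p) for p in parts]
    coords = []
    for j in range(len(parts) - 1):
        rest = logs[j + 1:]          # logs of the remaining parts
        r = len(rest)
        # log(part j / geometric mean of remaining parts), scaled
        coords.append(math.sqrt(r / (r + 1)) * (logs[j] - sum(rest) / r))
    return coords


# Made-up composition in minutes: sedentary, standing, walking, HIPA
print(pivot_ilr([315, 100, 60, 5]))
```

Because only ratios between parts enter the calculation, the coordinates are the same whether the composition is given in minutes or as proportions of the total.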

We investigated how the physical activity composition (expressed as ilr-coordinates) was associated with each outcome using linear regression models, in both crude and adjusted analyses. The modelling process was conducted in three steps:

Firstly, we fitted multiple linear regression models with SBP, WC, and LDL-C as outcomes and the ilr-coordinates representing the physical activity composition as covariates; potential confounders were added as covariates in the adjusted analyses only. Observations with missing values in the covariates (n = 69) were not included in the adjusted models. The model assumptions were checked by plotting standardised residuals against a) continuous covariates (i.e., assumption of linearity) and b) fitted values (i.e., assumption of homogeneous variance), and by quantile-quantile (Q-Q) plots of the residuals (i.e., assumption of normally distributed residuals).

Secondly, because the model estimates of the ilr-coordinates are not directly interpretable (due to the ilr-transformation), we theoretically reallocated time between sedentary behaviour and 1) walking, and 2) HIPA to quantify the measure of association in an understandable way [26]. Specifically, for work and leisure, respectively, we reallocated the geometric mean composition (i.e., reference composition) according to time reallocation 1 and 2. That is, the reallocations were made pairwise (a.k.a. one-to-one reallocations) during work and during leisure, respectively; all remaining physical activity types were kept constant. For example, if 10 minutes were reallocated from sedentary behaviour to walking in a theoretical composition consisting of 315 minutes sedentary behaviour, 100 minutes standing, 60 minutes walking and 5 minutes HIPA during work, it would result in 305 minutes sedentary behaviour and 70 minutes walking during work, with the duration of the remaining physical activity types, in both domains, kept constant.
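The one-to-one reallocation in the example above can be written out as a short Python sketch. The function name `reallocate` and the dictionary keys are hypothetical; the durations are those of the theoretical work composition in the text.

```python
def reallocate(composition, source, target, minutes):
    """Move `minutes` from one part to another, keeping all other parts fixed."""
    if minutes > composition[source]:
        raise ValueError("cannot reallocate more time than the source part holds")
    new = dict(composition)          # leave the reference composition unchanged
    new[source] -= minutes
    new[target] += minutes
    return new


# Theoretical work composition from the text (minutes)
work = {"sedentary": 315, "standing": 100, "walking": 60, "hipa": 5}

# Reallocate 10 minutes from sedentary behaviour to walking
reallocated = reallocate(work, "sedentary", "walking", 10)
print(reallocated)
```

The total duration is unchanged by construction, so the reallocated composition differs from the reference only in the two parts involved.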

Because the geometric means of walking and HIPA were lower during work than during leisure, we could not reallocate the same absolute duration of time during work and leisure. For time reallocation 1, we therefore reallocated 10 to 30 minutes between sedentary behaviour and walking during work, and 10 to 50 minutes during leisure, in 10-minute portions. For time reallocation 2, we reallocated 1 to 2 minutes between sedentary behaviour and HIPA during work, and 1 to 10 minutes during leisure, in 1- and 2-minute portions.

Thirdly, we used the fitted values from the linear regression models to estimate each outcome given the reference and reallocated compositions. Then, we calculated the difference in outcome by subtracting the estimated outcome of the reference composition from the estimated outcome of each reallocated composition [26,27].
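The three steps can be sketched end to end in Python with NumPy. This is a simplified illustration under loud assumptions, not the authors' analysis: the data are synthetic, only a single 4-part composition summing to 480 minutes is used (no separate work and leisure domains), no confounders are included, and the coefficients used to simulate the outcome are arbitrary. All names (`pivot_ilr`, `predict`, `true_beta`) are hypothetical.

```python
import numpy as np


def pivot_ilr(x):
    """Pivot ilr-coordinates for one composition or a 2-D array of compositions."""
    logs = np.log(np.asarray(x, dtype=float))
    d = logs.shape[-1]
    coords = []
    for j in range(d - 1):
        r = d - j - 1                                   # number of remaining parts
        rest_mean = logs[..., j + 1:].mean(axis=-1)     # log geometric mean of the rest
        coords.append(np.sqrt(r / (r + 1)) * (logs[..., j] - rest_mean))
    return np.stack(coords, axis=-1)


rng = np.random.default_rng(0)

# Synthetic compositions in minutes: sedentary, standing, walking, HIPA
minutes = rng.dirichlet([30, 10, 6, 1], size=200) * 480

# Step 1: fit a linear model of a synthetic outcome on the ilr-coordinates
Z = pivot_ilr(minutes)
X = np.column_stack([np.ones(len(Z)), Z])
true_beta = np.array([130.0, 2.0, -1.0, 0.5])           # arbitrary, for simulation only
y = X @ true_beta + rng.normal(0, 5, size=len(Z))
beta, *_ = np.linalg.lstsq(X, y, rcond=None)


def predict(composition):
    """Estimated outcome for one composition under the fitted model."""
    return float(np.concatenate([[1.0], pivot_ilr(composition)]) @ beta)


# Step 2: reference = geometric mean composition, closed to 480 minutes;
# then reallocate 10 minutes from sedentary behaviour to walking
geo_mean = np.exp(np.log(minutes).mean(axis=0))
reference = geo_mean / geo_mean.sum() * 480
reallocated = reference.copy()
reallocated[0] -= 10
reallocated[2] += 10

# Step 3: difference in estimated outcome, reallocated minus reference
difference = predict(reallocated) - predict(reference)
print(round(difference, 3))
```

The intercept cancels in the subtraction, so the estimated difference depends only on the ilr-coefficients and on how the reallocation shifts the ilr-coordinates. With confounders held at fixed values in both predictions, their terms would cancel in the same way.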

To investigate the influence of excluding individuals taking antihypertensives, diuretics, or cholesterol-lowering drugs, we conducted sensitivity analyses that included 1) all study participants regardless of medication use, and 2) only participants using these medications.

We used RStudio (version 1.3.1093) [39] running R (version 4.0.3) [40] for all analyses, and, specifically, the packages compositions [41] and robCompositions [42] for the analyses involving compositional data analysis (CoDA).
