Descriptive statistics were calculated for baseline characteristics. Continuous variables were summarised as median (interquartile range, IQR) and categorical variables as frequency (percentage). The chi-squared test or Fisher's exact test was used to test for differences in proportions between groups. The Wilcoxon rank-sum test was used to test for differences in continuous variables between groups.
We performed multivariable logistic regression to investigate associations of demographic characteristics, comorbidities (limited to those reported in ≥5% of participants), presence of pneumonia and severity of COVID-19 during acute infection with persistent symptom categories present at the time of the follow-up interview. The final analysis included all participants for whom the variables of interest were available; missing data were not imputed, so differing denominators indicate missing data. Odds ratios were calculated together with 95% confidence intervals.
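The complete-case logistic regression described above can be sketched as follows. This is a hedged, standard-library Python illustration, not the authors' R code: it fits the model by Newton-Raphson and reports Wald 95% confidence intervals, which is an assumption about how the intervals were obtained. All function names and the example variable name `pneumonia` are hypothetical.

```python
import math


def complete_cases(records, fields):
    """Keep only records (dicts) with no missing (None) value in `fields`; nothing is imputed."""
    return [r for r in records if all(r.get(f) is not None for f in fields)]


def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [x - f * y for x, y in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def _score_info(rows, y, beta):
    """Gradient of the log-likelihood and the Fisher information matrix at `beta`."""
    p = len(beta)
    grad = [0.0] * p
    info = [[0.0] * p for _ in range(p)]
    for r, yi in zip(rows, y):
        mu = 1.0 / (1.0 + math.exp(-sum(b * x for b, x in zip(beta, r))))
        for j in range(p):
            grad[j] += (yi - mu) * r[j]
            for k in range(p):
                info[j][k] += mu * (1.0 - mu) * r[j] * r[k]
    return grad, info


def fit_logistic(X, y, max_iter=25, tol=1e-10):
    """Fit by Newton-Raphson. X holds predictor rows; an intercept column is added here.

    Returns (coefficients, standard errors), intercept first.
    """
    rows = [[1.0] + [float(v) for v in r] for r in X]
    beta = [0.0] * len(rows[0])
    for _ in range(max_iter):
        grad, info = _score_info(rows, y, beta)
        step = solve(info, grad)
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < tol:
            break
    # Standard errors are the square roots of the diagonal of the inverse information.
    _, info = _score_info(rows, y, beta)
    p = len(beta)
    se = [
        math.sqrt(solve(info, [float(i == j) for i in range(p)])[j])
        for j in range(p)
    ]
    return beta, se


def odds_ratio_table(names, beta, se, z=1.959964):
    """Return (name, odds ratio, lower, upper) per predictor, skipping the intercept."""
    return [
        (name, math.exp(b), math.exp(b - z * s), math.exp(b + z * s))
        for name, b, s in zip(names, beta[1:], se[1:])
    ]
```

With a single binary predictor the fitted odds ratio reduces to the cross-product ratio of the 2x2 table, which gives a simple check of the fit. The sketch does not handle separated data or collinear predictors.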
UpSet plots were used to present the coexistence of persistent symptom categories. Two-sided p-values were reported for all statistical tests; a p-value <0.05 was considered to be statistically significant. Statistical analysis was performed using R version 3.5.1 (https://cran.r-project.org). Packages used included dplyr, lubridate, ggplot2, plotrix and UpSetR.
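An UpSet plot displays how many participants fall into each exact combination of categories. The counting step behind such a plot can be sketched as below, in standard-library Python rather than with UpSetR. The function name `intersection_sizes` and the category labels in the example are hypothetical and are not taken from the study data.

```python
from collections import Counter


def intersection_sizes(participants):
    """Count participants per exact combination of symptom categories.

    `participants` is an iterable of sets, one per participant, holding the
    categories that participant reported. Participants with no persistent
    symptom (empty set) are skipped. Combinations are returned largest
    first, with ties ordered alphabetically.
    """
    counts = Counter(frozenset(cats) for cats in participants if cats)
    return sorted(counts.items(), key=lambda kv: (-kv[1], sorted(kv[0])))


# Hypothetical example data: category names are placeholders.
example = [
    {"fatigue"},
    {"fatigue", "respiratory"},
    {"fatigue", "respiratory"},
    set(),
    {"neurological"},
]
for combination, size in intersection_sizes(example):
    print(" & ".join(sorted(combination)), size)
```

Each combination is exclusive: a participant reporting two categories is counted once under that pair and not under either single category, which matches how the bars of an UpSet plot are usually read.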