Up to three observations were constructed for each patient depending on their length of stay, averaging the available predictor values in the 24 h preceding 1 day, 7 days, and 14 days of IMV. This process is illustrated in Additional file 1: Fig. S2. ICU mortality was modeled as a classification problem with a decision tree, logistic regression and XGBoost algorithm to investigate the performance of both simple and complex linear and non-linear models. Ventilator and ICU-free days were treated as regression problems with a Lasso and Ridge linear model, as well as an XGBoost regressor. For every outcome and every algorithm, a single model was fit on data points from all days.
Overall model performance was evaluated using the area under the receiver operating characteristic (AUROC), average precision, calibration loss, and Brier score. A nested cross-validation was performed for hyperparameter optimization and to assess performance on the whole dataset. This approach first splits the data into five outer holdout sets with 20% of the data each. For each holdout set, the remaining 80% of the data were used to fit and optimize a model via fivefold cross-validation and a randomized search over a predefined range of hyperparameter values. Observations belonging to the same patient were always kept in the same set to avoid leakage of information. A graphical representation of the process is shown in Additional file 1: Fig. S3.
For each outer holdout set, data imputation, standardization and automated feature selection were performed independently on each train set and then applied to the test set. Missing data were imputed using median imputation for simplicity and predictors were standardized to have a mean of 0 and a standard deviation of 1. A Lasso regression was used for automatic feature selection [17], and its L1 regularization term was optimized together with the classifiers’ hyperparameters. The best-performing estimator from each inner cross-validation was then used to predict the performance on the corresponding holdout test set. The overall performance resulted from the average performance of all outer folds.
Do you have any questions about this protocol?
Post your question to gather feedback from the community. We will also invite the authors of this article to respond.