Scalar value imputation

Marcus A. Badgeley
John R. Zech
Luke Oakden-Rayner
Benjamin S. Glicksberg
Manway Liu
William Gale
Michael V. McConnell
Bethany Percha
Thomas M. Snyder
Joel T. Dudley

To train multimodal models and perform radiograph case–control matching, we needed to handle missing data in numerous PT and HP variable fields. For categorical variables, we replaced missing entries with an explicit “(Missing)” value. BMI was the only PT variable with missing data. To impute it, we trained linear regression models on the subset of records with a recorded BMI, using each combination of predictor sets, optionally including imputed HP variables (see Supplementary Table 20). We then used the model trained on all predictor sets plus the imputed HP variables to impute every missing BMI entry. For all other continuous variables, we performed median imputation.
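
The three imputation rules described above can be illustrated with a minimal Python sketch. It is not the authors' code: the protocol does not state which software was used, and the function names, the example columns (`age`, `wait_hours`, `sex`, `bmi`), and the toy values are all hypothetical. It assumes NumPy is available, that missing categorical entries are stored as `None`, and that missing continuous entries are stored as NaN.

```python
import numpy as np

MISSING_LABEL = "(Missing)"  # explicit category used for missing categorical entries


def impute_categorical(values):
    """Replace None entries with the explicit "(Missing)" category."""
    return [MISSING_LABEL if v is None else v for v in values]


def impute_median(values):
    """Replace NaN entries with the median of the observed entries."""
    x = np.asarray(values, dtype=float)
    out = x.copy()
    out[np.isnan(x)] = np.nanmedian(x)
    return out


def impute_by_regression(target, predictors):
    """Fit ordinary least squares on rows where the target is observed,
    then fill the missing target entries with the model's predictions."""
    y = np.asarray(target, dtype=float)
    # Prepend an intercept column; predictors may be 1-D or (n_rows, n_features).
    X = np.column_stack([np.ones(len(y)), np.asarray(predictors, dtype=float)])
    observed = ~np.isnan(y)
    coef, *_ = np.linalg.lstsq(X[observed], y[observed], rcond=None)
    out = y.copy()
    out[~observed] = X[~observed] @ coef
    return out


# Toy data (hypothetical): one categorical field, one continuous HP-style
# field with gaps, and BMI with one missing entry.
sex = ["F", None, "M", "F", "M"]
age = [50.0, 60.0, 70.0, 80.0, 65.0]
wait_hours = [1.0, np.nan, 3.0, 10.0, np.nan]
bmi = [24.0, 26.0, 28.0, 30.0, np.nan]

sex_filled = impute_categorical(sex)
wait_filled = impute_median(wait_hours)  # median-impute the other continuous field first
# Use the already-imputed field as one of the BMI predictors.
bmi_filled = impute_by_regression(bmi, np.column_stack([age, wait_filled]))
```

The order matters in this sketch: the other continuous field is median-imputed before the regression so that every row has a complete predictor vector when BMI is predicted. Comparing models fitted on different predictor combinations, as the protocol describes, would require an additional evaluation step that is not shown here.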
