Structured Data Preprocessing

Sidney Le, Angier Allen, Jacob Calvert, Paul M. Palevsky, Gregory Braden, Sharad Patel, Emily Pellegrini, Abigail Green-Saxena, Jana Hoffman, Ritankar Das

Structured data were binned by the hour, with multiple intrahour measurements of the same variable replaced by their average. Missing measurements were handled separately for the training and testing sets using last-observation-carried-forward (LOCF) imputation. Any values still missing after LOCF were filled in using the median of that measurement in the training data. Quantitative data and document vectors were then standardized using statistics computed on the training data, such that each feature had a mean of 0 and a variance of 1 in the training set.
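
The steps above can be sketched in Python with pandas. This is a minimal illustration under stated assumptions, not the authors' implementation: the column names `patient_id` and `time`, the helper names `bin_hourly`, `locf`, and `preprocess`, and the choice to carry observations forward within each patient are all hypothetical, since the protocol does not specify them. The sketch also assumes the training median is computed after LOCF and covers only the quantitative measurements, not the document vectors.

```python
import pandas as pd


def bin_hourly(df, id_col="patient_id", time_col="time"):
    """Average all measurements that fall in the same hour (per patient)."""
    binned = df.copy()
    binned["hour"] = binned[time_col].dt.floor("h")
    # mean() skips NaN, so an hour is NaN only if every reading in it is missing.
    return (
        binned.drop(columns=[time_col])
        .groupby([id_col, "hour"])
        .mean()
        .sort_index()
    )


def locf(hourly, id_col="patient_id"):
    """Carry the last observation forward within each patient (assumption)."""
    return hourly.groupby(level=id_col).ffill()


def preprocess(train_raw, test_raw):
    # Binning and LOCF are applied to each set independently.
    train = locf(bin_hourly(train_raw))
    test = locf(bin_hourly(test_raw))

    # Remaining gaps are filled with medians taken from the training set only.
    medians = train.median()
    train = train.fillna(medians)
    test = test.fillna(medians)

    # Standardize both sets with training-set statistics.
    mean = train.mean()
    std = train.std(ddof=0).replace(0, 1.0)  # guard against constant features
    return (train - mean) / std, (test - mean) / std
```

Fitting the medians, means, and standard deviations on the training set alone keeps information from the testing set out of the preprocessing. One consequence is that the test features will generally not have a mean of exactly 0 or a variance of exactly 1.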
