Structured data were binned by the hour, with multiple intrahour measurements of the same variable replaced by their average. Missing measurements were handled separately for the training and testing sets using last-observation-carried-forward (LOCF) imputation. Any remaining missing values were filled in using the median of that measurement in the training data. Quantitative data and document vectors were then standardized using the mean and variance computed on the training data, such that each feature had a mean of 0 and a variance of 1 in the training set.
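The preprocessing steps above can be illustrated with a minimal pandas sketch. It is not the authors' code. The function names, the column arguments (`id_col`, `time_col`, `feature_cols`), and the choice to carry observations forward within each patient or stay are assumptions made for illustration. Document vectors are assumed to be ordinary numeric columns listed in `feature_cols`.

```python
import pandas as pd


def bin_hourly(df, id_col, time_col):
    """Average all measurements of each variable that fall in the same hour."""
    out = df.copy()
    out[time_col] = out[time_col].dt.floor("h")  # truncate timestamps to the hour
    # mean() skips NaN, so an hour with one observed value keeps that value
    return out.groupby([id_col, time_col], as_index=False).mean()


def locf(df, id_col, feature_cols):
    """Last observation carried forward, within each id (assumed grouping)."""
    out = df.copy()
    out[feature_cols] = out.groupby(id_col)[feature_cols].ffill()
    return out


def preprocess(train, test, id_col, time_col, feature_cols):
    # Binning and LOCF are applied to each set separately
    train = locf(bin_hourly(train, id_col, time_col), id_col, feature_cols)
    test = locf(bin_hourly(test, id_col, time_col), id_col, feature_cols)

    # Remaining gaps are filled with medians taken from the training set only
    medians = train[feature_cols].median()
    train[feature_cols] = train[feature_cols].fillna(medians)
    test[feature_cols] = test[feature_cols].fillna(medians)

    # Standardize with training-set statistics (population variance, ddof=0)
    mean = train[feature_cols].mean()
    std = train[feature_cols].std(ddof=0).replace(0, 1.0)  # guard constant features
    train[feature_cols] = (train[feature_cols] - mean) / std
    test[feature_cols] = (test[feature_cols] - mean) / std
    return train, test
```

Computing the medians, means, and standard deviations on the training set alone keeps information from the testing set out of the fitted preprocessing. As a consequence, the testing set will generally not have exactly zero mean and unit variance after scaling.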