Data validation and quality control were integrated throughout the project. The internal validity of the data was safeguarded in three ways: by incorporating data that had been validated by the clinical team during routine care, by comparing calculated clinical scores against the manually recorded benchmark scores from national registry data, and by performing data verification checks with the original hospital. In addition, several checkpoints ensured accurate processing of the data throughout the ETL and data processing pipeline. First, patient tables, headers, and column data were checked for completeness in the ETL pipeline. Second, parameter mappings were created by one intensive care clinician and checked by another, so each mapping was effectively performed independently by two clinicians. Next, value distribution plots were generated continuously as part of the processing pipeline; these plots show the distribution of every parameter from all hospitals mapped to a given concept, making aberrant mappings easy to identify. For all concepts, medically impossible cutoff values were defined by the clinical domain experts. Finally, demographics and any inconsistencies in the distributions or mappings were validated with the original hospital.
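The cutoff check described above can be sketched as a simple plausibility filter. This is a minimal illustration only: the concept names, cutoff values, and function names below are hypothetical and are not the limits defined by the study's clinical domain experts.

```python
# Illustrative plausibility check: flag values outside clinically
# impossible cutoffs defined per concept. Cutoffs here are examples,
# not the study's actual expert-defined limits.

IMPOSSIBLE_CUTOFFS = {
    # concept: (lowest plausible value, highest plausible value)
    "heart_rate_bpm": (0, 350),
    "temperature_celsius": (20, 45),
}

def flag_implausible(concept, values):
    """Return the values falling outside the plausible range for a concept."""
    lo, hi = IMPOSSIBLE_CUTOFFS[concept]
    return [v for v in values if not (lo <= v <= hi)]

# Flagged values would be reviewed with the original hospital.
flagged = flag_implausible("heart_rate_bpm", [72, 110, 999, -5])
print(flagged)  # → [999, -5]
```

In practice such a filter would run after the distribution plots are generated, so that systematically implausible values point to an aberrant mapping rather than isolated data-entry errors.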