The MDP

Jing Li, Gang Yu, Wen Ding, Jian Huang, Zheming Li, Zhu Zhu, Dejian Wang, Jie Zhang, Jing Wang, Jianwei Yin

As a one-stop platform for clinical researchers, the MDP provides compliant, multidimensional, and high-quality medical data, together with effective, convenient, and visualization-oriented research tools. The MDP comprises three parts: the data acquisition layer, the middle platform, and the application layer.

The data acquisition layer acquires the medical data required by research projects. The acquired data span several categories, such as clinical data from health information technology systems (e.g., electronic medical records, hospital information systems, laboratory information systems, and picture archiving and communication systems), omics data from researchers (e.g., genomics, metabonomics, proteomics, immunomics, and ultrasomics), and data from other sources (e.g., biobanks, wearables, electronic health records, epidemiology, climate, and environment).

Different techniques [e.g., database batch push, application programming interface (API) transmission, and uploads of files and tables] can be adopted for data acquisition according to the actual conditions. The gathered data are stored as “raw data” in the data acquisition layer. Raw data are then processed by a privacy protection module, which removes unnecessary patient-identifying information from each data entry and encrypts what remains.
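The article does not specify how the privacy protection module is implemented; the following is a minimal sketch of the general idea under stated assumptions. The field names (`name`, `phone`, `patient_id`, etc.) are hypothetical, and salted hashing stands in for the unspecified encryption step so that the example needs only the standard library; a production system would instead use keyed, reversible encryption under proper key management.

```python
import hashlib

# Hypothetical field policy; a real deployment would derive this from a
# governance policy rather than hard-coding it.
DROP_FIELDS = {"name", "phone", "address"}   # direct identifiers, deleted outright
ENCRYPT_FIELDS = {"patient_id"}              # kept, but pseudonymized

def desensitize(record: dict, salt: str = "demo-salt") -> dict:
    """Remove unnecessary private fields and pseudonymize the remaining sensitive keys."""
    clean = {}
    for key, value in record.items():
        if key in DROP_FIELDS:
            continue  # delete unnecessary private information
        if key in ENCRYPT_FIELDS:
            digest = hashlib.sha256((salt + str(value)).encode()).hexdigest()
            clean[key] = digest[:16]  # irreversible pseudonym (stand-in for encryption)
        else:
            clean[key] = value  # clinical payload is kept as-is
    return clean

raw = {"name": "Alice", "patient_id": "P001", "phone": "555-0100",
       "diagnosis": "J45.901", "lab_glucose": 5.4}
print(desensitize(raw))
```

Because the pseudonym is derived deterministically from the original identifier, records belonging to the same patient remain linkable across datasets without exposing the identifier itself.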

After this preprocessing, the data are classified as desensitized data and stored in duplicate in the database. The system monitors each copy of the desensitized data; if one copy is lost or corrupted, it can be recovered from the other.
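A minimal sketch of this duplication-and-recovery idea, assuming a simple two-copy in-memory store with content checksums; the `DuplicatedStore` class and its layout are illustrative, not the MDP's actual storage design:

```python
import copy
import hashlib

def checksum(record: dict) -> str:
    """Content hash used to detect loss or corruption in a stored copy."""
    return hashlib.sha256(repr(sorted(record.items())).encode()).hexdigest()

class DuplicatedStore:
    """Keep two copies of each desensitized record; repair one from the other."""
    def __init__(self):
        self.primary, self.replica, self.digests = {}, {}, {}

    def put(self, key: str, record: dict) -> None:
        self.primary[key] = copy.deepcopy(record)
        self.replica[key] = copy.deepcopy(record)
        self.digests[key] = checksum(record)  # reference digest taken at write time

    def get(self, key: str) -> dict:
        expected = self.digests[key]
        if key in self.primary and checksum(self.primary[key]) == expected:
            return self.primary[key]
        # Primary copy lost or corrupted: recover it from the replica.
        self.primary[key] = copy.deepcopy(self.replica[key])
        return self.primary[key]
```

In a production database this role is typically played by replication plus integrity checks; the sketch only shows why two supervised copies suffice to recover from the loss of either one.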

As the core layer of the MDP, the middle platform manages data quality, the research data repository, and system configuration.

The quality of the data in the data acquisition layer is not sufficient for clinical research in terms of completeness, accuracy, and consistency, so a closed-loop mechanism is designed for data quality improvement. The data quality management module can discover and resolve most data quality problems. The workflow of data quality improvement is shown in Figure 2.

Workflow of data quality management. AI, artificial intelligence.

Problem discovery, problem locating, problem solving, and solution verification together constitute the process group of data quality management. Most of these processes are triggered automatically by the system according to established criteria; when a data quality problem is discovered by a researcher, the process can also be initiated manually. Once a process is launched, the metadata management module locates the root cause of the problem and visualizes it on lineage diagrams. Depending on the scenario, problems are addressed by the system or by platform administrators, and the solutions must be verified by system administrators or researchers. All processes are recorded in the system log and presented in problem reports.
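The discover–locate–solve–verify loop described above can be sketched as follows. The rule set, record fields, and `fix` callback are hypothetical placeholders for the system's established criteria and for the system/administrator remediation step; they are not taken from the MDP itself.

```python
# Hypothetical quality rules: each maps a quality dimension to a check that
# returns a problem description, or None if the record passes.
RULES = {
    "completeness": lambda r: "missing diagnosis" if not r.get("diagnosis") else None,
    "accuracy": lambda r: ("glucose out of range"
                           if not (0 < r.get("lab_glucose", 1) < 50) else None),
}

def quality_loop(record: dict, fix) -> tuple:
    """Run discovery -> locating -> solving -> verification, logging each step."""
    log = []
    for dimension, rule in RULES.items():
        problem = rule(record)  # discovery (automatic, per established criteria)
        if problem is None:
            continue
        log.append(f"discovered [{dimension}]: {problem}")  # locating via rule name
        record = fix(record, dimension)  # solving (by system or administrator)
        if rule(record) is None:         # verification of the solution
            log.append(f"verified [{dimension}]: resolved")
        else:
            log.append(f"verified [{dimension}]: still open")
    return record, log

def demo_fix(r: dict, dimension: str) -> dict:
    """Illustrative remediation: fill a missing diagnosis with a placeholder code."""
    r = dict(r)
    if dimension == "completeness":
        r["diagnosis"] = "UNK"
    return r

record, log = quality_loop({"diagnosis": "", "lab_glucose": 5.4}, demo_fix)
print(log)
```

The returned log plays the role of the system log and problem report described in the text: every discovery and verification is recorded, whether or not the problem was resolved.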

Because the data quality management module works continuously, the completeness, accuracy, and consistency of the medical data on the platform improve steadily.

High-quality medical data are stored in the research data repository (RDR) for research projects. The workflow of data processing from raw data to the RDR is shown in Figure 3. The data directory is created automatically by the system, and databases and knowledge graphs for specific diseases can also be generated. Metadata of the medical data (including, but not limited to, data category, quantity, data source, and update time) are analyzed statistically and visualized by the business intelligence module.
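As an illustration of the kind of statistics the business intelligence module might compute over such metadata, here is a small sketch; the catalog entries and field names (`category`, `source`, `updated`) are invented for the example and do not reflect the MDP's actual schema.

```python
from collections import Counter
from datetime import date

def summarize_metadata(entries: list) -> dict:
    """Aggregate per-category quantities and the latest update time."""
    counts = Counter(e["category"] for e in entries)
    latest = max(e["updated"] for e in entries)
    return {"quantity_by_category": dict(counts), "last_update": latest}

# Hypothetical data directory entries.
catalog = [
    {"category": "EMR", "source": "HIS", "updated": date(2023, 5, 1)},
    {"category": "EMR", "source": "LIS", "updated": date(2023, 5, 3)},
    {"category": "genomics", "source": "researcher", "updated": date(2023, 4, 20)},
]
print(summarize_metadata(catalog))
```

Summaries like these (counts per category, freshness per source) are what a dashboard would visualize so that researchers can judge at a glance whether the repository covers their project's needs.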

Dataflow of the Medical Data Governance System. AI, artificial intelligence; RDR, research data repository.
