We downloaded structured data dictionaries (in CSV and PDF formats) from NACC, ADNI, and the NIH CDE Repository on November 23, 2021. The CSV data dictionaries provided by NACC contain the data elements in the UDS, the Neuropathology (NP) data set, and the genetic data. The PDF data dictionaries provided by NACC contain the data elements in the imaging and biomarker data sets. Figure 1 shows an example data element in NACC’s imaging data dictionary. We used the open-source pdftotext utility (part of the Xpdf software suite [9]) to convert the PDF data dictionaries to plain text files, which we then parsed to extract the attributes of each data element and store them in CSV files. Figure 2 shows two examples of data elements in the NIH CDE Repository.
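The conversion and parsing step can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that pdftotext is installed and on the PATH, and that the converted text lists each data element as a blank-line-separated block of "Label: value" lines; that layout, the `FIELDS` list, and all function names are hypothetical, so the parsing rules would need to be adapted to the actual PDF layout.

```python
import csv
import io
import subprocess

# Assumed attribute labels; the labels in the real converted text may differ.
FIELDS = ["Form", "Short descriptor"]


def pdf_to_text(pdf_path, txt_path):
    """Convert a PDF to plain text with pdftotext, preserving page layout."""
    subprocess.run(["pdftotext", "-layout", pdf_path, txt_path], check=True)


def parse_elements(text):
    """Read 'Label: value' lines from blank-line-separated blocks (assumed layout)."""
    elements = []
    for block in text.split("\n\n"):
        record = {}
        for line in block.splitlines():
            label, sep, value = line.partition(":")
            if sep and label.strip() in FIELDS:
                record[label.strip()] = value.strip()
        if record:  # skip blocks that contain none of the expected labels
            elements.append(record)
    return elements


def to_csv(elements):
    """Serialize parsed data elements as CSV text, one row per element."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(elements)
    return buf.getvalue()
```

For example, after calling `pdf_to_text` on a downloaded dictionary, the resulting text file can be read into a string, passed to `parse_elements`, and written out with `to_csv`. Attributes missing from a block are left empty in the CSV output.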
Figure 1. An example data element from NACC’s imaging data dictionary in PDF. The form name of this data element is “Imaging”. The short descriptor of this data element is “Left insula gray matter volume (cc)”.
Figure 2. Two examples of common data elements in the NIH CDE Repository.
For data element mapping, we used the “Form” and “Short descriptor” attributes of NACC data elements, the “CRF NAME” and “TEXT” attributes of ADNI data elements, and the “Name” and “Question Texts” attributes of data elements in the NIH CDE Repository.
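As a small illustration, the attribute pairs listed above can be kept in a single lookup table so that records from all three sources are reduced to the same two-value form before any comparison. This is only a sketch: the `ATTRIBUTES` table and the `mapping_fields` helper are hypothetical names, records are assumed to be dictionaries keyed by the column names of each source's dictionary, and how the resulting pairs are compared is not specified here.

```python
# Attribute pairs used for mapping, keyed by source (names taken from the text).
ATTRIBUTES = {
    "NACC": ("Form", "Short descriptor"),
    "ADNI": ("CRF NAME", "TEXT"),
    "NIH CDE": ("Name", "Question Texts"),
}


def mapping_fields(source, record):
    """Return the two mapping attributes of a record, trimmed of whitespace."""
    first, second = ATTRIBUTES[source]
    # Missing attributes become empty strings rather than raising an error.
    return (record.get(first, "").strip(), record.get(second, "").strip())
```

Calling `mapping_fields("NACC", row)` on a row of the NACC dictionary then yields its form name and short descriptor as a tuple.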