Study design and population

Ramesh Nadarajah
Jianhua Wu
David Hogg
Keerthenan Raveendra
Yoko M Nakao
Kazuhiro Nakao
Ronen Arbel
Moti Haim
Doron Zahger
John Parry
Chris Bates
Campbel Cowan
Chris P Gale

In this population-based study, we used primary care EHRs from the UK Clinical Practice Research Datalink (CPRD)-GOLD dataset. CPRD is one of the largest databases of longitudinal primary care medical records worldwide and contains anonymised patient data from approximately 7% of the UK population.8 CPRD-GOLD is representative of the UK population in terms of age, sex and ethnicity,8 and has been used to develop algorithms for predicting AF.11 Data are collected as part of routine clinical care in participating practices, and patients are included in the primary care dataset from their first until their last contact with a participating practice.8 Diagnostic coding for AF in CPRD has been shown to be consistent and valid, with a positive predictive value (PPV) of 98%.12

All individuals in the CPRD dataset were linked to Hospital Episode Statistics (HES) Admitted Patient Care (APC) records to obtain comprehensive coverage of AF cases diagnosed in secondary care. We included all adults registered at practices within CPRD who were ≥30 years of age at entry, had no history of AF in either data source and had at least 1 year of follow-up between 2 January 1998 and 30 November 2018. Individuals were censored at a diagnosis of AF (or atrial flutter (AFl), since it has similar thromboembolic risk and anticoagulation guidelines),7 withdrawal from CPRD or 6 months, whichever came first. Diagnoses of AF or AFl were identified in primary care using Read codes in CPRD and in secondary care using codes from the 10th revision of the International Statistical Classification of Diseases and Related Health Problems (ICD-10) in HES-APC (online supplemental table 3). Individuals were randomly split 4:1 into a training dataset (80%) and a testing dataset (20%) using the Mersenne Twister pseudorandom number generator.
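
The censoring rule and the 4:1 split described above can be illustrated with a minimal Python sketch. It is not the authors' code, and the software used for the original analysis is not stated here. The function names, the seed value and the approximation of 6 months as 183 days are assumptions made for illustration only. Python's `random.Random` is used because its core generator is the Mersenne Twister.

```python
import random
from datetime import date, timedelta
from typing import Iterable, List, Optional, Tuple


def censor_date(
    entry: date,
    af_diagnosis: Optional[date],
    withdrawal: Optional[date],
    horizon_days: int = 183,  # assumption: "6 months" approximated as 183 days
) -> date:
    """Return the earliest of AF/AFl diagnosis, withdrawal, or end of horizon."""
    candidates = [entry + timedelta(days=horizon_days)]
    if af_diagnosis is not None:
        candidates.append(af_diagnosis)
    if withdrawal is not None:
        candidates.append(withdrawal)
    return min(candidates)


def split_train_test(
    ids: Iterable[int],
    train_fraction: float = 0.8,
    seed: int = 2018,  # hypothetical seed, chosen only for reproducibility
) -> Tuple[List[int], List[int]]:
    """Shuffle identifiers with a seeded Mersenne Twister and split them 4:1."""
    rng = random.Random(seed)  # random.Random uses the Mersenne Twister
    shuffled = list(ids)
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


if __name__ == "__main__":
    train, test = split_train_test(range(1000))
    print(len(train), len(test))
    print(censor_date(date(2010, 1, 1), date(2010, 3, 1), None))
```

Fixing the seed makes the partition reproducible, and splitting at the level of the individual keeps each person in exactly one of the two datasets.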

We followed the Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis (TRIPOD) reporting guideline and the CODE-EHR best-practice framework for using structured electronic healthcare records in clinical research.13 14
