2.1. Data source

Tiffany H. Leung
Aya El Helali
Xiaofei Wang
James C. Ho
Herbert Pang

This population‐based study was conducted using data from the Surveillance, Epidemiology, and End Results (SEER) program in the United States. The SEER database is collected, coordinated, and deidentified by the National Cancer Institute from multiple cancer registries. Neither institutional review board approval nor informed consent was required for our study, because the SEER data are deidentified and available for public use. The study is reported according to the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) reporting guidelines.

Two datasets, a discovery dataset and a validation dataset, were used in this study to improve the reproducibility of our findings. The discovery dataset comprised data from eight cancer registries covering 1990 to 2019. The validation dataset comprised data from nine cancer registries covering 2000 to 2019. The extracted variables were patient ID, sex, age, race and ethnicity, year of diagnosis, type of cancer, sequence number of PCs, survival months, and survival status.

Patients with curated data on a prostate, colorectal, lung, or female‐only breast PC were included. Those without survival information were excluded. We defined an SPC as a PC occurring at least 2 months after the first PC record [9, 10]. PC patients who had experienced neither an SPC event nor death by the last follow‐up date were censored on that date. To better understand the SPC trends, patients were categorized into three groups according to their year of diagnosis: (1) 1990–1999, (2) 2000–2009, and (3) 2010–2019. Because the validation dataset begins in 2000, its patients were assigned to the latter two groups only.
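The SPC definition, the censoring rule, and the diagnosis-period grouping described above can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the argument names, and the outcome labels are hypothetical, and the sketch assumes that the interval between the first PC record and a later PC is already available in whole months. How the study handled a later cancer recorded within 2 months of the first PC is not stated here, so the sketch simply does not count it as an SPC.

```python
from typing import Optional

# Minimum latency (in months) between the first PC record and a later PC
# for the later PC to count as an SPC, as stated in the text.
SPC_MIN_LATENCY_MONTHS = 2


def diagnosis_period(year_of_diagnosis: int) -> Optional[str]:
    """Map a year of diagnosis to one of the three study periods.

    Returns None for years outside 1990-2019 (hypothetical convention).
    """
    if 1990 <= year_of_diagnosis <= 1999:
        return "1990-1999"
    if 2000 <= year_of_diagnosis <= 2009:
        return "2000-2009"
    if 2010 <= year_of_diagnosis <= 2019:
        return "2010-2019"
    return None


def classify_outcome(months_to_next_pc: Optional[int], died: bool) -> str:
    """Classify a patient's outcome at last follow-up.

    months_to_next_pc: months from the first PC record to the next PC,
        or None if no later PC was recorded (hypothetical input).
    died: whether the patient died during follow-up.
    """
    if months_to_next_pc is not None and months_to_next_pc >= SPC_MIN_LATENCY_MONTHS:
        return "spc"
    if died:
        return "death"
    # Neither an SPC event nor death: censored at the last follow-up date.
    return "censored"
```

For example, under these assumptions a patient diagnosed in 2004 with a later PC recorded 14 months after the first falls in the 2000-2009 group with outcome `"spc"`, while a patient with no later PC who was alive at last follow-up is `"censored"`.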
