We downloaded the METABRIC (Molecular Taxonomy of Breast Cancer International Consortium) breast cancer datasets (Pereira et al., 2016) from the cBioPortal database (footnote 1), including clinical information, gene expression, somatic mutation, and copy number variation (CNV) data. The expression profiles were processed as log intensity levels from the Illumina Human v3 microarray. We also downloaded the version 19 gene annotation file from the GENCODE database (footnote 2). We then randomly divided the triple-negative breast cancer (TNBC) cases into a training dataset and a testing dataset, keeping the numbers of patients with living and deceased overall survival (OS) status balanced between the two datasets. The training dataset, consisting of 150 TNBC samples, was used to identify the prognostic signature and build the prognostic risk model; the testing dataset, consisting of 149 TNBC samples, was used to validate the prognostic model. Clinical characteristics for the training, testing, and METABRIC datasets are summarized in Table 1.

The Cancer Genome Atlas (TCGA; footnote 3), the Shanghai TNBC data (footnote 4) (Jiang et al., 2019; Goldman et al., 2020), and three additional independent datasets from the GEO database, GSE21653, GSE31448, and GSE25066 (footnotes 5, 6, and 7), were used to validate the performance of the prognostic risk model (Hatzis et al., 2011; Sabatier et al., 2011a,b). For the TCGA and Shanghai RNA-seq datasets, we downloaded or processed the gene-level transcription estimates as log2(x + 1)-transformed RSEM normalized counts. The three GEO microarray datasets were processed using the robust multichip average (RMA) algorithm for background adjustment (Irizarry et al., 2003a,b), and the Affymetrix GeneChip probe-level data were log2 transformed. The platform information for the Affymetrix Human Genome U133 Plus 2.0 Array was downloaded from the Affymetrix website (footnote 8). Gene expression data from the Affymetrix-based expression profiling were obtained by repurposing microarray probes based on the platform information and the gene annotation file from the GENCODE database (release 19, see text footnote 2).
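The random division of TNBC cases into training and testing datasets with balanced OS status can be illustrated with a stratified split. The following is only a minimal sketch, not the authors' actual procedure: the function name `stratified_split`, the sample IDs, the OS labels, and the random seed are all hypothetical and chosen purely for illustration.

```python
import random
from collections import defaultdict


def stratified_split(samples, train_fraction=0.5, seed=0):
    """Split (sample_id, os_status) pairs into training and testing lists,
    shuffling within each OS status so both sets keep a similar
    living/deceased ratio. Hypothetical helper, not from the source."""
    by_status = defaultdict(list)
    for sample_id, os_status in samples:
        by_status[os_status].append(sample_id)

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train, test = [], []
    for status in sorted(by_status):
        ids = sorted(by_status[status])
        rng.shuffle(ids)
        cut = round(len(ids) * train_fraction)
        train.extend(ids[:cut])
        test.extend(ids[cut:])
    return train, test


# Hypothetical cohort of 299 samples with made-up OS labels (not real data).
cohort = [(f"S{i:03d}", "DECEASED" if i % 3 else "LIVING") for i in range(299)]
train_ids, test_ids = stratified_split(cohort)
```

Splitting within each OS status group, rather than across the whole cohort at once, is what keeps the living and deceased counts comparable between the two datasets.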

Clinical information for triple-negative breast cancer (TNBC) patient datasets used in this study.
