Pearson correlations and corresponding p-values were calculated between ABSOLUTE tumor purity estimates and the gene expression estimates for the 20,531 unique genes included in the RNA-seq dataset for the TCGA BRCA cohort. Q-values were calculated by adjusting the p-values for false discovery rate using the Benjamini-Hochberg procedure.
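The Benjamini-Hochberg adjustment described above can be illustrated with a minimal sketch. The source does not state which software was used, so the function name `benjamini_hochberg` and the example p-values are hypothetical. In practice the per-gene p-values would come from a Pearson correlation test against tumor purity (for example, `scipy.stats.pearsonr`).

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg q-values in the same order as the input.

    Hypothetical helper for illustration; not the authors' code.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down to the smallest so that each
    # q-value is capped by the q-values of all larger p-values
    # (this enforces monotonicity of the adjusted values).
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q_values[i] = running_min
    return q_values


if __name__ == "__main__":
    # Made-up p-values standing in for per-gene correlation tests.
    example_p = [0.01, 0.04, 0.03, 0.005]
    print(benjamini_hochberg(example_p))
```

Each q-value is the smallest false discovery rate at which the corresponding gene would be called significant, so genes can be filtered by a q-value threshold rather than a raw p-value threshold.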
Expression values for COL1A1 and COL1A2 were normalized by calculating the ratio of their expression to COL3A1 expression from the RNA-seq dataset for the TCGA BRCA cohort. Pearson correlations were then calculated between ABSOLUTE tumor purity and each of the COL1A1:COL3A1 and COL1A2:COL3A1 expression ratios, again correcting for multiple hypothesis testing with the Benjamini-Hochberg procedure. Non-zero expression of COL1A1, COL1A2, and COL3A1 was confirmed within the samples classified as Col1:Col3 high and Col3:Col1 high by comparing their expression to two reference transcripts, GPX5 and PUM1. GPX5, primarily expressed within the epididymis [88] and largely absent from breast cancer samples [89], was selected as a negative control, while PUM1 was included as a positive control given its repeated identification as a housekeeping gene within breast cancer [90–92] as well as other tissues [141].
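The ratio normalization and its correlation with purity can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the function names and the toy values are hypothetical, and skipping samples with zero COL3A1 expression is an assumption made here to avoid division by zero (the source does not describe how such samples were handled).

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def expression_ratio(numerator, denominator):
    """Per-sample ratio, e.g. COL1A1:COL3A1.

    Assumption: samples with zero denominator expression yield None.
    """
    return [n / d if d > 0 else None for n, d in zip(numerator, denominator)]


def ratio_purity_correlation(purity, numerator, denominator):
    """Correlate tumor purity with an expression ratio across samples."""
    ratios = expression_ratio(numerator, denominator)
    # Keep only samples where the ratio is defined.
    pairs = [(p, r) for p, r in zip(purity, ratios) if r is not None]
    return pearson_r([p for p, _ in pairs], [r for _, r in pairs])


if __name__ == "__main__":
    # Toy values for four hypothetical samples (not TCGA data).
    purity = [0.2, 0.4, 0.6, 0.8]
    col1a1 = [10.0, 20.0, 30.0, 40.0]
    col3a1 = [10.0, 10.0, 10.0, 10.0]
    print(ratio_purity_correlation(purity, col1a1, col3a1))
```

The same function would be called once with COL1A1 and once with COL1A2 as the numerator, giving the two correlations whose p-values are then corrected for multiple testing.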