To further study ML-CDN2’s performance in vivo, we used the model to predict the response to five drugs for patients in the TCGA BC dataset whose drug response was recorded. The five drugs, all of which were tested in the GDSC study, were paclitaxel, fluorouracil, tamoxifen, doxorubicin, and docetaxel. For each drug, patients were assigned to one of two groups based on the recorded drug response: Responder (patients showing a “complete response”) and Non-responder (patients showing a “partial response”, “progressive disease”, or “stable disease”). For these patients, we first calculated pathway activity scores from their whole-genome gene expression profiles and then measured the similarity between these BC tumors and the GDSC BC cell lines as the Pearson correlation of their pathway activity profiles. The Pearson correlations were then weighted using Eq. 2 with σ optimized from ML-CDN2. Finally, the responses of these TCGA BC patients to the five drugs were predicted with Eq. 5. Because the IC50 values in the GDSC study were measured using cell viability, so that a lower IC50 indicates greater drug sensitivity, we expected patients in the Responder group to have lower predicted IC50 values than patients in the Non-responder group.
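The similarity step described above can be illustrated with a minimal sketch. It covers only the Pearson correlation between one tumor's pathway activity profile and each cell line's profile; the weighting of Eq. 2 and the prediction of Eq. 5 are not reproduced here. The function names, cell line names, and pathway activity values are hypothetical and chosen purely for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / math.sqrt(var_x * var_y)


def similarity_to_cell_lines(tumor_profile, cell_line_profiles):
    """Return {cell_line_name: Pearson r} for a single tumor profile."""
    return {
        name: pearson(tumor_profile, profile)
        for name, profile in cell_line_profiles.items()
    }


# Hypothetical pathway activity scores; pathways are in the same order everywhere.
cell_lines = {
    "cell_line_A": [0.2, 1.1, -0.5, 0.9],
    "cell_line_B": [-0.7, 0.3, 1.2, -0.4],
}
# A scaled copy of cell_line_A, so its correlation with cell_line_A is close to 1.
tumor = [0.4, 2.2, -1.0, 1.8]

similarities = similarity_to_cell_lines(tumor, cell_lines)
```

In this sketch each tumor yields one correlation per cell line; in the workflow described above, those correlations would then be passed to the weighting step.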
Using Eq. 5, we also predicted the response of all TCGA BC patients to lapatinib and tamoxifen, both of which were included in the GDSC study. Lapatinib is a tyrosine kinase inhibitor that targets HER2 and EGFR and is used to treat HER2-overexpressing breast cancers. Tamoxifen is a selective estrogen receptor modulator (SERM) that targets the estrogen receptor (ER) and is used to treat ER-positive breast cancers. For lapatinib, we separated the BC patients into four groups based on their HER2 expression level measured by immunohistochemistry (IHC): 0, 1+, 2+, and 3+, in order of increasing HER2 expression. We then compared the predicted IC50 values for lapatinib among the four groups. We expected groups with higher HER2 expression levels to show lower predicted IC50 values. BC patients treated with tamoxifen were divided into two groups (Negative and Positive) based on the IHC status of ER, and the predicted IC50 values for tamoxifen were compared between the two groups. We expected the predicted IC50 values for patients in the ER Positive group to be lower than those for patients in the ER Negative group.
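As a rough illustration of the group comparison, the sketch below summarizes predicted IC50 values by HER2 IHC group and checks whether the group medians decrease as HER2 expression increases. The text does not state which statistical test was used, so this is only a descriptive check and not the authors' analysis. The function name and all IC50 values are hypothetical.

```python
from collections import defaultdict
from statistics import median


def median_ic50_by_group(records):
    """Map each group label to the median predicted IC50 of its patients."""
    grouped = defaultdict(list)
    for group, predicted_ic50 in records:
        grouped[group].append(predicted_ic50)
    return {group: median(values) for group, values in grouped.items()}


# Hypothetical (HER2 IHC score, predicted lapatinib IC50) pairs.
records = [
    ("0", 3.1), ("0", 2.9),
    ("1+", 2.8), ("1+", 2.6),
    ("2+", 2.2), ("2+", 2.4),
    ("3+", 1.5), ("3+", 1.9),
]

medians = median_ic50_by_group(records)

# Order the groups by increasing HER2 expression and test for a decreasing trend.
ordered = [medians[group] for group in ("0", "1+", "2+", "3+")]
decreasing = all(a > b for a, b in zip(ordered, ordered[1:]))
```

The same helper would apply to the tamoxifen comparison by using the two ER status labels (Negative and Positive) as group labels.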
In addition, we used Eq. 5 to predict the response to EGFR and PI3K pathway inhibitors for TCGA BC patients for whom no drug response was recorded. Since these drugs target either the EGFR pathway or the PI3K pathway, we expected the expression levels of the EGFR pathway genes to be strongly correlated with the predicted EGFR inhibitor response, and the expression levels of the PI3K pathway genes to be strongly correlated with the predicted PI3K inhibitor response. We obtained the gene lists for the EGFR and PI3K pathways from MSigDB [23]. To study the correlation between the genes in a pathway and an inhibitor of that pathway, we fitted a multiple linear regression with the predicted IC50 value of the inhibitor as the response variable and the expression levels of the pathway genes as predictors. We obtained a p-value for each gene and corrected the p-values for multiple comparisons using the Bonferroni correction (α = 0.05).
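The multiple-comparison step can be sketched as follows. The sketch assumes the per-gene p-values have already been obtained from the regression and shows only the Bonferroni correction: each p-value is multiplied by the number of tests, capped at 1, and compared with α. The function name, gene labels, and p-values are hypothetical.

```python
def bonferroni(p_values, alpha=0.05):
    """Return {gene: (adjusted_p, is_significant)} under Bonferroni correction."""
    n_tests = len(p_values)
    result = {}
    for gene, p in p_values.items():
        # Multiply by the number of tests and cap at 1.0.
        adjusted = min(1.0, p * n_tests)
        result[gene] = (adjusted, adjusted < alpha)
    return result


# Hypothetical per-gene p-values from the regression coefficients.
raw_p_values = {
    "GENE_A": 0.001,
    "GENE_B": 0.02,
    "GENE_C": 0.3,
    "GENE_D": 0.6,
}

corrected = bonferroni(raw_p_values, alpha=0.05)
significant_genes = [gene for gene, (_, sig) in corrected.items() if sig]
```

Comparing the adjusted p-value with α is equivalent to comparing the raw p-value with α divided by the number of tests, which in this setting is the number of genes in the pathway.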