Multivariate survival prediction was performed with a penalized Cox proportional hazards model (pCPHM) with a ridge (L2) penalty, fitted using the R library “penalized” [10]. Thirty iterations of 3-fold cross-validation (CV) were performed for each method. To reduce noise in the comparison, the same CV splits were used for every method. Additionally, the CV partitions were constrained to preserve the ratio of events to non-events within each fold. To match the number of features for GSVA and GRAPE, the set of GE features was filtered to include only the top 4500 features, ranked by standard deviation. For each method, a two-step feature selection procedure was applied within the training set. In the first step, univariate associations with survival were calculated for each feature using the function “gt” in the R library “globaltest” [11]. In the second step, the N features with the smallest p-values were chosen for the final model. A line search over the interval [5, 200] was used to identify the value of N that maximized the cross-validated likelihood. The function “optL2” in the “penalized” package was used to select the value of the ridge parameter lambda in the pCPHM and to evaluate the internal cross-validated likelihood of models of each size. All features were standardized in the pCPHM.
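The repeated, event-stratified partitioning described above can be illustrated with a minimal sketch. It is written in Python rather than R and is not the authors' code: the function names `stratified_folds` and `repeated_stratified_cv`, the seeding scheme, and the 0/1 encoding of event status are assumptions made for illustration only. The sketch generates 30 repeats of a 3-fold assignment in which events and non-events are dealt out separately, so that each fold keeps roughly the same event ratio, and the assignments are generated once so that every method can be evaluated on identical splits.

```python
import random


def stratified_folds(events, k=3, seed=0):
    """Assign each sample to one of k folds, stratified by event status.

    events: sequence of 0/1 (or False/True) event indicators (assumed encoding).
    Returns a list giving the fold index (0..k-1) of each sample.
    """
    rng = random.Random(seed)
    fold_of = [None] * len(events)
    # Handle non-events and events separately so each fold receives
    # an (almost) equal share of both groups.
    for status in (False, True):
        idx = [i for i, e in enumerate(events) if bool(e) == status]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            fold_of[i] = pos % k  # deal shuffled samples round-robin
    return fold_of


def repeated_stratified_cv(events, k=3, repeats=30, base_seed=0):
    """Return `repeats` independent stratified k-fold assignments.

    Each repeat uses its own seed, so the full set of splits is
    reproducible and can be reused unchanged for every method.
    """
    return [stratified_folds(events, k, base_seed + r) for r in range(repeats)]


# Toy data (hypothetical): 9 events and 21 non-events.
events = [1] * 9 + [0] * 21
splits = repeated_stratified_cv(events, k=3, repeats=30)

# The same `splits` object would be passed to each method under comparison,
# so that differences in performance are not driven by partition noise.
for fold in range(3):
    test_idx = [i for i, f in enumerate(splits[0]) if f == fold]
    n_events = sum(events[i] for i in test_idx)
    print(f"repeat 0, fold {fold}: {len(test_idx)} samples, {n_events} events")
```

Fixing the seeds is what makes the splits reusable across methods; when a group size is not divisible by the number of folds, the fold sizes differ by at most one sample per group.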