The GSE6344, GSE40435, GSE781, and TCGA_KIRC datasets were selected as the training set, and GSE53757 was used as the test set. A binomial LASSO model was fitted to the training set with the glmnet package.25 The receiver operating characteristic (ROC) curve was plotted and its area under the curve (AUC) was calculated. The pROC package in R was used to evaluate the diagnostic value of the hub genes.26 The support vector machine (SVM) is a generalized linear classifier that performs binary classification in a supervised learning manner by maximizing the margin between classes in a high-dimensional space.27 The e1071, kernlab, and caret packages were used to perform recursive feature elimination on the differentially expressed genes and to obtain the best gene signature. Both machine learning algorithms were used to screen the hub genes of KIRC, and the key genes selected by both were identified with the venn package in R. Differential expression analysis of the key genes was then performed in the test set with the limma package in R. In addition, an ROC curve was constructed and the AUC was calculated to estimate the predictive value of the model in both the training set and the test set.
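The two final steps described above, taking the genes selected by both algorithms and scoring a classifier by its AUC, can be illustrated with a minimal sketch. It is written in plain Python for illustration only and is not the authors' R code. The gene names, labels, and scores are hypothetical toy values, and the helper names `roc_auc` and `shared_genes` are assumptions introduced here. The AUC is computed as the fraction of tumor/normal sample pairs that the score ranks correctly (ties count one half), which equals the area under the empirical ROC curve.

```python
from typing import Sequence, Set


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Return the AUC for binary labels (1 = tumor, 0 = normal).

    The AUC is the probability that a randomly chosen positive sample
    receives a higher score than a randomly chosen negative sample.
    Tied scores contribute one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def shared_genes(lasso_genes: Set[str], svm_genes: Set[str]) -> Set[str]:
    """Return the genes selected by both feature-selection methods."""
    return lasso_genes & svm_genes


# Hypothetical toy inputs, not taken from the study.
lasso_selected = {"GENE_A", "GENE_B", "GENE_C"}
svm_selected = {"GENE_B", "GENE_C", "GENE_D"}
key_genes = shared_genes(lasso_selected, svm_selected)

labels = [1, 1, 1, 0, 0, 0]              # 1 = tumor, 0 = normal
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]  # hypothetical model outputs
auc = roc_auc(labels, scores)

print(sorted(key_genes))
print(round(auc, 3))
```

The pairwise loop is quadratic in the number of samples, which is acceptable for a small illustration. For real cohorts a rank-based computation or an established package such as pROC would be the more practical choice.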