Selection of Molecular Subtypes Based on Consensus Clustering

Guichuan Huang
Jing Zhang
Ling Gong
Daishun Liu
Xin Wang
Yi Chen
Shuliang Guo

Using the methylation sites that were significant in both the univariate and the multivariate Cox proportional hazards regression analyses, we performed consensus clustering with the ConsensusClusterPlus R package to identify the LUSC subtypes [24]. In each of 100 resampling iterations, 80% of the LUSC samples were drawn. The similarity between samples was measured by Euclidean distance, and K-means was used as the clustering algorithm to obtain a reliable and stable subgroup classification.
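The resampling scheme described above can be illustrated with a minimal sketch in plain Python. It is not the ConsensusClusterPlus implementation or its API: the function names `kmeans` and `consensus_matrix`, the random initialization, and the fixed seed are assumptions made for illustration. The sketch draws 80% of the samples 100 times, clusters each subsample with K-means on Euclidean distance, and records how often each pair of samples lands in the same cluster.

```python
import random

def kmeans(points, k, rng, max_iter=100):
    """Lloyd's K-means with squared Euclidean distance; returns one label per point."""
    centers = rng.sample(points, k)  # assumed initialization: k distinct input points
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest center.
        new_labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        if new_labels == labels:  # assignments unchanged, so the run has converged
            break
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster is empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def consensus_matrix(points, k, reps=100, p_item=0.8, seed=0):
    """Fraction of subsamples in which each pair of items shares a cluster."""
    n = len(points)
    rng = random.Random(seed)
    together = [[0] * n for _ in range(n)]  # times i and j were clustered together
    sampled = [[0] * n for _ in range(n)]   # times i and j were drawn together
    m = max(k, round(p_item * n))           # subsample size (80% of items by default)
    for _ in range(reps):
        idx = rng.sample(range(n), m)
        labels = kmeans([points[i] for i in idx], k, rng)
        for a, i in enumerate(idx):
            for b, j in enumerate(idx):
                sampled[i][j] += 1
                if labels[a] == labels[b]:
                    together[i][j] += 1
    return [
        [together[i][j] / sampled[i][j] if sampled[i][j] else 0.0 for j in range(n)]
        for i in range(n)
    ]
```

Here `points` is a list of samples, each a list of methylation values for the selected sites. Entries close to 1 mark sample pairs that are consistently grouped together, and entries close to 0 mark pairs that are consistently separated; this matrix is what a consensus heatmap displays.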

The optimal number of clusters was identified using the cumulative distribution function (CDF) of the consensus values and the delta area plot. The number of clusters was chosen so that the consistency within clusters was relatively high, the coefficient of variation was relatively low, and the area under the CDF curve showed no appreciable further increase. The corresponding heatmap of the consensus clustering was drawn with the pheatmap R package.
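The CDF criterion can be sketched numerically as follows. This is an illustration under stated assumptions, not the package's code: `cdf_area` and `delta_area` are hypothetical names, and the delta area is taken as the relative change in CDF area between successive values of k, which is one common definition. The input is a consensus matrix with entries in [0, 1], one matrix per candidate number of clusters.

```python
def cdf_area(consensus):
    """Area under the empirical CDF of the pairwise consensus values."""
    n = len(consensus)
    # Use each pair once (upper triangle, diagonal excluded).
    values = [consensus[i][j] for i in range(n) for j in range(i + 1, n)]
    # For values in [0, 1], the integral of the empirical CDF over [0, 1]
    # equals the mean of (1 - value).
    return sum(1.0 - v for v in values) / len(values)

def delta_area(areas):
    """Relative change in CDF area from one k to the next.

    `areas` maps each candidate k to its CDF area. The smallest k has no
    predecessor, so its own area is reported. Assumes every area is nonzero.
    """
    ks = sorted(areas)
    deltas = {ks[0]: areas[ks[0]]}
    for prev, k in zip(ks, ks[1:]):
        deltas[k] = (areas[k] - areas[prev]) / areas[prev]
    return deltas
```

A candidate k after which the delta area stays close to zero corresponds to the point where the area under the CDF curve no longer rises appreciably. Any numeric cutoff for "close to zero" would be an analyst's choice and is not specified in the text.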
