Silhouette scores

Marie Trussart
Charis E Teh
Tania Tan
Lawrence Leong
Daniel HD Gray
Terence P Speed

To assess the extent to which the data are grouped by batch effects rather than by biological signal, we computed batch and biology silhouette scores. Given a partitioning of all cells into groups, let $a_i$ denote the average Euclidean distance in protein expression between cell $i$ and all other cells in the group to which cell $i$ is assigned, and let $b_i$ denote the minimum, over all groups not containing cell $i$, of the average distance between cell $i$ and the cells of that group. The silhouette coefficient of cell $i$ is then calculated as

$$s_i = \frac{b_i - a_i}{\max(a_i, b_i)}$$

The average of the silhouette values across cells under a particular grouping is called the silhouette score for that grouping. The silhouette score ranges from −1 to +1, where positive values ($b_i$ high and $a_i$ low) indicate that cells are well matched to their own group. In this way, we computed the silhouette score $s_{\text{batch}}$ using the batches as groups and the silhouette score $s_{\text{biology}}$ using the grouping of the cells by subpopulation (i.e. clusters).
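As an illustration only, the two scores can be sketched in plain Python. The function names, the toy expression values and the batch and cluster labels below are hypothetical and are not taken from the protocol. The value 0 assigned to a cell that is alone in its group is an assumed convention that the text does not specify.

```python
import math

def silhouette_values(points, labels):
    """Return the silhouette coefficient of every cell (Euclidean distance)."""
    groups = set(labels)
    if len(groups) < 2:
        raise ValueError("at least two groups are required")
    values = []
    for i, p in enumerate(points):
        # Sum and count the distances from cell i to the cells of each group.
        sums = {g: 0.0 for g in groups}
        counts = {g: 0 for g in groups}
        for j, q in enumerate(points):
            if j != i:
                sums[labels[j]] += math.dist(p, q)
                counts[labels[j]] += 1
        own = labels[i]
        if counts[own] == 0:
            values.append(0.0)  # assumption: a singleton cell scores 0
            continue
        a = sums[own] / counts[own]  # mean distance within own group
        b = min(sums[g] / counts[g] for g in groups if g != own)  # nearest other group
        denom = max(a, b)
        values.append((b - a) / denom if denom > 0 else 0.0)
    return values

def silhouette_score(points, labels):
    """Average of the per-cell silhouette coefficients for one grouping."""
    values = silhouette_values(points, labels)
    return sum(values) / len(values)

# Hypothetical toy data: four cells measured on two markers.
expression = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
clusters = ["A", "A", "B", "B"]  # grouping by subpopulation
batches = [1, 2, 1, 2]           # grouping by batch

s_biology = silhouette_score(expression, clusters)  # roughly 0.90
s_batch = silhouette_score(expression, batches)     # roughly -0.45
print(f"s_biology = {s_biology:.3f}, s_batch = {s_batch:.3f}")
```

In this toy case the cells separate by cluster and not by batch, so the biology score is high and the batch score is negative.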

The score $s_{\text{biology}}$ quantifies the cell-to-cell variation within cell subpopulations relative to the variation between subpopulations. A negative value suggests that a cell may be mis-clustered, because it is on average closer to a neighbouring cluster than to its own. For example, if two distinct, biologically relevant clusters were merged into a single cluster, the merge would be reflected in a lower $s_{\text{biology}}$ value.
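The effect of merging clusters can be checked on a small example. The sketch below uses hypothetical one-marker data with three well-separated clusters and compares the score before and after two of them are merged. The helper name and the data are illustrative assumptions, and the size of the drop depends on the geometry of the data.

```python
import math

def mean_silhouette(points, labels):
    """Average silhouette coefficient over all cells (Euclidean distance)."""
    if len(set(labels)) < 2:
        raise ValueError("at least two groups are required")
    scores = []
    for i, p in enumerate(points):
        # Collect the distances from cell i to every other cell, keyed by group.
        dists = {}
        for j, q in enumerate(points):
            if j != i:
                dists.setdefault(labels[j], []).append(math.dist(p, q))
        own = dists.pop(labels[i], None)
        if not own:
            scores.append(0.0)  # assumption: a singleton cell scores 0
            continue
        a = sum(own) / len(own)
        b = min(sum(d) / len(d) for d in dists.values())
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)

# Hypothetical expression of a single marker in six cells.
cells = [(0.0,), (1.0,), (5.0,), (6.0,), (10.0,), (11.0,)]
separate = ["A", "A", "B", "B", "C", "C"]  # three distinct clusters
merged = ["AB", "AB", "AB", "AB", "C", "C"]  # clusters A and B merged

s_separate = mean_silhouette(cells, separate)  # roughly 0.79
s_merged = mean_silhouette(cells, merged)      # roughly 0.58
print(f"separate: {s_separate:.3f}, merged: {s_merged:.3f}")
```

Here the merged grouping scores lower because cells in the merged cluster are farther, on average, from the other members of their own group.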
