The GET_HOMOLOGUES distribution contains the plot_matrix_heatmap.sh script, which generates ordered heatmaps with attached row and column dendrograms from square, tab-separated numeric matrices. These can be presence/absence PGM matrices or similarity/identity matrices, such as those produced with the get_homologues -A option. Optionally, the input cgANIb matrix can be converted to a distance matrix to compute a neighbor-joining tree, which makes relationships in large ANI matrices easier to visualize. Recently added functionality includes reducing excessive redundancy in the tab-delimited ANI matrix file (-c max_identity_cut-off_value) and subsetting the matrix with regular expressions to focus the analysis on particular genomes extracted from the full cgANIb matrix. From version 1.0 onwards, the mean silhouette width (Rousseeuw, 1987), a goodness-of-clustering statistic, is included to determine the optimal number of clusters automatically. The script currently depends on the R packages ape (Popescu et al., 2012), dendextend (https://cran.r-project.org/package=dendextend), factoextra (https://cran.r-project.org/package=factoextra) and gplots (https://CRAN.R-project.org/package=gplots).
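To make the matrix operations described above more concrete, the following is a minimal Python sketch of three of them: converting a percent-identity matrix to distances, subsetting a square matrix with a regular expression, and computing the mean silhouette width of a given partition. It is not the script's own R code. The conversion d = 1 - ANI/100, the function names, the genome labels and the identity values are all assumptions made for illustration only.

```python
import re

def identity_to_distance(ani):
    """Convert a square percent-identity matrix to distances.
    ASSUMPTION: d = 1 - ANI/100; the script's own conversion may differ."""
    return [[1.0 - value / 100.0 for value in row] for row in ani]

def subset_matrix(labels, matrix, pattern):
    """Keep only the rows and columns whose label matches the regular expression."""
    keep = [i for i, name in enumerate(labels) if re.search(pattern, name)]
    sub_labels = [labels[i] for i in keep]
    sub_matrix = [[matrix[i][j] for j in keep] for i in keep]
    return sub_labels, sub_matrix

def mean_silhouette_width(dist, clusters):
    """Mean silhouette width (Rousseeuw, 1987) of a partition.
    dist is a square distance matrix; clusters[i] is the cluster id of item i."""
    members = {}
    for index, cluster_id in enumerate(clusters):
        members.setdefault(cluster_id, []).append(index)
    if len(members) < 2:
        raise ValueError("at least two clusters are required")
    widths = []
    for i, own_id in enumerate(clusters):
        own = [j for j in members[own_id] if j != i]
        if not own:
            widths.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean distance to the other members of the item's own cluster
        a = sum(dist[i][j] for j in own) / len(own)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(dist[i][j] for j in idx) / len(idx)
            for cluster_id, idx in members.items()
            if cluster_id != own_id
        )
        denom = max(a, b)
        widths.append((b - a) / denom if denom > 0 else 0.0)
    return sum(widths) / len(widths)

# Hypothetical labels and identity values, for illustration only.
labels = ["Ecoli_K12", "Ecoli_O157", "Sent_LT2", "Sent_Typhi"]
ani = [
    [100.0, 98.0, 80.0, 81.0],
    [98.0, 100.0, 81.0, 80.0],
    [80.0, 81.0, 100.0, 99.0],
    [81.0, 80.0, 99.0, 100.0],
]
dist = identity_to_distance(ani)

# Focus on a subset of genomes selected by a regular expression.
sent_labels, sent_dist = subset_matrix(labels, dist, r"^Sent_")
print(sent_labels)

# Compare two candidate partitions; a higher mean width indicates better clustering.
print(mean_silhouette_width(dist, [0, 0, 1, 1]))
print(mean_silhouette_width(dist, [0, 1, 0, 1]))
```

In this toy example the partition that groups the two highly similar pairs scores a much higher mean silhouette width than the one that splits them. Choosing the number of clusters by this statistic typically means evaluating it for each candidate number and keeping the one with the highest mean width.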