The mutational catalogues from the 21 WGS samples were analyzed for mutational signatures. Signature extraction was performed with SigProfiler, which uses methods based on nonnegative matrix factorisation (NMF) as described by Alexandrov et al. (refs. 1,36,37). Analyses were carried out separately for single base substitutions (SBS signatures) and indels (ID signatures). We first performed de novo mutational signature extraction using this NMF-based method.
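To illustrate the idea behind NMF-based extraction, the following is a minimal sketch using standard multiplicative updates on a toy catalogue. It is not the SigProfiler implementation, which includes additional steps not shown here; the function names (`nmf`, `matmul`, `frobenius_error`), the four-channel toy data, and the choice of two signatures are all assumptions made for illustration.

```python
import random


def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    Bt = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]


def transpose(A):
    return [list(row) for row in zip(*A)]


def frobenius_error(V, W, H):
    """Frobenius norm of the reconstruction residual V - WH."""
    WH = matmul(W, H)
    return sum((v - r) ** 2
               for v_row, r_row in zip(V, WH)
               for v, r in zip(v_row, r_row)) ** 0.5


def nmf(V, k, n_iter=500, seed=0, eps=1e-9):
    """Factorise V (channels x samples) into nonnegative W (channels x k)
    and H (k x samples) using multiplicative updates (illustrative only)."""
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    W = [[rng.random() + eps for _ in range(k)] for _ in range(n)]
    H = [[rng.random() + eps for _ in range(m)] for _ in range(k)]
    for _ in range(n_iter):
        # Update exposures: H <- H * (W^T V) / (W^T W H)
        Wt = transpose(W)
        num, den = matmul(Wt, V), matmul(Wt, matmul(W, H))
        H = [[H[a][j] * num[a][j] / (den[a][j] + eps) for j in range(m)]
             for a in range(k)]
        # Update signatures: W <- W * (V H^T) / (W H H^T)
        Ht = transpose(H)
        num, den = matmul(V, Ht), matmul(matmul(W, H), Ht)
        W = [[W[i][a] * num[i][a] / (den[i][a] + eps) for a in range(k)]
             for i in range(n)]
    return W, H


# Hypothetical toy catalogue: 4 mutation channels x 5 samples,
# generated from two made-up signatures and made-up exposures.
true_W = [[0.7, 0.1], [0.1, 0.1], [0.1, 0.1], [0.1, 0.7]]
true_H = [[50, 10, 30, 0, 20], [5, 40, 30, 25, 0]]
V = matmul(true_W, true_H)

W, H = nmf(V, k=2)
print("reconstruction error:", frobenius_error(V, W, H))
```

Here the columns of `W` play the role of extracted signatures and the rows of `H` the per-sample contributions. In practice the number of signatures `k` is not known in advance and has to be chosen, which this sketch does not attempt.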
For SBS signatures, the extracted signatures were then compared with the set of known mutational signatures in the COSMIC database, as previously described (refs. 1,36,37). The algorithm identifies the combination of known human signatures that best explains the observed mutation patterns, that is, the combination with the highest cosine similarity. All signatures extracted across all samples could be explained by a combination of known human signatures (cosine similarity > 0.75), so we did not identify any signatures that would be considered novel. We performed hierarchical clustering of the samples based on the relative contributions of the identified mutational signatures, using the results both from the de novo extraction and from the signature deconvolution.
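The comparison step can be sketched as follows. This is a simplified brute-force search over single signatures and pairs, swapped in for illustration; it is not the procedure SigProfiler uses. The reference signatures `A`, `B` and `C`, the helper `best_combination`, and the four-channel profiles are hypothetical; only the 0.75 cosine similarity threshold comes from the text above.

```python
import math
from itertools import combinations


def cosine_similarity(a, b):
    """Cosine of the angle between two mutation-channel vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def best_combination(extracted, known, steps=100):
    """Return (similarity, names, weights) for the single known signature or
    weighted pair that is most similar to the extracted signature."""
    best = (-1.0, (), ())
    # Single known signatures.
    for name, sig in known.items():
        s = cosine_similarity(extracted, sig)
        if s > best[0]:
            best = (s, (name,), (1.0,))
    # Pairs of known signatures, mixed on a grid of weights.
    for (n1, s1), (n2, s2) in combinations(known.items(), 2):
        for i in range(1, steps):
            w = i / steps
            mix = [w * x + (1 - w) * y for x, y in zip(s1, s2)]
            s = cosine_similarity(extracted, mix)
            if s > best[0]:
                best = (s, (n1, n2), (w, 1 - w))
    return best


# Hypothetical reference signatures over four channels (not COSMIC data).
known = {
    "A": [0.7, 0.1, 0.1, 0.1],
    "B": [0.1, 0.1, 0.1, 0.7],
    "C": [0.25, 0.25, 0.25, 0.25],
}
# A made-up "extracted" signature: a 60/40 mixture of A and B.
extracted = [0.6 * a + 0.4 * b for a, b in zip(known["A"], known["B"])]

similarity, names, weights = best_combination(extracted, known)
is_explained = similarity > 0.75  # threshold quoted in the text
print(names, weights, is_explained)
```

An extracted signature whose best similarity fell below the threshold would be a candidate novel signature under this scheme; in the analysis described above, no signature fell into that category.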
For indels, given the limited understanding of these signatures, particularly in mouse, and the very small number of mutations, we chose to perform downstream analysis using only the de novo extracted signatures.
Code for the original SigProfiler software is available at https://www.mathworks.com/matlabcentral/fileexchange/38724-sigprofiler.
For further details of this algorithm and its most up-to-date implementation, we refer the reader to ref. 1 and the updated software webpage: https://pypi.org/project/sigprofiler/.