Beta-binomial filter

Alex Cagan
Adrian Baez-Ortega
Natalia Brzozowska
Federico Abascal
Tim H. H. Coorens
Mathijs A. Sanders
Andrew R. J. Lawson
Luke M. R. Harvey
Shriram Bhosle
David Jones
Raul E. Alcantara
Timothy M. Butler
Yvette Hooks
Kirsty Roberts
Elizabeth Anderson
Sharna Lunn
Edmund Flach
Simon Spiro
Inez Januszczak
Ethan Wrigglesworth
Hannah Jenkins
Tilly Dallas
Nic Masters
Matthew W. Perkins
Robert Deaville
Megan Druce
Ruzhica Bogeska
Michael D. Milsom
Björn Neumann
Frank Gorman
Fernando Constantino-Casas
Laura Peachey
Diana Bochynska
Ewan St. John Smith
Moritz Gerstung
Peter J. Campbell
Elizabeth P. Murchison
Michael R. Stratton
Iñigo Martincorena

For each crypt, we applied an artefact filter based on the beta-binomial distribution, which exploits read count information from other crypts of the same individual. More specifically, for each sample, we fitted a beta-binomial distribution to the variant allele counts and sequencing depths of somatic variants across samples from the same individual. The beta-binomial distribution was used to determine whether read support for a mutation varies across samples from an individual, as expected for genuine somatic mutations but not for artefacts. Artefacts tend to be randomly distributed across samples and can be modelled as drawn from a binomial or a weakly overdispersed beta-binomial distribution. True somatic variants are present at a high variant allele fraction (VAF) in some samples but absent from others, and are hence best captured by a highly overdispersed beta-binomial distribution. For each variant site, the maximum likelihood estimate of the overdispersion factor (ρ) was calculated using a grid-based method, with values ranging between 10^−6 and 10^−0.05. Variants with ρ > 0.3 were considered to be artefactual and discarded. The code for this filter is based on the Shearwater variant caller (ref. 62). We found this to be one of the most effective filters against spurious calls (Supplementary Fig. 1b).
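
The grid-based estimate of ρ described above can be illustrated with a minimal Python sketch. This is not the authors' code (the original filter builds on Shearwater). It rests on several assumptions that the text does not state: the beta-binomial is parameterised by a mean μ and overdispersion ρ with α = μ(1 − ρ)/ρ and β = (1 − μ)(1 − ρ)/ρ, μ is fixed to the pooled VAF across samples, and the grid is spaced in steps of 0.05 on the log10 scale. The function names and the example read counts are hypothetical.

```python
import math


def betabinom_logpmf(k, n, mu, rho):
    """Log-probability of k variant reads out of n under a beta-binomial
    with mean mu and overdispersion rho (assumed parameterisation)."""
    a = mu * (1.0 - rho) / rho
    b = (1.0 - mu) * (1.0 - rho) / rho
    log_choose = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    log_num = math.lgamma(k + a) + math.lgamma(n - k + b) - math.lgamma(n + a + b)
    log_den = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    return log_choose + log_num - log_den


def estimate_rho(alt_counts, depths, step=0.05):
    """Grid-based maximum likelihood estimate of rho for one variant site.

    alt_counts and depths hold one entry per sample from the same individual.
    The grid runs from 10^-6 to 10^-0.05; the step size is an assumption.
    """
    total_alt, total_depth = sum(alt_counts), sum(depths)
    if total_alt == 0 or total_alt >= total_depth:
        raise ValueError("pooled VAF must lie strictly between 0 and 1")
    mu = total_alt / total_depth  # pooled VAF, held fixed (assumption)

    n_points = int(round((6.0 - 0.05) / step)) + 1
    grid = [10.0 ** (-6.0 + i * step) for i in range(n_points)]

    def log_likelihood(rho):
        return sum(betabinom_logpmf(k, n, mu, rho)
                   for k, n in zip(alt_counts, depths))

    # Return the grid value with the highest log-likelihood.
    return max(grid, key=log_likelihood)


# Hypothetical site with low, even read support in every sample.
rho_even = estimate_rho([1, 2, 1, 2, 1, 2, 1, 2], [30] * 8)
# Hypothetical site with high VAF in three samples and none in the rest.
rho_patchy = estimate_rho([15, 0, 0, 14, 0, 0, 16, 0], [30] * 8)
print(rho_even, rho_patchy)
```

In this sketch, evenly spread read support yields an estimate near the bottom of the grid, whereas support concentrated in a few samples yields a much larger ρ. The per-site estimate would then be compared against the chosen ρ threshold.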
