Rules for determining Class 1 positive variants (by each genome version)

Wendell Jones; Binsheng Gong; Natalia Novoradovskaya; Dan Li; Rebecca Kusko; Todd A. Richmond; Donald J. Johann, Jr; Halil Bisgin; Sayed Mohammad Ebrahim Sahraeian; Pierre R. Bushel; Mehdi Pirooznia; Katherine Wilkins; Marco Chierici; Wenjun Bao; Lee Scott Basehore; Anne Bergstrom Lucas; Daniel Burgess; Daniel J. Butler; Simon Cawley; Chia-Jung Chang; Guangchun Chen; Tao Chen; Yun-Ching Chen; Daniel J. Craig; Angela del Pozo; Jonathan Foox; Margherita Francescatto; Yutao Fu; Cesare Furlanello; Kristina Giorda; Kira P. Grist; Meijian Guan; Yingyi Hao; Scott Happe; Gunjan Hariani; Nathan Haseley; Jeff Jasper; Giuseppe Jurman; David Philip Kreil; Paweł Łabaj; Kevin Lai; Jianying Li; Quan-Zhen Li; Yulong Li; Zhiguang Li; Zhichao Liu; Mario Solís López; Kelci Miclaus; Raymond Miller; Vinay K. Mittal

This method is extracted from research article: Genome Biol, Apr 2021

A verified genomic reference sample for assessing performance of cancer panels detecting small variants of low allele frequency

DOI: 10.1186/s13059-021-02316-z

Members of the SEQC consortium used a diverse set of variant calling pipelines. We asked the bioinformaticians of each organization to develop their preferred pipelines, applying their best expertise, to call variants on their selected WES and WGS datasets. For the WES datasets, nine teams developed twenty-two pipelines, as shown in Additional file 1: Table S3. Each team selected the WES datasets with which it had the most experience. Each of the WES1–3 datasets was analyzed by seven to fourteen different pipelines on either reference genome version.

The freedom to choose datasets, reference genome versions, mappers, and callers, as well as parameters and filters, created diversity: results from different variant calling pipelines on the same input data differed marginally to significantly. We investigated the similarity of the pipeline-library combinations (PLCs) in terms of variant calling on the individual UHR cell lines. The results did not fall into simple patterns. Many PLCs produced quite similar variant calls, but outlier pipelines were also detected.
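The text does not state which similarity measure was used to compare PLCs. As a minimal sketch only, assuming Jaccard similarity on the sets of variants each PLC calls for one cell line, a pairwise comparison could look like the following. The function names, PLC labels, and variant tuples are hypothetical and are not taken from the study.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two variant sets (1.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def pairwise_similarity(plc_variants):
    """Map each unordered PLC pair to the Jaccard similarity of its call sets.

    plc_variants: {plc_name: set of (chrom, pos, ref, alt)} for one cell line.
    """
    return {
        (p, q): jaccard(plc_variants[p], plc_variants[q])
        for p, q in combinations(sorted(plc_variants), 2)
    }


# Hypothetical toy calls for a single cell line (not real data).
toy_calls = {
    "plc_a": {("chr1", 100, "A", "G"), ("chr1", 200, "C", "T"), ("chr2", 50, "G", "A")},
    "plc_b": {("chr1", 200, "C", "T"), ("chr2", 50, "G", "A"), ("chr3", 10, "T", "C")},
    "plc_c": {("chr9", 999, "G", "T")},
}

similarities = pairwise_similarity(toy_calls)
for pair, score in similarities.items():
    print(pair, round(score, 2))
```

In this toy input, a PLC whose similarity to every other PLC is low (here `plc_c`) would be a candidate outlier; any cutoff for calling a pipeline an outlier would have to be chosen separately.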

To achieve consensus, we defined a Class 1 positive variant as one called by at least half of the PLCs with a variant allele frequency (VAF) of at least 10% on the same cell line, for each of WES1–3. The variant lists for the individual cell lines were then pooled by kit to generate a non-redundant list of variants for the pooled Sample A for each kit. We then took the intersection of the non-redundant variant lists from WES1, WES2, and WES3 to compose the Class 1 list of variants, which are defined as known positives in this study. We also considered the region within which to define the Class 1 variants: only variants called within the CTR were termed Class 1 known positives. We performed this procedure on hg19 and then used liftover to map the variants to hg38 genome positions.
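The selection rule above can be sketched in code as follows. This is an illustrative sketch under stated assumptions, not the consortium's implementation: the nested-dictionary input layout, the function names, the 1-based inclusive region intervals, and the reading of "at least half of the PLCs" as half of the PLCs that analyzed a given cell line in a given dataset are all assumptions. The liftover step to hg38 is not shown.

```python
from collections import Counter

MIN_VAF = 0.10  # per-cell-line VAF threshold stated in the text


def consensus_for_cell_line(plc_calls, min_vaf=MIN_VAF):
    """Variants called at VAF >= min_vaf by at least half of the PLCs.

    plc_calls: {plc_name: {variant: vaf}} for one cell line in one dataset,
    where variant is a (chrom, pos, ref, alt) tuple.
    """
    votes = Counter()
    for calls in plc_calls.values():
        for variant, vaf in calls.items():
            if vaf >= min_vaf:
                votes[variant] += 1
    n_plcs = len(plc_calls)
    # "At least half" written with integers to avoid float rounding.
    return {v for v, n in votes.items() if 2 * n >= n_plcs}


def pooled_for_dataset(cell_line_calls):
    """Non-redundant union of the per-cell-line consensus variants."""
    pooled = set()
    for plc_calls in cell_line_calls.values():
        pooled |= consensus_for_cell_line(plc_calls)
    return pooled


def in_regions(variant, regions):
    """True if the variant position lies in any (chrom, start, end) interval."""
    chrom, pos = variant[0], variant[1]
    return any(c == chrom and start <= pos <= end for c, start, end in regions)


def class1_variants(datasets, regions):
    """Intersect pooled lists across datasets, then keep variants in the regions."""
    per_dataset = [pooled_for_dataset(d) for d in datasets.values()]
    shared = set.intersection(*per_dataset) if per_dataset else set()
    return {v for v in shared if in_regions(v, regions)}


# Hypothetical toy input (not real data): two cell lines, two PLCs per dataset.
v1 = ("chr1", 100, "A", "G")
v2 = ("chr1", 200, "C", "T")
v3 = ("chr2", 50, "G", "A")
toy = {
    "WES1": {"cellA": {"plc1": {v1: 0.50, v2: 0.08}, "plc2": {v1: 0.45}},
             "cellB": {"plc1": {v3: 0.30}, "plc2": {v3: 0.25}}},
    "WES2": {"cellA": {"plc1": {v1: 0.40}, "plc2": {v1: 0.50, v2: 0.20}},
             "cellB": {"plc1": {v3: 0.30}, "plc2": {}}},
    "WES3": {"cellA": {"plc1": {v1: 0.50}, "plc2": {}},
             "cellB": {"plc1": {v3: 0.20}, "plc2": {v3: 0.20}}},
}
toy_regions = [("chr1", 1, 1000)]  # stand-in for the CTR

class1 = class1_variants(toy, toy_regions)
print(sorted(class1))
```

In the toy input, `v2` is dropped because it passes the vote in only one dataset, and `v3` is dropped because it lies outside the stand-in region, so only `v1` remains.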

The Class 1 positives are not a complete list of variants for pooled Sample A. However, we consider the Class 1 variants to be known positives because (i) the sequencing depth was large, (ii) only variants with VAF ≥ 10% by cell line were considered, (iii) the variants were selected by voting among multiple PLCs with diverse callers and agreed across the three WES datasets, and (iv) a random sample of 114 of these variants (including 33 at VAF < 5%) was 100% orthogonally verified by ddPCR.

Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
