Exome sequencing, processing, and analysis

Shabnam Shalapour; Xue-Jia Lin; Ingmar N. Bastian; John Brain; Alastair D. Burt; Alexander A. Aksenov; Alison F. Vrbanac; Weihua Li; Andres Perkins; Takaji Matsutani; Zhenyu Zhong; Debanjan Dhar; Jose A. Navas-Molina; Jun Xu; Rohit Loomba; Michael Downes; Ruth T. Yu; Ronald M. Evans; Pieter C. Dorrestein; Rob Knight; Christopher Benner; Quentin M. Anstee; Michael Karin

This method is extracted from research article: Nature, Nov 2017

Inflammation-induced IgA+ cells dismantle anti-liver cancer immunity

DOI: 10.1038/nature24302

DNA was isolated as described above from 9 spleens and 16 HCC nodules as indicated. Exome sequencing was performed at the University of California San Diego, IGM Genomic Centre using standard protocols. The exome sequencing data have been deposited in the NCBI Sequence Read Archive under accession numbers SRA556071 and SRP104724.

Exome sequencing data were processed into genetic variants using the Genome Analysis Toolkit (GATK). First, exome capture reads were aligned to the mouse genome (GRCm38/mm10) with BWA (version 0.7.15)⁵⁸. GATK (version 3.5) was then used to process all exome sequencing data following the recommended best practices workflow⁵⁹,⁶⁰. Base recalibration was performed using the Mouse Genomes Project SNP database as the reference for known SNPs⁶¹. Variants were called with the GATK HaplotypeCaller jointly across the entire data set to maximize sensitivity.

Somatic mutations were called by comparing HCC samples with a pool of splenic controls. Only sites with at least four variant allele reads and total coverage of at least ten reads in both target and control samples were considered. In addition, at least 10% of target sample reads had to contain the variant allele, and no more than 2% of control sample reads could contain it. Because not all samples had a matching control, each sample was compared with all background spleen samples, and any variant that might have been present in any splenic control sample was removed. Variants were annotated using snpEff (v4.3)⁶². Coding mutations were defined as variants that changed an amino acid or altered a splice site. Variants were visualized using IGV⁶³.
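The somatic filtering thresholds described above can be sketched in Python as follows. This is only an illustration of the stated criteria, not the authors' code: the `AlleleCounts` type and `is_somatic` function are hypothetical names, and the sketch assumes that the four-read minimum applies to the HCC (target) sample and that the coverage and 2% limits are checked against every splenic control individually.

```python
from dataclasses import dataclass
from typing import Sequence

# Thresholds taken from the text above.
MIN_ALT_READS = 4        # minimum variant allele reads (assumed: in the target sample)
MIN_DEPTH = 10           # minimum total coverage in target and control samples
MIN_TARGET_VAF = 0.10    # at least 10% of target reads carry the variant allele
MAX_CONTROL_VAF = 0.02   # no more than 2% of control reads carry the variant allele


@dataclass(frozen=True)
class AlleleCounts:
    """Read counts at one site in one sample (hypothetical container)."""
    alt_reads: int
    total_reads: int

    @property
    def vaf(self) -> float:
        """Variant allele fraction; 0.0 when the site has no coverage."""
        return self.alt_reads / self.total_reads if self.total_reads else 0.0


def is_somatic(target: AlleleCounts, controls: Sequence[AlleleCounts]) -> bool:
    """Return True if a site passes the somatic filters against all controls."""
    if not controls:
        raise ValueError("at least one splenic control sample is required")
    # Target sample: enough variant reads, enough coverage, high enough fraction.
    if target.alt_reads < MIN_ALT_READS or target.total_reads < MIN_DEPTH:
        return False
    if target.vaf < MIN_TARGET_VAF:
        return False
    # Assumption: every control must be covered and essentially variant-free,
    # so a variant seen in any spleen sample is discarded.
    for control in controls:
        if control.total_reads < MIN_DEPTH or control.vaf > MAX_CONTROL_VAF:
            return False
    return True
```

For example, `is_somatic(AlleleCounts(5, 20), [AlleleCounts(0, 30), AlleleCounts(0, 15)])` passes the filters, whereas the same site is rejected if any one control carries the variant in more than 2% of its reads.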