Exome sequencing, processing, and analysis

Shabnam Shalapour, Xue-Jia Lin, Ingmar N. Bastian, John Brain, Alastair D. Burt, Alexander A. Aksenov, Alison F. Vrbanac, Weihua Li, Andres Perkins, Takaji Matsutani, Zhenyu Zhong, Debanjan Dhar, Jose A. Navas-Molina, Jun Xu, Rohit Loomba, Michael Downes, Ruth T. Yu, Ronald M. Evans, Pieter C. Dorrestein, Rob Knight, Christopher Benner, Quentin M. Anstee, Michael Karin

DNA was isolated as described above from 9 spleens and 16 HCC nodules as indicated. Exome sequencing was performed at the University of California San Diego, IGM Genomic Centre using standard protocols. The exome sequencing data have been deposited in the NCBI Sequence Read Archive under accession numbers SRA556071 and SRP104724.

Exome sequencing data were processed into genetic variants using the Genome Analysis Toolkit (GATK). First, exome capture reads were aligned to the mouse genome (GRCm38/mm10) with BWA (version 0.7.15) [58]. GATK (version 3.5) was then used to process all exome sequencing data following the recommended best-practices workflow [59,60]. Base recalibration was performed using the Mouse Genomes Project SNP database as the reference for known SNPs [61]. Variants were called with the GATK HaplotypeCaller, considering the entire data set together to maximize sensitivity. Somatic mutations were called by comparing HCC samples with a pool of splenic controls. Only sites with at least four variant-allele reads and a total coverage of at least ten reads in both target and control samples were considered. In addition, at least 10% of target sample reads had to contain the variant allele, and no more than 2% of control sample reads could contain it. Because not all samples had a matched control, each sample was compared with all background spleen samples to remove any variant that might have been present in any of the splenic control samples. Variants were annotated using snpEff (v4.3) [62]. Coding mutations were defined as variants that directly altered amino acids or splice sites. Variant visualization was performed using IGV [63].
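
The somatic filtering thresholds described above can be illustrated with a minimal Python sketch. It is not the authors' code: the names `AlleleCounts` and `passes_somatic_filter` are hypothetical, and the sketch assumes that the four-read minimum applies to the variant allele in the HCC (target) sample, that the ten-read coverage minimum applies to the target and to every spleen control, and that a site is rejected if any single control exceeds the 2% limit.

```python
from dataclasses import dataclass
from typing import Sequence

# Thresholds taken from the text above.
MIN_VARIANT_READS = 4      # variant-allele reads required in the target sample
MIN_TOTAL_COVERAGE = 10    # total reads required at the site
MIN_TARGET_VAF = 0.10      # at least 10% of target reads carry the variant
MAX_CONTROL_VAF = 0.02     # at most 2% of control reads carry the variant


@dataclass(frozen=True)
class AlleleCounts:
    """Read counts for one sample at one site (hypothetical container)."""
    variant_reads: int
    total_reads: int

    @property
    def vaf(self) -> float:
        """Variant allele fraction; 0.0 when the site has no coverage."""
        if self.total_reads == 0:
            return 0.0
        return self.variant_reads / self.total_reads


def passes_somatic_filter(target: AlleleCounts,
                          controls: Sequence[AlleleCounts]) -> bool:
    """Return True if the site looks somatic under the assumed reading.

    Assumption: every spleen control must be covered and must stay at or
    below the control limit, because each tumour is compared with all
    background spleen samples rather than with one matched control.
    """
    if not controls:
        return False  # nothing to compare against
    target_ok = (
        target.variant_reads >= MIN_VARIANT_READS
        and target.total_reads >= MIN_TOTAL_COVERAGE
        and target.vaf >= MIN_TARGET_VAF
    )
    controls_ok = all(
        c.total_reads >= MIN_TOTAL_COVERAGE and c.vaf <= MAX_CONTROL_VAF
        for c in controls
    )
    return target_ok and controls_ok


if __name__ == "__main__":
    tumour = AlleleCounts(variant_reads=12, total_reads=60)
    spleens = [AlleleCounts(0, 40), AlleleCounts(0, 35)]
    print(passes_somatic_filter(tumour, spleens))
```

In practice the read counts would come from the per-sample allele depths in the HaplotypeCaller output; how those are extracted is outside the scope of this sketch.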
