2.4. Variant prediction calling, SNP filtering and principal component analysis (PCA)

D.G. Teixeira; G.R.G. Monteiro; D.R.A. Martins; M.Z. Fernandes; V. Macedo-Silva; M. Ansaldi; P.R.P. Nascimento; M.A. Kurtz; J.A. Streit; M.F.F.M. Ximenes; R.D. Pearson; A. Miles; J.M. Blackwell; M.E. Wilson; A. Kitchen; J.E. Donelson; J.P.M.S. Lima; S.M.B. Jeronimo

This method is extracted from research article: Int J Parasitol, Jun 2017

Comparative analyses of whole genome sequences of Leishmania infantum isolates from humans and dogs in northeastern Brazil

DOI: 10.1016/j.ijpara.2017.04.004

The GATK v. 3.3 suite of tools (DePristo et al., 2011) was used to realign reads in regions containing insertions/deletions (indels) and to call variants with HaplotypeCaller under the assumption of a diploid organism. Because no training dataset was available to parameterize SNP filtering, we used GATK hard filters to exclude false positives. The filters were applied as described in the GATK Best Practices, together with an RMSMappingQuality of ≥30. After SNP filtering, the variant data were gathered into a single file using the VCFtools package (Danecek et al., 2011). The R package SNPRelate (Zheng et al., 2012) was used to remove SNPs in linkage disequilibrium, with a sliding window of 5000 nucleotides and a threshold of 2, and the resulting dataset was used for a PCA, also with SNPRelate. The same dataset was used to obtain supporting data on population structure with the program Admixture (Alexander et al., 2009). The SnpEff package (Cingolani et al., 2012) was used for SNP variant annotation, and genome annotation files were retrieved from GeneDB (Logan-Klumpler et al., 2012).
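The two filtering ideas in this paragraph (a mapping-quality cutoff and window-based removal of SNPs in linkage disequilibrium) can be illustrated with a minimal Python sketch. It is not the authors' pipeline, and it does not reproduce the internals of GATK or SNPRelate. The function names, the greedy pruning strategy, the use of squared Pearson correlation on genotype dosages as the LD measure, and the 0.2 cutoff are all assumptions made for illustration; only the ≥30 mapping-quality cutoff and the 5000-nucleotide window come from the text. The sketch also assumes that the RMS mapping quality is stored under the `MQ` key of the VCF INFO column, as GATK normally writes it.

```python
def passes_mq_filter(vcf_line, min_mq=30.0):
    """Return True for a biallelic SNP record whose INFO MQ value is >= min_mq.

    Hypothetical helper: expects one tab-separated VCF data line.
    """
    fields = vcf_line.rstrip("\n").split("\t")
    ref, alt, info = fields[3], fields[4], fields[7]
    # Keep only single-base REF/ALT pairs (drops indels and multiallelic sites).
    if ref not in "ACGT" or alt not in "ACGT" or len(ref) != 1 or len(alt) != 1:
        return False
    tags = dict(item.split("=", 1) for item in info.split(";") if "=" in item)
    # A missing MQ tag becomes NaN, and NaN >= min_mq is False.
    return float(tags.get("MQ", "nan")) >= min_mq


def r_squared(x, y):
    """Squared Pearson correlation between two genotype dosage vectors (0/1/2)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:  # monomorphic site: correlation undefined
        return 0.0
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return (sxy * sxy) / (sxx * syy)


def ld_prune(snps, window_bp=5000, max_r2=0.2):
    """Greedy LD pruning (illustrative, not SNPRelate's algorithm).

    snps: list of (chrom, pos, dosages) tuples sorted by chrom and pos.
    A SNP is dropped when an already-kept SNP on the same chromosome lies
    within window_bp and has r^2 above max_r2 (max_r2 is an assumed value).
    """
    kept = []
    for chrom, pos, dosages in snps:
        in_ld = any(
            c == chrom
            and abs(pos - p) <= window_bp
            and r_squared(d, dosages) > max_r2
            for c, p, d in kept
        )
        if not in_ld:
            kept.append((chrom, pos, dosages))
    return kept
```

In practice the record filter would be applied while streaming a VCF file, and the surviving SNPs, converted to per-sample dosages, would be passed to the pruning step before PCA. With SNPRelate itself, the corresponding step is `snpgdsLDpruning`, whose `slide.max.bp` and `ld.threshold` arguments play the roles of `window_bp` and `max_r2` here.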
