Joint genotype calling was performed with GATK HaplotypeCaller at all sites across the reference genome, including invariant positions. Genotypes and sites were then filtered for quality and depth, leaving only high-quality biallelic SNPs and monomorphic sites. At the genotype level, only genotypes supported by at least six reads and with a minimum Phred score of 20 were retained. A per-individual excess-depth filter, set at the 99th percentile of depth for each sample to control for differences in coverage between samples, was also applied. At the site level, variants failing the recommended GATK hard filters (QD < 2.0 || FS > 60.0 || MQ < 40.0 || MQRankSum < −12.5 || ReadPosRankSum < −8.0 || SOR > 3.0) were excluded, as were sites with excess depth (above the 99th percentile of total depth across all samples), low Phred score (QUAL < 30), more than 20% missing data, or excess heterozygosity (>50% of individuals heterozygous), and sites located within repeat regions or CpG islands [coordinates from (19)]. The pipeline used for genotype calling and filtering, with example commands, is available at https://doi.org/10.5281/zenodo.2666099.
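The filtering logic described above can be sketched in plain Python as follows. This is an illustrative sketch only, not the authors' pipeline (which is available at the Zenodo link above). The names `Genotype`, `genotype_ok`, `fails_hard_filters`, and `site_passes` are hypothetical. Several choices here are assumptions: the per-genotype Phred score is treated as genotype quality (GQ), total site depth is taken as the sum of sample depths, the 99th-percentile depth caps are assumed to be precomputed and passed in, a missing annotation is treated as not failing a hard filter, the heterozygosity fraction is computed over retained genotypes, and the hard filters are applied only to variant sites.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple

# Recommended GATK hard-filter thresholds quoted in the text.
HARD_FILTERS = [
    ("QD", "<", 2.0), ("FS", ">", 60.0), ("MQ", "<", 40.0),
    ("MQRankSum", "<", -12.5), ("ReadPosRankSum", "<", -8.0), ("SOR", ">", 3.0),
]


@dataclass
class Genotype:
    alleles: Optional[Tuple[int, ...]]  # e.g. (0, 1); None if uncalled
    depth: int                          # reads supporting this genotype
    gq: int                             # Phred-scaled quality (assumed to be GQ)


def genotype_ok(gt: Genotype, depth_cap: float,
                min_depth: int = 6, min_gq: int = 20) -> bool:
    """Genotype-level filter: called, >= 6 reads, quality >= 20, depth <= sample cap."""
    return (gt.alleles is not None
            and min_depth <= gt.depth <= depth_cap
            and gt.gq >= min_gq)


def fails_hard_filters(info: dict) -> bool:
    """True if any available annotation crosses its hard-filter threshold."""
    for key, op, cutoff in HARD_FILTERS:
        value = info.get(key)
        if value is None:
            continue  # assumption: a missing annotation does not fail the site
        if (op == "<" and value < cutoff) or (op == ">" and value > cutoff):
            return True
    return False


def site_passes(qual: float, info: dict, genotypes: Sequence[Genotype],
                sample_depth_caps: Sequence[float], total_depth_cap: float,
                in_masked_region: bool, is_variant: bool) -> bool:
    """Site-level filter combining the criteria listed in the text."""
    if not genotypes or in_masked_region or qual < 30:
        return False
    if is_variant and fails_hard_filters(info):
        return False
    # Assumption: total depth is the sum of per-sample depths.
    if sum(g.depth for g in genotypes) > total_depth_cap:
        return False
    kept = [g for g, cap in zip(genotypes, sample_depth_caps)
            if genotype_ok(g, cap)]
    missing = len(genotypes) - len(kept)
    if missing / len(genotypes) > 0.20:  # more than 20% missing data
        return False
    hets = sum(1 for g in kept if len(set(g.alleles)) > 1)
    if kept and hets / len(kept) > 0.50:  # excess heterozygosity
        return False
    return True
```

In practice, `sample_depth_caps` and `total_depth_cap` would be derived from the observed depth distributions before this function is called, and `in_masked_region` would come from an interval lookup against the repeat and CpG island coordinates.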



