We called variants using the GATK4 HaplotypeCaller for each of the 51 samples and then performed joint genotype calling with the GATK4 GenotypeGVCFs tool, with all samples supplied together as input. We used the GATK4 VariantFiltration tool to remove variants whose deviation from Hardy–Weinberg equilibrium was more extreme than a P value of 3.4e−6, which corresponds to a phred-scaled value of 54.69.
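The correspondence between the P value and the phred-scaled threshold can be checked with the standard phred transform, Q = −10 log10(P). The following is a minimal sketch of that arithmetic only; the function names are illustrative and are not part of GATK.

```python
import math


def phred_from_p(p: float) -> float:
    """Convert a P value to a phred-scaled value: Q = -10 * log10(p)."""
    return -10.0 * math.log10(p)


def p_from_phred(q: float) -> float:
    """Inverse transform: p = 10 ** (-Q / 10)."""
    return 10.0 ** (-q / 10.0)


# P value quoted in the text
threshold_q = phred_from_p(3.4e-6)
print(round(threshold_q, 2))  # 54.69
```

Going the other way, `p_from_phred(54.69)` recovers a P value of approximately 3.4e−6, so the two numbers quoted in the text are consistent with each other.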
We followed the GATK guidelines for hard filtering (https://software.broadinstitute.org/gatk/documentation/article?id=23216#2, https://software.broadinstitute.org/gatk/documentation/article?id=11069; last accessed April 5, 2021) to retain only high-quality, biallelic SNPs. First, we used the GATK SelectVariants tool to extract the SNPs from the raw VCF file. Then we filtered the SNPs using the GATK VariantFiltration tool with options “-filterExpression ‘QD < 2.0 || FS > 60.0 || MQ < 40.0 || MQRankSum < -12.5 || ReadPosRankSum < -8.0 || SOR > 3.0’.” Then we removed any variants that fell within repetitive or low-complexity regions using BEDTools version 2.25.0 (Quinlan and Hall 2010). To retain only biallelic sites, and to remove variants on the mitochondrial genome, we used the GATK SelectVariants tool with the “-restrict-alleles-to BIALLELIC -XL Sequoia_complete_mtGenome -exclude-filtered” options. We calculated the mean and standard deviation (SD) of the total unfiltered read depth across all samples per site, and removed all variants whose depth exceeded the mean plus five times the SD, as suggested by the GATK documentation. In addition to these basic filters, we filtered out individual variants for which the minimum quality of the assigned genotype (GQ) was smaller than 40. We also removed sites with missing data for all analyses below except the diversity analysis. For analyses of demography and genetic diversity, we removed four samples (ZRHG101, ZRHG123, ZRHG124, and ZRHG127) from the first-degree relative pairs as described in the Supplementary Materials, Supplementary Material online.
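As an illustration of the depth filter described above, the cutoff (mean plus five times the SD) could be computed as in the sketch below. It assumes the per-site total depths have already been extracted from the VCF into a list of numbers, and it uses the sample SD; the text does not state which SD estimator was used. The function names and toy depth values are hypothetical.

```python
from statistics import mean, stdev


def depth_cutoff(site_depths, n_sd=5.0):
    """Upper depth limit: mean + n_sd * SD of the per-site total depth.

    Assumption: sample SD (statistics.stdev); the source does not specify.
    """
    return mean(site_depths) + n_sd * stdev(site_depths)


def filter_by_depth(site_depths, n_sd=5.0):
    """Return the depths of sites that do not exceed the cutoff."""
    cutoff = depth_cutoff(site_depths, n_sd)
    return [dp for dp in site_depths if dp <= cutoff]


# Toy data (not from the study): 99 sites at depth 100 and one extreme site.
toy_depths = [100] * 99 + [5000]
cutoff = depth_cutoff(toy_depths)
kept = filter_by_depth(toy_depths)
print(len(toy_depths), len(kept))
```

Because the cutoff is computed from the same distribution it filters, a single extreme site inflates both the mean and the SD. With very few sites no value can exceed mean + 5 SD, so the filter is only meaningful on genome-scale site counts.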