DNA from MACS-purified parasites of the 10 LGS pairs, collected before and after CQ treatment, was whole-genome amplified as described above. Sample-indexed shotgun Illumina TruSeq libraries were generated from the WGA DNA following the conditions recommended for the kit (Illumina, San Diego, CA, USA). Of the 10 LGS pairs, six (from Aotus 85922, 85986, 85850, WR454, 85823, and Saimiri 4919) were fully analyzed. Results from four LGS sample pairs could not be included because insufficient DNA was recovered from the PCR-positive-only recrudescences in the monkeys (Aotus 86121, 85891, Saimiri 5081, and 5076; Supplementary Fig. 4). Sheared fragments from chromosomes 1 and 5 were captured using custom SureSelect RNA baits with 3× tiling (Agilent Technologies, Santa Clara, CA, USA) and were sequenced on an Illumina HiSeq2000.
Reads were aligned to the P. vivax Salvador I reference genome assembly (version 3) using bwa-mem with default settings. Aligned reads were deduplicated with Picard Tools MarkDuplicates, and mate information was then fixed with Picard Tools FixMateInformation. Finally, the aligned reads underwent local realignment of entropic regions using the Genome Analysis Toolkit (GATK) IndelRealigner [57–60]. Read coverage and statistics were calculated with SAMtools flagstat [61] and custom R scripts.
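The order of the alignment steps above can be sketched as a list of command lines. This is only an illustrative sketch, not the invocations used in the study: the file names, the `picard.jar` and `GenomeAnalysisTK.jar` paths, and the function name `build_alignment_commands` are hypothetical. The coordinate sort, the BAM indexing, and the RealignerTargetCreator step are not described in the text; they are included here on the assumption that MarkDuplicates expects sorted input and that IndelRealigner (GATK 3.x) needs a target-interval file. Read-group assignment and reference indexing are omitted.

```python
from typing import List


def build_alignment_commands(sample: str, ref: str, fq1: str, fq2: str) -> List[List[str]]:
    """Return the alignment steps as argv lists (nothing is executed here)."""
    # Hypothetical intermediate file names, derived from the sample name.
    sam = f"{sample}.sam"
    sorted_bam = f"{sample}.sorted.bam"
    dedup_bam = f"{sample}.dedup.bam"
    fixmate_bam = f"{sample}.fixmate.bam"
    intervals = f"{sample}.intervals"
    realigned_bam = f"{sample}.realigned.bam"
    gatk = ["java", "-jar", "GenomeAnalysisTK.jar"]  # assumed GATK 3.x jar
    picard = ["java", "-jar", "picard.jar"]          # assumed Picard jar

    return [
        # Align with bwa-mem default settings.
        ["bwa", "mem", "-o", sam, ref, fq1, fq2],
        # Assumption: coordinate-sort before duplicate marking.
        ["samtools", "sort", "-o", sorted_bam, sam],
        # Mark duplicates.
        picard + ["MarkDuplicates", f"I={sorted_bam}", f"O={dedup_bam}",
                  f"M={sample}.dup_metrics.txt"],
        # Fix mate information.
        picard + ["FixMateInformation", f"I={dedup_bam}", f"O={fixmate_bam}"],
        # Assumption: index the BAM so GATK can read it.
        ["samtools", "index", fixmate_bam],
        # Assumption: create realignment targets, then realign locally.
        gatk + ["-T", "RealignerTargetCreator", "-R", ref, "-I", fixmate_bam,
                "-o", intervals],
        gatk + ["-T", "IndelRealigner", "-R", ref, "-I", fixmate_bam,
                "-targetIntervals", intervals, "-o", realigned_bam],
        # Summarize alignment statistics.
        ["samtools", "flagstat", realigned_bam],
    ]


if __name__ == "__main__":
    for argv in build_alignment_commands("sample1", "ref.fa", "r1.fq", "r2.fq"):
        print(" ".join(argv))
```

Building the commands as argument lists rather than running them keeps the sketch inspectable; each list could be passed to `subprocess.run` once the tool paths and reference files are in place.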
Variant discovery was performed using samtools mpileup with flags for a maximum depth of 10,000, a minimum base quality of 20, and a minimum mapping quality of 10. The mpileup output was then passed to the bcftools multiallelic caller. Loci were excluded using bcftools filter if the INFO-field mapping quality was <55 or if a SNP occurred within five base pairs of an indel. Loci were also excluded if they were not within the core region defined by Pearson, Amato et al. [9] or if they were annotated in the P. vivax Salvador I GFF (version 3) as members of the msp, Pvfam-c, VIR, SERA, or STP family [21]. In addition, variants were subset to biallelic SNPs. These were filtered further and excluded if the within-sample allele depth was <10 or if the Phred-scaled strand bias p-value was >10. After these filters were applied, variants were excluded across all samples if more than 2 of the 12 samples had missing information or if the range of within-sample allele frequencies across all samples was <5%, which is suggestive of genotyping error.
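The final per-sample and across-sample filters can be illustrated with a minimal sketch. It is not the authors' code, and it makes two assumptions that the text leaves open: "allele depth" is taken as the sum of reference and alternate read depths, and a sample call that fails the depth or strand-bias threshold is treated as missing rather than removing the site outright. The function and constant names are hypothetical.

```python
from typing import Optional, Sequence

MIN_ALLELE_DEPTH = 10   # within-sample depth below this fails the call
MAX_STRAND_BIAS = 10.0  # Phred-scaled strand-bias p-value above this fails the call
MAX_MISSING = 2         # a site is dropped if more than 2 samples are missing
MIN_AF_RANGE = 0.05     # a site is dropped if the allele-frequency range is <5%


def sample_allele_frequency(ref_depth: int, alt_depth: int,
                            strand_bias: float) -> Optional[float]:
    """Return the within-sample alternate allele frequency, or None if the call fails."""
    depth = ref_depth + alt_depth  # assumption: depth = ref + alt reads
    if depth < MIN_ALLELE_DEPTH or strand_bias > MAX_STRAND_BIAS:
        return None                # assumption: failed calls count as missing
    return alt_depth / depth


def keep_site(freqs: Sequence[Optional[float]]) -> bool:
    """Apply the across-sample missingness and allele-frequency-range filters."""
    observed = [f for f in freqs if f is not None]
    missing = len(freqs) - len(observed)
    if missing > MAX_MISSING or not observed:
        return False
    return max(observed) - min(observed) >= MIN_AF_RANGE


if __name__ == "__main__":
    # Twelve samples: one well-covered sample carries the alternate allele.
    calls = [(30, 0, 1.0)] * 11 + [(10, 20, 2.0)]
    freqs = [sample_allele_frequency(r, a, sb) for r, a, sb in calls]
    print(keep_site(freqs))
```

A site whose allele frequency is the same in every sample has a range of zero and is dropped, which matches the stated rationale: a variant that never changes frequency across the 12 samples is more likely a genotyping error than a real polymorphism.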