After passage through a Plasmodipur filter (EuroProxima) to remove white blood cells, DNA was purified from schizont-stage parasites by phenol/chloroform extraction. DNA quality was highest for the A1-H.1 and A1-C.2 lines, so these were selected for the generation of large-insert libraries. HiSeq 2000 (Illumina)-compatible genomic shotgun libraries (500-bp fragment size) were prepared according to the manufacturer’s protocols (Illumina). Additional PCR-free libraries were prepared and sequenced on a MiSeq instrument (Illumina). Poor-quality bases were trimmed from the raw reads using Trimmomatic (www.usadellab.org/cms/?page=trimmomatic). For genome assemblies, 20-kbp insert template libraries were prepared for single-molecule real-time (SMRT) sequencing chemistry at Macrogen, Inc. Samples were sequenced on a PacBio RS-II instrument (Pacific Biosciences). Genome assembly was performed using PacBio's FALCON tools (https://github.com/PacificBiosciences/FALCON), and consensus scaffolding was achieved with the GenomicConsensus suite (https://github.com/PacificBiosciences/GenomicConsensus). PacBio sequencing generated 361,197 and 355,195 reads from three SMRT cells each for PKNA1-C.2 and PKNA1-H.1 genomic DNA, respectively. With a mean read length just above 8 kb, this corresponds to a theoretical genome coverage of 122.6-fold for PKNA1-C.2 and 124.1-fold for PKNA1-H.1. Genome assembly resulted in 45 and 37 contigs for PKNA1-C.2 and PKNA1-H.1, respectively, and after PCR-free Illumina read correction the total assembly sizes were 24,393,488 bp and 23,988,125 bp, respectively. Consensus scaffolds were error-corrected using the PCR-free Illumina sequence, and annotation was transferred from the reference genome of the P. knowlesi H strain (24) using the Rapid Annotation Transfer Tool, RATT (41).
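The theoretical coverage figures follow from the relation coverage = (number of reads × mean read length) / genome size. The following is a minimal Python sketch of that arithmetic, not the authors' procedure. The function names are hypothetical, and using the final assembly size as the genome size is an assumption, since the text does not state which genome size was used.

```python
def theoretical_coverage(n_reads: int, mean_read_len_bp: float, genome_size_bp: int) -> float:
    """Expected fold coverage = total sequenced bases / genome size."""
    if genome_size_bp <= 0:
        raise ValueError("genome_size_bp must be positive")
    return n_reads * mean_read_len_bp / genome_size_bp


def implied_mean_read_length(n_reads: int, coverage: float, genome_size_bp: int) -> float:
    """Invert the same relation to recover the mean read length in bp."""
    if n_reads <= 0:
        raise ValueError("n_reads must be positive")
    return coverage * genome_size_bp / n_reads


# Read counts, coverage values, and assembly sizes are taken from the text.
# Treating the assembly size as the genome size is an assumption.
runs = [
    ("PKNA1-C.2", 361_197, 122.6, 24_393_488),
    ("PKNA1-H.1", 355_195, 124.1, 23_988_125),
]
for line, n_reads, coverage, size_bp in runs:
    mean_len = implied_mean_read_length(n_reads, coverage, size_bp)
    print(f"{line}: implied mean read length of about {mean_len:.0f} bp")
```

Under this assumption, both lines give an implied mean read length between 8 and 8.5 kb, in line with the "just above 8 kb" stated in the text.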
Gene models were checked in Artemis BamView, and flawed models (those containing in-frame stop codons or lacking start or stop codons) were corrected using strand-specific RNA-seq datasets from schizont stages of the human RBC-adapted parasite PKA1-H.1. RNA-seq analysis of schizont stages of A1-H.1 and ΔNBPXa parasites was performed to identify differential gene expression (three biological replicates for each; P value threshold of 0.05, corrected for false discovery rate). Significance was determined using cuffdiff and CummeRbund (bioconductor.org/packages/release/bioc/html/cummeRbund.html).
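The false discovery rate correction applied to the differential expression P values can be illustrated with the Benjamini-Hochberg procedure. This is a minimal Python sketch, assuming Benjamini-Hochberg is the correction in use (the text only says "corrected for false discovery rate"). The function name and the example P values are hypothetical and are not data from the study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values (q-values) in input order."""
    m = len(p_values)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so q-values stay monotonic.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        q_values[idx] = running_min
    return q_values


# Hypothetical P values for four genes (illustration only).
p = [0.001, 0.02, 0.03, 0.5]
q = benjamini_hochberg(p)
# A gene is called differentially expressed if its q-value is at most 0.05.
significant = [qi <= 0.05 for qi in q]
```

In this toy input, the first three genes pass the 0.05 threshold after correction and the fourth does not.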
The PacBio sequencing, assembly, and annotation information for both genomes is summarized in Table S1. Genomic deletions were determined by cross-mapping the reads against the reciprocal assembly and against the reference H strain (24) using the Burrows–Wheeler Alignment tool (BWA). The GATK pipeline (Broad Institute) was used to perform variant calling with the corrected assemblies as references. GATK-based filtering was applied with the following parameters: quality-to-depth ratio of 5 and Fisher strand value of 60 for SNPs and 200 for indels. Variants having an allele frequency of 80% were retained for further analysis. Additional variant filtering and comparisons were performed using VCFtools (https://github.com/vcftools/vcftools). Natural variation (SNP frequency) in the five invasion-related genes NBPXa (PKNA1_H1_1472300), NBPXb (PKNA1_H1_0700200), DBLα (PKNA1_H1_0623500), DBLβ (PKNA1_H1_1400800), and DBLγ (PKNA1_H1_1356900) was determined by comparing the Illumina raw sequence datasets from 48 P. knowlesi infections (25) with the reference PKA1-H.1 sequences using the GATK pipeline, and the SNPs were visualized using ggplot2 (ggplot2.org). Copy number variation was analyzed computationally with three tools, in addition to manual examination of the Illumina reads mapped onto the genome assembly of the P. knowlesi H strain: Control-FREEC (42) was used with a permissive breakpoint threshold of 0.35 and a coefficient of variation of 0.07; CNVnator (43) was used with a window size of 150; and GROM-RD (44) was run with a P value of 0.001, a duplication coverage threshold of 3, and a sampling rate of 15.
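The variant filtering thresholds above can be expressed as a simple predicate. This is a minimal Python sketch, not the GATK command line used by the authors. It assumes the usual GATK hard-filter directions (variants with a quality-to-depth value below 5 or a Fisher strand value above the threshold are discarded), that the quality-to-depth cutoff applies to both SNPs and indels, and that "an allele frequency of 80%" means at least 80%. None of these directions are stated in the text, and the `Variant` class and `keep` function are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    kind: str   # "snp" or "indel"
    qd: float   # quality-to-depth ratio (GATK QD annotation)
    fs: float   # Fisher strand value (GATK FS annotation)
    af: float   # allele frequency as a fraction between 0 and 1


# Thresholds from the text; the comparison directions are assumptions.
MIN_QD = 5.0
MAX_FS = {"snp": 60.0, "indel": 200.0}
MIN_AF = 0.80


def keep(v: Variant) -> bool:
    """Return True if the variant passes all three assumed filters."""
    if v.kind not in MAX_FS:
        raise ValueError(f"unknown variant kind: {v.kind!r}")
    return v.qd >= MIN_QD and v.fs <= MAX_FS[v.kind] and v.af >= MIN_AF


# Hypothetical variants (illustration only, not data from the study).
variants = [
    Variant("snp", qd=12.0, fs=3.5, af=0.95),     # passes every filter
    Variant("snp", qd=12.0, fs=75.0, af=0.95),    # fails the SNP FS cutoff
    Variant("indel", qd=12.0, fs=75.0, af=0.95),  # within the indel FS cutoff
    Variant("snp", qd=2.0, fs=3.5, af=0.95),      # fails the QD cutoff
    Variant("snp", qd=12.0, fs=3.5, af=0.50),     # fails the allele frequency cutoff
]
retained = [v for v in variants if keep(v)]
```

Keeping the SNP and indel Fisher strand cutoffs in a small lookup table makes it easy to see that the same strand-bias value can pass for an indel but fail for a SNP.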