We performed adapter and quality trimming with BBDuk v. 38.75 (https://sourceforge.net/projects/bbmap/ (accessed on 5 February 2020)), using the adapter reference “ref=resources/adapters.fa” distributed with BBMap/BBTools v. 38.75 and the following settings: ktrim=r, k=23, mink=11, hdist=1, tpe, tbo, qtrim=rl, trimq=15. We then mapped the quality- and adapter-trimmed reads to the CamDro3 assembly ([23]; https://doi.org/10.5061/dryad.qv9s4mwb3) using BBMap v. 38.75 (https://sourceforge.net/projects/bbmap/) with the “usejni=t” setting. We used Picard v. 2.21.7 (http://broadinstitute.github.io/picard) to clean and sort the BAM files, add read groups, and mark duplicates. We called SNPs against CamDro3 [24] with CallVariants v. 38.39 (https://sourceforge.net/projects/bbmap/), keeping only SNPs with quality scores greater than or equal to 27, using the settings “ploidy=2 multisample minscore=27.0 nopassdot=t duplicate=f minreadmapq=30”. We then used BCFTools v. 1.9 (http://samtools.github.io/bcftools/ (accessed on 12 March 2019)) to filter each individual’s raw VCF file: we excluded sites with missing genotypes, kept only SNPs that passed the CallVariants filters, and, where a site was multiallelic, kept the genotype with the highest quality score. We also used BCFTools to merge the per-individual VCF files into a single VCF file. Finally, we used BEDTools v. 2.29.0 [37] to keep only the SNPs that occurred in the target regions, defined as the regions to which the 120-bp baits mapped using blastn v. 2.2.31+ [38].
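For readers who wish to reproduce the trimming, mapping, and variant-calling steps, the settings quoted above can be assembled into command lines roughly as follows. This is a minimal sketch and not the authors' actual script. The helper function names and all file names are hypothetical, paired-end input is assumed (as suggested by the tpe and tbo settings), and the input/output parameters (in1, in2, out1, out2, in, out, vcf) are standard BBTools options that are not stated in the text. The Picard, BCFTools, BEDTools, and blastn steps are not covered here.

```python
import shlex

# Settings quoted in the text; everything else below is an illustrative assumption.
BBDUK_SETTINGS = ["ktrim=r", "k=23", "mink=11", "hdist=1",
                  "tpe", "tbo", "qtrim=rl", "trimq=15"]
CALLVARIANTS_SETTINGS = ["ploidy=2", "multisample", "minscore=27.0",
                         "nopassdot=t", "duplicate=f", "minreadmapq=30"]


def bbduk_cmd(r1, r2, out1, out2, adapters="resources/adapters.fa"):
    """Build a BBDuk adapter/quality trimming command (paired-end assumed)."""
    return ["bbduk.sh", f"in1={r1}", f"in2={r2}",
            f"out1={out1}", f"out2={out2}",
            f"ref={adapters}", *BBDUK_SETTINGS]


def bbmap_cmd(r1, r2, reference, out_sam):
    """Build a BBMap command mapping trimmed reads to the reference assembly."""
    return ["bbmap.sh", f"ref={reference}", f"in={r1}", f"in2={r2}",
            f"out={out_sam}", "usejni=t"]


def callvariants_cmd(bam, reference, out_vcf):
    """Build a CallVariants command for one processed BAM file."""
    return ["callvariants.sh", f"in={bam}", f"ref={reference}",
            f"vcf={out_vcf}", *CALLVARIANTS_SETTINGS]


if __name__ == "__main__":
    # Hypothetical file names for a single individual.
    steps = [
        bbduk_cmd("ind1_R1.fq.gz", "ind1_R2.fq.gz",
                  "ind1_trim_R1.fq.gz", "ind1_trim_R2.fq.gz"),
        bbmap_cmd("ind1_trim_R1.fq.gz", "ind1_trim_R2.fq.gz",
                  "CamDro3.fa", "ind1.sam"),
        callvariants_cmd("ind1.dedup.bam", "CamDro3.fa", "ind1.raw.vcf"),
    ]
    for cmd in steps:
        # Print rather than execute; pass each list to subprocess.run to run it.
        print(shlex.join(cmd))
```

Building each command as an argument list keeps the published settings in one place and allows the commands to be inspected before they are run, for example with subprocess.run(cmd, check=True).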