GBS library preparation and sequencing were performed at the Institute for Genomic Diversity (Cornell University, Ithaca, NY, United States) as described by Elshire et al. (2011). Genome complexity was reduced by digesting each genomic DNA sample with EcoT22I, a methylation-sensitive restriction enzyme. The resulting fragments from each sample were ligated directly to a pair of enzyme-specific adapters and then pooled. The pools were PCR-amplified to generate the GBS libraries, which were sequenced on the Illumina HiSeq 2500 platform (Illumina Inc., United States). Raw data processing and SNP calling were performed using TASSEL 5.0 (Glaubitz et al., 2014). First, the FASTQ files were demultiplexed according to the barcode assigned to each sample. The reads from each sample were then trimmed, and tags were identified using the following parameters: a k-mer length of 64 bp, a minimum quality score of 20 within the barcode and read, and a minimum k-mer length of 20 bp. All sequence tags from each sample were aligned to the rubber tree reference genome (Tang et al., 2016) with Bowtie 2 (Langmead and Salzberg, 2012) using the very-sensitive option.
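The demultiplexing and tag-identification step can be illustrated with a minimal Python sketch. It is not the TASSEL implementation: it only assigns a read to a sample by its leading barcode, trims the remainder to at most 64 bp, and applies the two thresholds of 20 stated above. The function names, the barcode dictionary, and the Phred+33 quality encoding are assumptions made for illustration, and checks such as matching the restriction-site remnant after the barcode are omitted.

```python
from collections import defaultdict

KMER_LENGTH = 64      # maximum tag length kept after the barcode
MIN_KMER_LENGTH = 20  # shortest tag accepted
MIN_QUALITY = 20      # minimum Phred score across barcode + tag
PHRED_OFFSET = 33     # assumption: Phred+33 quality encoding


def extract_tag(seq, qual, barcodes):
    """Return (sample, tag) for one read, or None if it fails a filter.

    `barcodes` maps barcode sequence -> sample name (hypothetical layout).
    """
    # Try longer barcodes first so a short barcode never shadows a longer one.
    for barcode in sorted(barcodes, key=len, reverse=True):
        if not seq.startswith(barcode):
            continue
        end = min(len(seq), len(barcode) + KMER_LENGTH)
        tag = seq[len(barcode):end]
        if len(tag) < MIN_KMER_LENGTH:
            return None
        # Quality is checked over the barcode and the retained tag only.
        scores = [ord(c) - PHRED_OFFSET for c in qual[:end]]
        if min(scores) < MIN_QUALITY:
            return None
        return barcodes[barcode], tag
    return None  # no known barcode at the start of the read


def demultiplex(reads, barcodes):
    """Count tags per sample from an iterable of (sequence, quality) pairs."""
    tags = defaultdict(lambda: defaultdict(int))
    for seq, qual in reads:
        hit = extract_tag(seq, qual, barcodes)
        if hit is not None:
            sample, tag = hit
            tags[sample][tag] += 1
    return tags
```

Calling `demultiplex` on the parsed FASTQ records of one lane gives, for each sample, a table of distinct tags and their read counts, which is the kind of input the alignment step works from.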
For analysis, the data were divided into two sets: the mapping population and the germplasm. SNPs were called using the TASSEL 5 GBSv2 pipeline (Glaubitz et al., 2014) and filtered using VCFtools (Danecek et al., 2011) with the following criteria: (1) no more than 20% missing data, (2) minor allele frequency (MAF) of at least 5% (MAF ≥ 0.05), and (3) biallelic SNPs only.
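The three filtering criteria can be written out as a short Python sketch that tests a single VCF data line. This is an illustration of the logic rather than the VCFtools implementation; the function name is hypothetical, and the sketch assumes that each record has a `GT` entry in its FORMAT column and that the 20% threshold is a per-site upper limit on missing genotypes.

```python
MAX_MISSING = 0.20  # assumed: highest tolerated fraction of missing genotypes
MIN_MAF = 0.05      # minimum minor allele frequency


def passes_filters(vcf_line):
    """Return True if one tab-separated VCF data line meets all three criteria."""
    fields = vcf_line.rstrip("\n").split("\t")
    ref, alt = fields[3], fields[4]

    # (3) Biallelic SNPs only: one ALT allele, REF and ALT both single bases.
    if "," in alt or alt == "." or len(ref) != 1 or len(alt) != 1:
        return False

    gt_index = fields[8].split(":").index("GT")
    samples = fields[9:]
    if not samples:
        return False

    missing = 0
    alt_alleles = 0
    called_alleles = 0
    for sample in samples:
        genotype = sample.split(":")[gt_index].replace("|", "/")
        alleles = genotype.split("/")
        if "." in alleles:
            missing += 1
            continue
        alt_alleles += sum(allele == "1" for allele in alleles)
        called_alleles += len(alleles)

    # (1) Missing data: at most 20% of samples without a genotype call.
    if missing / len(samples) > MAX_MISSING or called_alleles == 0:
        return False

    # (2) MAF computed from called alleles only.
    alt_freq = alt_alleles / called_alleles
    return min(alt_freq, 1.0 - alt_freq) >= MIN_MAF
```

The source does not give the exact VCFtools command. Note that the VCFtools `--max-missing` option is expressed as the required proportion of non-missing data, so a 20% missing-data limit would correspond to a value of 0.8 under the assumption made here.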