Data Analysis

Davoud Torkamaneh
Brian Boyle
Jérôme St-Cyr
Gaétan Légaré
Sonia Pomerleau
François Belzile

Single-end sequence reads were processed using the Fast-GBS pipeline (Torkamaneh et al., 2017). In brief, FASTQ files were demultiplexed based on barcode sequences. Demultiplexed reads were trimmed and then mapped against the soybean reference genome [Williams82 (Gmax_275_Wm82.a2.v1)] (Schmutz et al., 2010). Nucleotide variants were identified from mapped reads. Variants were removed if (i) they had two or more alternate alleles, (ii) the overall base quality (QUAL) score was <10, (iii) the mapping quality (MQ) score was <30, or (iv) read depth was <2. Finally, loci with >80% missing data were excluded. The quality of sequencing reads, bases, and genotypes was assessed using BCFtools (Li, 2011), VCFtools (Danecek et al., 2011), and TASSEL (Glaubitz et al., 2014). To estimate the accuracy of genotype calls, the resulting catalogue of variants for each method (StdGBS and NanoGBS) was compared with WGS data for the same lines from Torkamaneh et al. (2018b).
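
The variant filters described above can be illustrated with a minimal Python sketch. This is not the Fast-GBS code; the `Variant` record, the function names, and the parameter names are hypothetical and were chosen only for illustration. The sketch assumes that each variant carries a single read depth value and that a missing genotype call is represented as `None`; the text does not specify whether depth is evaluated per site or per genotype.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Variant:
    """Hypothetical in-memory record for one variant site."""
    alt_alleles: List[str]           # alternate alleles reported at the site
    qual: float                      # overall quality (QUAL) score
    mq: float                        # mapping quality (MQ) score
    depth: int                       # read depth (assumed: one value per site)
    genotypes: List[Optional[str]]   # one call per sample; None = missing


def passes_site_filters(v: Variant, min_qual: float = 10,
                        min_mq: float = 30, min_depth: int = 2) -> bool:
    """Return False if the variant meets any of the four removal criteria."""
    if len(v.alt_alleles) >= 2:      # (i) two or more alternate alleles
        return False
    if v.qual < min_qual:            # (ii) QUAL < 10
        return False
    if v.mq < min_mq:                # (iii) MQ < 30
        return False
    if v.depth < min_depth:          # (iv) read depth < 2
        return False
    return True


def missing_fraction(v: Variant) -> float:
    """Fraction of samples with no genotype call at this locus."""
    if not v.genotypes:
        return 1.0
    return sum(g is None for g in v.genotypes) / len(v.genotypes)


def filter_variants(variants: List[Variant],
                    max_missing: float = 0.80) -> List[Variant]:
    """Keep variants that pass the site filters and have <= 80% missing data."""
    return [v for v in variants
            if passes_site_filters(v) and missing_fraction(v) <= max_missing]
```

In practice, filters of this kind are usually applied directly to VCF files with tools such as BCFtools or VCFtools rather than with hand-written code; the sketch only makes the thresholds and their combination explicit.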
