We filtered sites using a minor allele frequency (MAF) > 0.05 and a missing rate < 0.1, and obtained 9,689,355 high-confidence SNPs that were used for GWAS. We used EMMAX to generate the kinship matrix for all samples and to test for SNP associations with all traits (Kang et al. 2010). The kinship matrix was used as the variance-covariance matrix for the random effect, and the population structure components from the Admixture analysis (K = 3) were included as fixed effects. The growth rate parameter b from the logistic model, estimated for 303 accessions, was mapped as a quantitative trait. Because SNPs in strong LD are not independent, a multiple-testing correction based on the total number of markers is usually too strict for detecting significant associations (Li et al. 2012; Wang et al. 2016). To address this, the effective number of independent markers (n) was calculated using the Genetic type I Error Calculator (GEC) software (Li et al. 2012), resulting in a suggestive threshold for control of the type I error rate of 1.84 × 10⁻⁸ (0.05/n, n = 2,721,994). The R package LDheatmap (Shin et al. 2006) was used to define the LD block intervals surrounding significant SNPs. To reduce noise in candidate gene identification, we selected a single candidate gene for each LD block: the gene containing the SNP within its coding region or, for intergenic SNPs, the gene closest to the SNP.
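The threshold arithmetic and the one-gene-per-block rule can be illustrated with a minimal Python sketch. It is not the pipeline used in the study. The `Gene` record, the function names, and the coordinates in the usage example are hypothetical. The sketch also assumes that ties are broken by list order and that a SNP inside a gene but outside its coding region has distance zero to that gene, neither of which is specified in the text.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

ALPHA = 0.05
N_EFFECTIVE = 2_721_994  # effective number of independent markers (from the text)
N_TOTAL = 9_689_355      # total number of filtered SNPs (from the text)


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold: alpha divided by the number of tests."""
    return alpha / n_tests


@dataclass
class Gene:
    """Hypothetical gene record; coordinates are inclusive and on one chromosome."""
    name: str
    start: int
    end: int
    cds: List[Tuple[int, int]]  # coding intervals as (start, end)


def distance_to_gene(pos: int, gene: Gene) -> int:
    """Distance from a position to the gene body (0 if the position is inside)."""
    if gene.start <= pos <= gene.end:
        return 0
    return min(abs(pos - gene.start), abs(pos - gene.end))


def pick_candidate_gene(snp_positions: List[int], genes: List[Gene]) -> Optional[Gene]:
    """Pick one candidate gene for an LD block.

    Preference 1: a gene whose coding region contains a significant SNP.
    Preference 2: the gene closest to any significant SNP.
    Assumption: ties are resolved by the order of `genes`.
    """
    if not snp_positions or not genes:
        return None
    for gene in genes:
        if any(s <= pos <= e for pos in snp_positions for (s, e) in gene.cds):
            return gene
    return min(genes, key=lambda g: min(distance_to_gene(p, g) for p in snp_positions))


if __name__ == "__main__":
    print(f"{bonferroni_threshold(ALPHA, N_EFFECTIVE):.2e}")  # 1.84e-08
    print(f"{bonferroni_threshold(ALPHA, N_TOTAL):.2e}")      # 5.16e-09

    # Made-up coordinates, for illustration only.
    genes = [
        Gene("geneA", 100, 500, [(150, 200), (300, 400)]),
        Gene("geneB", 1000, 1500, [(1000, 1200)]),
    ]
    print(pick_candidate_gene([350], genes).name)  # geneA (SNP in coding region)
    print(pick_candidate_gene([900], genes).name)  # geneB (nearest gene)
```

Dividing by the total marker count instead of n would give a threshold of about 5.16 × 10⁻⁹, roughly 3.6 times stricter than the one based on the effective number of independent markers.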