GWAS strategy

Liang Xiao; Yuanyuan Fang; He Zhang; Mingyang Quan; Jiaxuan Zhou; Peng Li; Dan Wang; Li Ji; Pär K Ingvarsson; Harry X Wu; Yousry A El-Kassaby; Qingzhang Du; Deqiang Zhang

This method is extracted from research article: Plant Cell, Jul 2023

Natural variation in the prolyl 4-hydroxylase gene PtoP4H9 contributes to perennial stem growth in Populus

DOI: 10.1093/plcell/koad212

We filtered sites using MAF > 0.05 and missing rate < 0.1, obtaining 9,689,355 high-confidence SNPs for GWAS. We used EMMAX to generate the kinship matrix for all samples and to test for SNP associations with all traits (Kang et al. 2010). The kinship matrix served as the variance-covariance matrix for the random effect, and the population structure components from the ADMIXTURE analysis (K = 3) were included as fixed effects. The growth rate parameter b from the logistic model of 303 accessions was mapped as a quantitative trait. Because strong LD makes SNPs non-independent, a multiple-testing correction based on the total number of markers is usually too strict for detecting significant associations (Li et al. 2012; Wang et al. 2016). To alleviate this, the effective number of independent markers (n) was calculated using the Genetic type 1 Error Calculator (GEC) software (Li et al. 2012), resulting in a suggestive threshold for controlling the type 1 error rate of 1.84 × 10⁻⁸ (0.05/n, n = 2,721,994). LD analysis with the R package LDheatmap was used to define LD block intervals surrounding significant SNPs (Shin et al. 2006). To reduce noise in candidate gene identification, we selected a single candidate gene for each LD block: the gene containing the SNPs within its coding region or, for intergenic SNPs, the closest gene.
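The site filter and the significance threshold described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the genotype encoding (alternate-allele counts of 0, 1 or 2 per diploid sample, with None for a missing call) and the assumption of biallelic sites are hypothetical choices made for illustration. Only the cut-offs (MAF > 0.05, missing rate < 0.1) and the values 0.05 and n = 2,721,994 come from the text.

```python
from typing import Optional, Sequence


def passes_site_filter(genotypes: Sequence[Optional[int]],
                       min_maf: float = 0.05,
                       max_missing: float = 0.1) -> bool:
    """Return True if a biallelic site has MAF > min_maf and missing rate < max_missing.

    Assumed encoding (hypothetical): each entry is the alternate-allele count
    of one diploid sample (0, 1 or 2), or None when the genotype is missing.
    """
    n_total = len(genotypes)
    called = [g for g in genotypes if g is not None]
    if n_total == 0 or not called:
        return False
    missing_rate = 1 - len(called) / n_total
    alt_freq = sum(called) / (2 * len(called))  # two alleles per diploid sample
    maf = min(alt_freq, 1 - alt_freq)           # minor allele is the rarer one
    return maf > min_maf and missing_rate < max_missing


def significance_threshold(alpha: float, n_effective: int) -> float:
    """Bonferroni-style threshold using the effective number of independent markers."""
    return alpha / n_effective


# Values taken from the text: alpha = 0.05, n = 2,721,994 (as estimated by GEC).
threshold = significance_threshold(0.05, 2_721_994)
print(f"{threshold:.2e}")  # 1.84e-08
```

Dividing by the effective number of markers rather than by all 9,689,355 SNPs gives a less stringent cut-off, which is the stated purpose of the GEC step.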

This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial-NoDerivs licence (https://creativecommons.org/licenses/by-nc-nd/4.0/), which permits non-commercial reproduction and distribution of the work, in any medium, provided the original work is not altered or transformed in any way, and that the work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
