Genomic Prediction Models

Diego Jarquin
Reka Howard
Zhikai Liang
Shashi K. Gupta
James C. Schnable
Jose Crossa

To characterize the $i$th hybrid, the first model (M1) uses the genomic information of the inbreds through the general combining ability (GCA) of the parents; thus, the male and female effects can be modeled. The model comprises two genetic scores, derived from the main effects of the markers of the inbreds acting as parent 1, or B-lines ($g_{P1i}$), and as parent 2, or R-lines ($g_{P2i}$). In the model, $g_{P1i}=\sum_{m=1}^{p} x_{P1im} b_m$ and $g_{P2i}=\sum_{m=1}^{p} x_{P2im} b_m$ are the genetic effects, modeled as linear combinations of $p$ marker covariates ($x_{im}$) and their corresponding marker effects ($b_m$) for $m = 1, 2, \ldots, p$, with $b_m \overset{iid}{\sim} N(0, \sigma_b^2)$, where $\sigma_b^2$ is the common marker effect variance and iid stands for independent and identically distributed. $X_{P1}$ and $X_{P2}$ (with dimensions of 20 × 28,495 for platform C and 20 × 30,222 for platform T, and 13 × 28,495 for platform C and 13 × 30,222 for platform T, respectively) are the marker matrices of the inbreds (acting as parent 1 and parent 2, respectively) involved in the hybrid genotypes (Bernardo, 1994; Technow et al., 2014; Kadam et al., 2016). These matrices contain the number of copies of the major allele for each inbred at each marker position, such that $x \in \{0, 2\}$. Collecting these results and assumptions, the linear predictor for modeling hybrid performance via the GCA of the inbreds is

$$y_{ij} = \mu + E_j + g_{P1i} + g_{P2i} + e_{ij},$$

where $y_{ij}$ is the yield performance of the $i$th ($i = 1, 2, \ldots, I$) hybrid in the $j$th ($j = 1, 2, \ldots, J$) environment; $\mu$ is the common mean; $E_j$ is the main effect of the $j$th environment, such that $E_j \overset{iid}{\sim} N(0, \sigma_E^2)$; $g_{P1}=\{g_{P1i}\} \sim N(0, G_{P1}\sigma_{P1g}^2)$ and $g_{P2}=\{g_{P2i}\} \sim N(0, G_{P2}\sigma_{P2g}^2)$, with $G_{P1}=X_{P1}X_{P1}^{\top}/p$, $G_{P2}=X_{P2}X_{P2}^{\top}/p$, and $\sigma_{P1g}^2=p\times\sigma_{bP1}^2$ and $\sigma_{P2g}^2=p\times\sigma_{bP2}^2$ as the corresponding variance components of the parental effects; and $e_{ij} \overset{iid}{\sim} N(0,\sigma_e^2)$, where $\sigma_E^2$ and $\sigma_e^2$ are the variance components of the environment and residual terms, respectively. One disadvantage of this model is that it does not account for the specific effect of crossing parent 1 with parent 2, only for the average effects of both parents. Moreover, it returns a common genetic effect for the same hybrid in different environments.
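As a rough illustration of the quantities defined above, the following NumPy sketch builds a covariance structure of the form $G = XX^{\top}/p$ and the genetic scores $g_i=\sum_m x_{im}b_m$ for a toy marker matrix. The function name `gca_kernel`, the 3 × 4 marker matrix, and the simulated marker effects are hypothetical and chosen only for illustration; no centering or scaling of markers is applied because the formula in the text does not show any.

```python
import numpy as np

def gca_kernel(X):
    """Return G = X X' / p for an (n_inbreds x p) marker matrix."""
    X = np.asarray(X, dtype=float)
    p = X.shape[1]  # number of markers
    return X @ X.T / p

# Toy marker matrix (assumption): 3 inbreds, 4 markers, coded 0 or 2
# copies of the major allele, as in the text.
X_P1 = np.array([[0, 2, 2, 0],
                 [2, 2, 0, 0],
                 [0, 0, 2, 2]])

G_P1 = gca_kernel(X_P1)

# Genetic scores as a linear combination of marker covariates and
# marker effects; the effects are simulated here for illustration only.
rng = np.random.default_rng(0)
b = rng.normal(0.0, 1.0, size=X_P1.shape[1])
g_P1 = X_P1 @ b
```

With this 0/2 coding, each diagonal entry of `G_P1` equals four times the fraction of markers at which the inbred carries two copies of the major allele, and each off-diagonal entry equals four times the fraction of markers at which both inbreds do.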

The second model (M2) is an extension of M1: it accounts not only for the main effects of the genetic components of the inbreds, but also for the specific interaction effect of crossing inbred parent 1 with parent 2 (Acosta-Pech et al., 2017). The main effect is captured by the GCA component, and the interaction effect by the specific combining ability (SCA) component. The SCA was modeled using the cell-by-cell (Hadamard) product of the entries of the covariance structures of inbred parent 1 ($G_{P1}$) and inbred parent 2 ($G_{P2}$), such that $g_{P1\times P2}=\{g_{P1i\times P2i}\}\sim N(0, G_{P1\times P2}\sigma^2_{P1g\times P2g})$, where $G_{P1\times P2}=(Z_{gP1}G_{P1}Z_{gP1}^{\top})\circ(Z_{gP2}G_{P2}Z_{gP2}^{\top})$, $\sigma^2_{P1g\times P2g}$ is the variance component associated with this interaction term, and $Z_{gP1}$ and $Z_{gP2}$ are the incidence matrices that connect the hybrids with parent 1 and parent 2. These matrices are of order 276 × 20 and 276 × 13 when no phenotypic inbred data were included in the calibration sets, and 309 (276 + 33) × 20 and 309 (276 + 33) × 13 when the calibration sets were augmented with phenotypic inbred data.

The model that includes both the GCA and the SCA components can be written as

$$y_{ij} = \mu + E_j + g_{P1i} + g_{P2i} + g_{P1i\times P2i} + e_{ij},$$

where all terms are defined above. Although this model considers the effect of crossing parent 1 with parent 2, it also returns a common genetic effect for the same hybrid across environments, as the previous model does.
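The construction of the SCA covariance structure can be sketched as follows. This is a minimal example under stated assumptions: the helper `incidence`, the line names, the three-hybrid crossing design, and the two small parental matrices are all hypothetical, and the cell-by-cell product is taken to be the element-wise (Hadamard) product of the two hybrid-level matrices.

```python
import numpy as np

def incidence(labels, levels):
    """One-hot incidence matrix: row r has a 1 in the column of labels[r]."""
    Z = np.zeros((len(labels), len(levels)))
    for r, lab in enumerate(labels):
        Z[r, levels.index(lab)] = 1.0
    return Z

# Hypothetical toy design: 2 B-lines, 2 R-lines, 3 hybrids.
b_lines = ["B1", "B2"]
r_lines = ["R1", "R2"]
hybrids = [("B1", "R1"), ("B1", "R2"), ("B2", "R2")]

# Toy parental covariance structures (assumed values).
G_P1 = np.array([[2.0, 1.0], [1.0, 2.0]])
G_P2 = np.array([[2.0, 0.0], [0.0, 2.0]])

# Incidence matrices linking each hybrid to its parent 1 and parent 2.
Z_P1 = incidence([h[0] for h in hybrids], b_lines)  # 3 x 2
Z_P2 = incidence([h[1] for h in hybrids], r_lines)  # 3 x 2

# Expand each parental structure to the hybrid level.
K_P1 = Z_P1 @ G_P1 @ Z_P1.T
K_P2 = Z_P2 @ G_P2 @ Z_P2.T

# SCA structure: element-wise product of the two hybrid-level matrices.
G_P1xP2 = K_P1 * K_P2
```

In this toy design, hybrids that do not share an R-line and whose R-lines have zero covariance receive a zero SCA covariance, whereas hybrids sharing a parent retain a non-zero entry.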

The third model (M3) extends M2: it includes both the GCA and SCA components and also accounts for the interaction of the inbred markers with environments, by including the interactions of the GCA and SCA components with environments. The model can be written as

$$y_{ij} = \mu + E_j + g_{P1i} + g_{P2i} + g_{P1i\times P2i} + gE_{P1ij} + gE_{P2ij} + gE_{P1ij\times P2ij} + e_{ij},$$

where $gE_{P1}=\{gE_{P1ij}\}\sim N(0,(Z_{gP1}G_{P1}Z_{gP1}^{\top})\circ(Z_EZ_E^{\top})\sigma^2_{gEP1})$, $gE_{P2}=\{gE_{P2ij}\}\sim N(0,(Z_{gP2}G_{P2}Z_{gP2}^{\top})\circ(Z_EZ_E^{\top})\sigma^2_{gEP2})$, and $gE_{P1\times P2}=\{gE_{P1ij\times P2ij}\}\sim N(0,(I_4\otimes((Z_{gP1}G_{P1}Z_{gP1}^{\top})\circ(Z_{gP2}G_{P2}Z_{gP2}^{\top})))\circ(Z_EZ_E^{\top})\sigma^2_{gEP1\times P2})$.

Here, $\sigma^2_{gEP1}$, $\sigma^2_{gEP2}$, and $\sigma^2_{gEP1\times P2}$ are the variance components of the interactions between the inbred markers and environments for the GCA (parent 1 and parent 2) and SCA (P1 × P2) terms, and $Z_E$ is the incidence matrix for environments, of order (276 × 4 = 1,104) × 4 when the phenotypic information of the inbreds was omitted and (309 × 4 = 1,236) × 4 when the calibration sets were augmented with phenotypic inbred data. The genetic effects of the genotypes derived from this model are specific to each environment.
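A small numerical sketch may help show why the interaction structures make genetic effects environment-specific. It assumes record-level incidence matrices, records ordered by environment and then by hybrid, and a toy layout of 3 hybrids in 2 environments; the variable names and the values in `K_hybrid` are hypothetical.

```python
import numpy as np

# Hypothetical toy layout: 3 hybrids observed in 2 environments
# (6 records), ordered by environment and then by hybrid.
n_hybrids, n_envs = 3, 2

# Toy hybrid-level genetic covariance structure (assumed values).
K_hybrid = np.array([[4.0, 0.0, 0.0],
                     [0.0, 4.0, 2.0],
                     [0.0, 2.0, 4.0]])

# Record-level incidence matrices for hybrids and environments.
Z_h = np.tile(np.eye(n_hybrids), (n_envs, 1))           # 6 x 3
Z_E = np.kron(np.eye(n_envs), np.ones((n_hybrids, 1)))  # 6 x 2

K_g = Z_h @ K_hybrid @ Z_h.T  # genetic covariance between records
K_E = Z_E @ Z_E.T             # 1 if two records share an environment, else 0

# Genotype-by-environment structure: element-wise product, which zeroes
# the covariance between records from different environments.
K_gE = K_g * K_E
```

Under this ordering, `K_gE` is block-diagonal with one copy of `K_hybrid` per environment, so information is shared among hybrids only within an environment.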

The components of models M1-M3 are listed in Table 1, which shows how the models compare in terms of main and interaction effects. The main effect components are GP1 and GP2 [main effects of inbred markers accounting for paternal/maternal effects (GCA)] and GP1 × P2 [interaction between inbred markers for paternal/maternal effects (SCA)]; the interaction effects are GP1 × E and GP2 × E (interaction between inbred markers and environments) and GP1 × P2 × E (interaction between SCA effects and environments). The models (M1-M3) were fitted using the Bayesian Generalized Linear Regression (BGLR) R package (Perez-Rodriguez and de los Campos, 2014).

Table 1. Main and interaction components of the three models (M1-M3) used for predicting crop yield performance.

The main effects are GP1 and GP2 [main effects of inbred markers accounting for paternal/maternal effects (GCA)] and GP1 × P2 [interaction between inbred markers for paternal/maternal effects (SCA)]; the interaction effects are GP1 × E and GP2 × E (interaction between inbred markers and environments) and GP1 × P2 × E (interaction between SCA effects and environments).
