2.2. Genomic prediction models

Peter Skov Kristensen; Pernille Sarup; Dario Fé; Jihad Orabi; Per Snell; Linda Ripa; Marius Mohlfeld; Thinh Tuan Chu; Joakim Herrström; Ahmed Jahoor; Just Jensen

This method is extracted from research article: Front Plant Sci, Nov 2023

Prediction of additive, epistatic, and dominance effects using models accounting for incomplete inbreeding in parental lines of hybrid rye and sugar beet

DOI: 10.3389/fpls.2023.1193433

Three genomic prediction models were evaluated. Model 1 (M1) was based on the GCA model developed by González-Diéguez et al. (2021), which assumes that each hybrid was produced from completely inbred parental lines. The maternal lines belonged to heterotic group 1, and the paternal lines belonged to heterotic group 2. Genotypes of the two-way crosses were imputed from the genotypes of the MS and NR lines, and these imputed genotypes were used to calculate the genomic relationship matrices. If a SNP marker was heterozygous in a parental line (alleles B₁b₁ in group 1 or B₂b₂ in group 2), it was randomly assigned to one of the two homozygous genotypes (B₁B₁ or b₁b₁ in group 1; B₂B₂ or b₂b₂ in group 2).
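The random assignment of heterozygous markers can be sketched as follows. This is a minimal illustration, not the authors' code: the numeric coding (0 for bb, 0.5 for Bb, 1 for BB) follows the coding used later in this section, while the function name, the equal probability for the two homozygotes, and the example marker calls are assumptions.

```python
import random


def force_homozygous(genotypes, rng):
    """Replace each heterozygous call (0.5) with 0.0 or 1.0 chosen at random.

    Assumed coding: 0.0 = bb, 0.5 = Bb, 1.0 = BB.
    Equal probability for the two homozygotes is an assumption.
    """
    return [rng.choice((0.0, 1.0)) if g == 0.5 else g for g in genotypes]


rng = random.Random(1)              # fixed seed only so the sketch is repeatable
line = [1.0, 0.5, 0.0, 0.5, 1.0]    # hypothetical marker calls for one parental line
inbred = force_homozygous(line, rng)
```

Homozygous calls pass through unchanged, so only the second and fourth markers of the hypothetical line are affected.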

In model 2 (M2) and model 3 (M3), the genotypes of the MS and NR lines were used directly in the calculations in order to better utilize the genomic relationship between the lines within group 1. In M2, the R lines and the two-way crosses could have arbitrary levels of inbreeding, while the MS and NR lines were assumed completely inbred. Any heterozygous SNPs in the MS and NR lines were randomly assigned to one of the two homozygous genotypes. In M3, the R lines, the two-way crosses, and the MS and NR lines could all have arbitrary levels of inbreeding. If all parental components are completely inbred, then M2 and M3 are equivalent to M1.

Thus, the M1 model was:
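The display equation is missing from this extract. Reading the terms off the definitions that follow, the model presumably takes the form below; the ordering of terms and the pairing of each effect with its design matrix are inferred from those definitions, not copied from the published equation.

```latex
\begin{equation*}
\begin{aligned}
y ={}& Xb + T_1 g_{A_{(1)}} + T_2 g_{A_{(2)}} + T_1 g_{A A_{(1)}} + T_2 g_{A A_{(2)}}
       + T_3 g_{A A_{(3)}} + T_3 g_D \\
     & + T_1 r_{(1)} + T_2 r_{(2)} + T_3 r_{(3)} + T_4 k + T_5 l + T_6 m + e
\end{aligned}
\end{equation*}
```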

where $y$ is the vector of phenotypes of the hybrids; $X$ is the design matrix for the fixed effects (year × location × trial; for rye, treatment was included as a second fixed effect); and $b$ is the vector of fixed effects.

$T_1$ and $T_2$ are design matrices that assign hybrids to their parental lines in heterotic groups 1 and 2, respectively. $g_{A_{(1)}}$ and $g_{A_{(2)}}$ are vectors of additive genetic effects of the parental lines from groups 1 and 2, with $g_{A_{(1)}} \sim N (0, G_{A_{(1)}} \sigma_{A_{(1)}}^{2})$ and $g_{A_{(2)}} \sim N (0, G_{A_{(2)}} \sigma_{A_{(2)}}^{2})$, where $\sigma_{A_{(1)}}^{2}$ and $\sigma_{A_{(2)}}^{2}$ are additive genetic variances and $G_{A_{(1)}}$ and $G_{A_{(2)}}$ are additive genomic relationship matrices.

$g_{A A_{(1)}}$ and $g_{A A_{(2)}}$ are vectors of additive-by-additive epistatic effects within heterotic groups 1 and 2, respectively, with $g_{A A_{(1)}} \sim N (0, G_{A A_{(1)}} \sigma_{A A_{(1)}}^{2})$ and $g_{A A_{(2)}} \sim N (0, G_{A A_{(2)}} \sigma_{A A_{(2)}}^{2})$, where $\sigma_{A A_{(1)}}^{2}$ and $\sigma_{A A_{(2)}}^{2}$ are epistatic genetic variances within each heterotic group and $G_{A A_{(1)}}$ and $G_{A A_{(2)}}$ are within-group epistatic genomic relationship matrices.

$T_3$ is the design matrix for the effects of the hybrids. $g_{A A_{(3)}}$ is the vector of additive-by-additive epistatic effects between alleles from heterotic groups 1 and 2, with $g_{A A_{(3)}} \sim N (0, G_{A A_{(3)}} \sigma_{A A_{(3)}}^{2})$, where $\sigma_{A A_{(3)}}^{2}$ is the epistatic genetic variance between the heterotic groups and $G_{A A_{(3)}}$ is the across-group epistatic genomic relationship matrix. $g_D$ is the vector of genetic dominance deviations due to within-locus interactions between alleles from different heterotic groups, with $g_{D} \sim N (0, D \sigma_{D}^{2})$, where $\sigma_{D}^{2}$ is the genetic dominance variance and $D$ is the dominance relationship matrix across hybrids.

$r_{(1)}$, $r_{(2)}$, and $r_{(3)}$ are vectors of residual genetic effects of the lines from groups 1 and 2 and of the hybrids, respectively, with $r_{(1)} \sim N (0, I_{r_{(1)}} \sigma_{r_{(1)}}^{2})$, $r_{(2)} \sim N (0, I_{r_{(2)}} \sigma_{r_{(2)}}^{2})$, and $r_{(3)} \sim N (0, I_{r_{(3)}} \sigma_{r_{(3)}}^{2})$, where $I_{r_{(1)}}$, $I_{r_{(2)}}$, and $I_{r_{(3)}}$ are identity matrices and $\sigma_{r_{(1)}}^{2}$, $\sigma_{r_{(2)}}^{2}$, and $\sigma_{r_{(3)}}^{2}$ are residual genetic variances.

$T_4$, $T_5$, and $T_6$ are design matrices for the random effects of the interactions of year × location with maternal parent, paternal parent, and treatment (included for rye only), respectively, and $k$, $l$, and $m$ are the vectors of these random interaction effects, with $k \sim N (0, I_{k} \sigma_{k}^{2})$, $l \sim N (0, I_{l} \sigma_{l}^{2})$, and $m \sim N (0, I_{m} \sigma_{m}^{2})$, where $I_k$, $I_l$, and $I_m$ are identity matrices and $\sigma_{k}^{2}$, $\sigma_{l}^{2}$, and $\sigma_{m}^{2}$ are the variances of the interactions. Finally, $e$ is the vector of random residual effects, with $e \sim N (0, I_{e} \sigma_{e}^{2})$, where $I_e$ is an identity matrix and $\sigma_{e}^{2}$ is the residual variance.
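To make the role of the design matrices concrete, the sketch below builds incidence matrices that assign each hybrid to its group 1 and group 2 parent. The line names, the crossing scheme, and the effect values are hypothetical, and plain Python lists are used only to keep the example dependency-free.

```python
def design_matrix(parent_of_hybrid, lines):
    """Return a 0/1 incidence matrix: one row per hybrid, one column per line."""
    col = {name: j for j, name in enumerate(lines)}
    T = [[0] * len(lines) for _ in parent_of_hybrid]
    for i, parent in enumerate(parent_of_hybrid):
        T[i][col[parent]] = 1
    return T


# Hypothetical crossing scheme: three hybrids from two lines per heterotic group.
group1 = ["F1", "F2"]
group2 = ["R1", "R2"]
mothers = ["F1", "F1", "F2"]   # group 1 parent of each hybrid
fathers = ["R1", "R2", "R2"]   # group 2 parent of each hybrid

T1 = design_matrix(mothers, group1)
T2 = design_matrix(fathers, group2)

# The product T1 g maps hypothetical group 1 additive effects onto the hybrids.
g1 = [0.4, -0.4]
T1g = [sum(t * g for t, g in zip(row, g1)) for row in T1]
```

Each row of a design matrix contains a single 1, so the product simply copies the effect of the relevant parent to every hybrid derived from it.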

For M1, genomic relationship matrices were calculated as proposed by González-Diéguez et al. (2021):

Additive genomic relationship matrix for heterotic group 1:

where $p_{1_{i}}$ and $q_{1_{i}}$ are the frequencies of alleles $B_{1_{i}}$ and $b_{1_{i}}$ at the $i$th marker, respectively; $Z_1 = M_1 - P_1$; $M_1$ is a matrix with the genotypes of the parental lines in group 1, coded as 0 for genotype b₁b₁ and 1 for genotype B₁B₁ at each marker; $P_1$ is a matrix in which each column contains the allele frequency of $B_{1}$ for the corresponding marker; and nsnp is the number of markers.
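As a rough illustration of the quantities just defined, the sketch below centres a 0/1 genotype matrix by allele frequency and forms the cross-product of the centred matrix. The denominator used here, the sum of $p q$ over markers, is an assumption chosen because $p q$ is the variance of a 0/1 coded marker in fully inbred lines; the exact scaling should be taken from González-Diéguez et al. (2021). Estimating allele frequencies from the same lines, the function name, and the genotype values are likewise assumptions.

```python
def additive_grm(M):
    """Additive relationship matrix from a 0/1 coded matrix (lines x markers).

    Assumed scaling: Z Z' divided by the sum over markers of p * q.
    Markers must not all be monomorphic, otherwise the denominator is zero.
    """
    n, nsnp = len(M), len(M[0])
    p = [sum(row[i] for row in M) / n for i in range(nsnp)]   # frequency of allele B
    Z = [[row[i] - p[i] for i in range(nsnp)] for row in M]   # Z = M - P
    scale = sum(pi * (1.0 - pi) for pi in p)                  # sum of p * q
    return [[sum(a * b for a, b in zip(zr, zc)) / scale for zc in Z] for zr in Z]


# Hypothetical genotypes: four inbred lines, four markers.
M1 = [
    [1, 0, 1, 1],
    [0, 0, 1, 0],
    [1, 1, 0, 0],
    [0, 1, 0, 1],
]
G_A1 = additive_grm(M1)
```

With strict 0/1 coding and allele frequencies estimated from the same lines, this denominator equals the trace of the cross-product divided by the number of lines, so the average diagonal of the result is 1.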

The additive-by-additive epistatic relationship matrix for lines within group 1 was calculated as the Hadamard product of the additive genomic relationship matrix for group 1 with itself, divided by the trace of the resulting matrix over the number of lines in group 1, so that the average diagonal element is 1:
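The equation itself is missing here. Written out from the sentence above, with $\odot$ denoting the Hadamard product and $n_1$ (a symbol introduced for this sketch) the number of lines in group 1, it would read:

```latex
\begin{equation*}
G_{A A_{(1)}} =
  \frac{G_{A_{(1)}} \odot G_{A_{(1)}}}
       {\operatorname{tr}\!\left(G_{A_{(1)}} \odot G_{A_{(1)}}\right) / n_1}
\end{equation*}
```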

The additive and epistatic genomic relationship matrices for heterotic group 2 were calculated in the same way as for group 1.
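Because the same recipe applies to both groups, the within-group epistatic matrix can be sketched as one function of any additive relationship matrix. The function name and the numerical values below are hypothetical.

```python
def epistatic_grm(G):
    """Hadamard square of G, rescaled so that the mean diagonal element is 1."""
    n = len(G)
    H = [[g * g for g in row] for row in G]           # element-wise product of G with itself
    mean_diag = sum(H[i][i] for i in range(n)) / n    # trace divided by number of lines
    return [[h / mean_diag for h in row] for row in H]


# Hypothetical additive relationship matrix for three lines.
G_A = [
    [1.2, 0.3, -0.5],
    [0.3, 0.8, 0.1],
    [-0.5, 0.1, 1.0],
]
G_AA = epistatic_grm(G_A)
```

Note that squaring element-wise makes every off-diagonal entry non-negative, so negative additive relationships also contribute positive epistatic relationships.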

Additive-by-additive epistatic relationship matrix between lines in group 1 and 2:

where $n_H$ is the number of hybrids. The matrices $G_{A A_{(3)}}$ and $D$ can both include realized hybrids as well as all potential crosses of the parental lines, so the crosses with the largest effects can be predicted even if they have not yet been phenotypically tested.

Dominance relationship matrix of dominance interactions between alleles from different heterotic groups:

where $p_{1_{i}}$, $q_{1_{i}}$, $p_{2_{i}}$, and $q_{2_{i}}$ are the frequencies of alleles $B_{1_{i}}$ and $b_{1_{i}}$ in heterotic group 1 and $B_{2_{i}}$ and $b_{2_{i}}$ in heterotic group 2 at the $i$th marker, respectively, and $W_1$ is a matrix with a row for each hybrid and a column for each marker (González-Diéguez et al., 2021). The elements of $W_1$ are shown in Table 2.

Table 2. Elements of $W_1$, $W_2$, and $W_3$ for each marker in the hybrids from crosses between parental lines from group 1 and group 2, which are used in the calculation of the dominance relationship matrix for M1, M2, and M3, respectively*.

*The elements of $W_1$ are from González-Diéguez et al. (2021), and the remaining elements were derived in this study (Appendix 1).

It should be noted that the mean heterosis of the hybrids is not estimated separately in the model but is included in the overall mean of the hybrid phenotypes. Thus, the across-group epistatic and dominance effects that are estimated are deviations of individual hybrids from the mean heterosis.

In M2, paternal R lines and maternal two-way crosses could have arbitrary levels of inbreeding, while MS and NR lines were assumed to be completely inbred. Genotypes of MS and NR were used for the calculation of the additive and epistatic genomic relationship matrices for heterotic group 1. If an MS line and an NR line had identical genotypes for all SNPs, the line was included only once in the relationship matrices.
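Collapsing lines with identical genotypes can be sketched as below. This is an illustration under stated assumptions: the function name, the line names, and the genotype values are hypothetical, and the function collapses any two identical lines, not only an MS and NR pair.

```python
def unique_lines(genotypes):
    """Collapse lines with identical marker genotypes into one entry.

    genotypes: dict mapping line name -> sequence of marker codes.
    Returns (representatives, index_of), where index_of maps every line name
    to the row of its representative in the relationship matrix.
    """
    rows = {}               # genotype tuple -> row index
    representatives = []    # first line seen for each distinct genotype
    index_of = {}
    for name, calls in genotypes.items():
        key = tuple(calls)
        if key not in rows:
            rows[key] = len(representatives)
            representatives.append(name)
        index_of[name] = rows[key]
    return representatives, index_of


# Hypothetical example: MS_a and NR_a carry the same genotype at every marker.
lines = {"MS_a": [1, 0, 1], "NR_a": [1, 0, 1], "MS_b": [0, 0, 1]}
reps, idx = unique_lines(lines)
```

The returned index lets both original lines point to the same row when the design matrices are built, so no phenotype records are dropped.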

Thus, the M2 model was:

where $y$ is the vector of phenotypes of the three-way hybrids; $T_7$ and $T_8$ are design matrices for MS and NR, respectively; and $g_{A_{(1, 1)}}$, $g_{A A_{(1, 1)}}$, and $r_{(1, 1)}$ are vectors of additive, epistatic, and residual genetic effects, respectively, for both MS and NR, with $g_{A_{(1, 1)}} \sim N (0, \frac{1}{2} G_{A_{(1, 1)}} \sigma_{A_{(1, 1)}}^{2})$, $g_{A A_{(1, 1)}} \sim N (0, G_{A A_{(1, 1)}} \sigma_{A A_{(1, 1)}}^{2})$, and $r_{(1, 1)} \sim N (0, I_{r_{(1, 1)}} \sigma_{r_{(1, 1)}}^{2})$, where $\sigma_{A_{(1, 1)}}^{2}$, $\sigma_{A A_{(1, 1)}}^{2}$, and $\sigma_{r_{(1, 1)}}^{2}$ are the additive, within-group epistatic, and residual genetic variances for MS and NR, $G_{A_{(1, 1)}}$ and $G_{A A_{(1, 1)}}$ are the additive and epistatic genomic relationship matrices, and $I_{r_{(1, 1)}}$ is an identity matrix.

$G_{A_{(1, 1)}}$ was scaled by ½ to account for the first cross between MS and NR, which produced the two-way cross. Additionally, $M_2$, which was used in the calculation of the additive genomic relationship matrix for group 2 ($G_{A_{(2)}}$), now included heterozygous genotypes B₂b₂ coded as 0.5. The marker matrix for the dominance relationship matrix, $W_2$, was extended to account for heterozygous genotypes in the two-way crosses and in the R lines, giving twelve possible crossing combinations instead of the four in M1 (Table 2). The additive-by-additive epistatic relationship matrix between lines in group 1 and 2 was calculated as:

In M3, the same model parameters were used as for M2 (Equation 6), but now every parental line (MS, NR, two-way crosses, and R) could have arbitrary levels of inbreeding. Therefore, $M_1$, which was used in the calculation of the additive genomic relationship matrix for MS and NR ($G_{A_{(1, 1)}}$), included heterozygous genotypes B₁b₁ coded as 0.5. The marker matrix for the dominance relationship matrix, $W_3$, was further extended to account for heterozygous genotypes in all parental lines, giving 27 possible crossing combinations (Table 2).
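One way to arrive at the counts of 4, 12, and 27 crossing combinations is to enumerate the genotype classes each parental component may carry at a single marker. The enumeration below is an interpretation of the text, not a reproduction of Table 2: the dictionary layout and the genotype labels are assumptions.

```python
from itertools import product

HOMOZYGOUS = ("BB", "bb")          # completely inbred component
ANY = ("BB", "Bb", "bb")           # arbitrary level of inbreeding

# Genotype classes allowed per parental component at one marker (interpretation).
allowed = {
    "M1": {"group1": HOMOZYGOUS, "group2": HOMOZYGOUS},
    "M2": {"MS": HOMOZYGOUS, "NR": HOMOZYGOUS, "R": ANY},
    "M3": {"MS": ANY, "NR": ANY, "R": ANY},
}

# Every combination of parental genotype classes for each model.
combinations = {
    model: list(product(*parts.values())) for model, parts in allowed.items()
}
counts = {model: len(combos) for model, combos in combinations.items()}
```

Under this reading the counts are 2 × 2 for M1, 2 × 2 × 3 for M2, and 3 × 3 × 3 for M3, which matches the numbers given in the text.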

This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
