Phenotypic data analysis

Matheus Baseggio
Matthew Murray
Di Wu
Gregory Ziegler
Nicholas Kaczmar
James Chamness
John P Hamilton
C Robin Buell
Olena K Vatamaniuk
Edward S Buckler
Margaret E Smith
Ivan Baxter
William F Tracy
Michael A Gore

For each plot sample, elemental concentrations were determined by inductively coupled plasma mass spectrometry (ICP-MS) separately for each of the three lyophilized kernels, as previously described in Baxter et al. (2014). In short, each individual unground kernel was robotically weighed, digested in concentrated nitric acid, and measured for concentrations of aluminum, arsenic, boron, cadmium, calcium, cobalt, copper, iron, magnesium, manganese, molybdenum, nickel, phosphorus, potassium, rubidium, selenium, sodium, strontium, sulfur, and zinc with a PerkinElmer NexION 350D ICP-MS. Of these 20 elements, aluminum, arsenic, cobalt, selenium, and sodium were not considered further because their measured concentrations were at trace levels, vulnerable to contamination in the course of sample processing, and/or sensitive to interference from other sample matrix constituents (Ziegler et al. 2013). Extreme analytical outliers can distort the estimation of variance components when a mixed linear model is first fitted to the raw data. To limit their influence, the method of Davies and Gather (1993) was implemented similarly to its use in Baxter et al. (2014): for a given element within each environment, raw concentration values lying more than a conservative threshold of 15 median absolute deviations from the median concentration were removed. Also, if fewer than 1% of the values for a given element were negative, these negative values were set to missing.
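The two preprocessing rules above can be illustrated with a minimal Python sketch. It is not the authors' code: the function names `mad_filter` and `drop_rare_negatives` are hypothetical, the median absolute deviation is left unscaled (the text does not say whether a consistency constant was applied), and the guard for a zero deviation is an added assumption. Each function is meant to be applied to the values of one element within one environment.

```python
from statistics import median


def mad_filter(values, n_mads=15.0):
    """Set values lying more than n_mads MADs from the median to None.

    `values` holds the concentrations of one element in one environment;
    None marks a missing value. The MAD is unscaled (an assumption).
    """
    observed = [v for v in values if v is not None]
    med = median(observed)
    mad = median([abs(v - med) for v in observed])
    if mad == 0:
        # Assumed guard: with zero spread, leave the data unchanged.
        return list(values)
    return [
        v if v is not None and abs(v - med) <= n_mads * mad else None
        for v in values
    ]


def drop_rare_negatives(values, max_fraction=0.01):
    """Set negative values to None only if they are rarer than max_fraction."""
    observed = [v for v in values if v is not None]
    n_negative = sum(1 for v in observed if v < 0)
    if 0 < n_negative < max_fraction * len(observed):
        return [None if (v is not None and v < 0) else v for v in values]
    return list(values)


# Made-up concentrations for illustration; 500.0 is replaced by None.
raw = [10.0, 11.0, 9.5, 10.5, 10.2, 500.0]
screened = mad_filter(raw)
```

Returning None rather than deleting the entry keeps each value aligned with its kernel and plot identifiers, which the later mixed-model step needs.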

The preliminarily processed raw ICP-MS dataset was more robustly screened for significant outliers by fitting a mixed linear model that allowed for genetic effects to be separately estimated from field design effects, following the procedure described in Wolfinger et al. (1997). The fitted mixed linear model was similar to that used by Baseggio et al. (2019) for the same experimental field design, with the notable exception that the model used in this study included a term to estimate within-plot kernel sample variance. This allowed for the removal of individual outlier measurements. For each elemental phenotype, the full model was fitted in ASReml-R version 3.0 (Gilmour et al. 2009) across locations (all four environments) or for each location separately (two environments, NY; or two environments, WI) as follows:

$$Y_{ijklmnopq} = \mu + \mathrm{check}_i + \mathrm{env}_j + \mathrm{set(env)}_{jk} + \mathrm{block(set \times env)}_{jkl} + \mathrm{genotype}_m + \mathrm{(genotype \times env)}_{jm} + \mathrm{ICP\text{-}MS.run}_n + \mathrm{sample}_o + \mathrm{row(env)}_{jp} + \mathrm{col(env)}_{jq} + \varepsilon_{ijklmnopq}$$

in which $Y_{ijklmnopq}$ is an individual phenotypic observation, $\mu$ is the grand mean, $\mathrm{check}_i$ is the fixed effect for the ith check, $\mathrm{env}_j$ is the effect of the jth environment, $\mathrm{set(env)}_{jk}$ is the effect of the kth set within the jth environment, $\mathrm{block(set \times env)}_{jkl}$ is the effect of the lth incomplete block within the kth set within the jth environment, $\mathrm{genotype}_m$ is the effect of the mth experimental genotype (noncheck line), $\mathrm{(genotype \times env)}_{jm}$ is the effect of the interaction between the mth genotype and jth environment, $\mathrm{ICP\text{-}MS.run}_n$ is the laboratory effect of the nth ICP-MS run, $\mathrm{sample}_o$ is the effect of the oth kernel sample, $\mathrm{row(env)}_{jp}$ is the effect of the pth plot grid row within the jth environment, $\mathrm{col(env)}_{jq}$ is the effect of the qth plot grid column within the jth environment, and $\varepsilon_{ijklmnopq}$ is the heterogeneous residual error effect within each environment with a first-order autoregressive correlation structure among plot residuals in the row and column directions. Except for the grand mean and check term, all terms were modeled as random effects. The Kenward-Roger approximation (Kenward and Roger 1997) was used to calculate degrees of freedom. Studentized deleted residuals (Neter et al. 1996) obtained from these mixed linear models were used to detect significant outliers for each phenotype after a Bonferroni correction (α = 0.05).
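The Bonferroni-corrected outlier screen can be sketched as follows, assuming the Studentized deleted residuals have already been extracted from the fitted model. The function name `bonferroni_outliers` is hypothetical. The cutoff here is a standard normal quantile, a large-sample stand-in for the t quantile that such residuals are usually compared against; the text does not state the reference distribution, so treat this as an approximation rather than the authors' procedure.

```python
from statistics import NormalDist


def bonferroni_outliers(residuals, alpha=0.05):
    """Return (indices of flagged residuals, two-sided cutoff).

    Each of the n residuals is tested at level alpha / n. The normal
    quantile approximates the t quantile when residual degrees of
    freedom are large (an assumption of this sketch).
    """
    n = len(residuals)
    cutoff = NormalDist().inv_cdf(1.0 - alpha / (2.0 * n))
    flagged = [i for i, r in enumerate(residuals) if abs(r) > cutoff]
    return flagged, cutoff


# Made-up residuals for illustration only.
flagged, cutoff = bonferroni_outliers([0.1, -1.2, 0.8, 6.0, -0.4])
```

Because the per-test level shrinks as the number of observations grows, the cutoff rises with dataset size, so only extreme measurements are removed.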

To generate the best linear unbiased predictor (BLUP) values for each elemental phenotype, an iterative mixed linear model fitting procedure was conducted on the outlier-screened phenotypic dataset in ASReml-R version 3.0 (Gilmour et al. 2009) with the full model across locations or for each location separately. Model terms fitted as random effects, including the autoregressive correlations, were tested with likelihood ratio tests (Littell et al. 2006), and terms that were not significant at α = 0.05 were removed from the model. The significance of main random effects and the variance component estimates are reported in Supplementary Table S1. In addition, the first-order autoregressive correlation structure was statistically significant for all phenotypes. For each elemental phenotype, the final, best-fitted model was used to generate a BLUP for each inbred line. The generated BLUP values were filtered to remove nonsweet corn lines, as well as sweet corn lines with the infrequent aeduwx or bt2 endosperm mutations and those without available SNP marker data. This filtering left 401 sweet corn lines carrying the more prevalent endosperm mutations [su1, su1se1 (classified as su1 for this study due to lack of informative marker genotypes), sh2, and su1sh2] that had BLUP values for elemental phenotypes across and within locations.
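A likelihood ratio test for dropping a single random term can be sketched as below, assuming the restricted log-likelihoods of the full and reduced models are already available from the fitting software. The function name `lrt_pvalue` is hypothetical, and the reference distribution is taken to be chi-square with one degree of freedom. The text does not say whether a boundary-adjusted mixture distribution was used for variance components, so none is applied here.

```python
from math import erfc, sqrt


def lrt_pvalue(loglik_full, loglik_reduced):
    """Likelihood ratio statistic and p-value for one dropped term.

    Assumes the reduced model is nested in the full model and differs
    by one parameter, so the statistic is referred to chi-square(1).
    """
    stat = max(0.0, 2.0 * (loglik_full - loglik_reduced))
    # Survival function of chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = erfc(sqrt(stat / 2.0))
    return stat, p_value


# Made-up log-likelihoods; the term is retained if p_value < 0.05.
stat, p_value = lrt_pvalue(-1000.0, -1003.0)
keep_term = p_value < 0.05
```

In an iterative procedure of the kind described, this comparison would be repeated after each removal until every remaining random term is significant.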

With variance component estimates from each best-fitted model, heritability on a line-mean basis was calculated for each elemental phenotype across locations and separately for each location as previously described (Lynch and Walsh 1998; Holland et al. 2003; Hung et al. 2012). Pearson’s correlation coefficient (r) was used to assess the degree of association between the BLUP values of paired phenotypes. Pairwise correlations were calculated, and their significance was tested at α = 0.05, with the method “pearson” of the function “cor.test” in R version 3.6.1 (R Core Team 2019).
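The heritability formula is not written out in this section, so the sketch below uses a common line-mean form as an assumption: genotypic variance divided by the sum of genotypic variance, genotype-by-environment variance over the harmonic mean number of environments per line, and residual variance over the harmonic mean number of plots per line. The function name and all input values are hypothetical. How the within-plot kernel sample variance enters the denominator is not stated here, so it is left out.

```python
from statistics import harmonic_mean


def line_mean_heritability(var_g, var_ge, var_err, envs_per_line, plots_per_line):
    """Heritability on a line-mean basis (assumed three-component form).

    envs_per_line and plots_per_line list, for each line, the number of
    environments and the total number of plots in which it was observed.
    """
    e_h = harmonic_mean(envs_per_line)
    p_h = harmonic_mean(plots_per_line)
    return var_g / (var_g + var_ge / e_h + var_err / p_h)


# Made-up variance components and counts for illustration only.
h2 = line_mean_heritability(
    var_g=4.0, var_ge=2.0, var_err=8.0,
    envs_per_line=[4, 4, 4], plots_per_line=[4, 4, 4],
)
```

Harmonic means are used so that lines observed in few environments or plots pull the effective replication down, which suits an unbalanced dataset after outlier removal.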
