Statistical analysis

Agnese Aguzzoni; Michele Bassi; Emanuela Pignotti; Peter Robatscher; Francesca Scandellari; Werner Tirler; Massimo Tagliavini

This method is extracted from the research article: J Sci Food Agric, Mar 2021

Multi‐chemical analysis combined with chemometrics to characterize PDO and PGI Italian apples

DOI: 10.1002/jsfa.11156

After logarithmic transformation of the data, two models were tested to identify statistically significant differences among cultivation areas. The first was a linear regression model with cultivation area as the only fixed effect. The second was a linear mixed model that, besides the fixed effect (cultivation area), included the sampling site as a random effect to account for the hierarchical structure of the data. The two models were compared by analysis of variance (ANOVA), and the output of the model with the lower Akaike information criterion (AIC) was retained. The significance level was set at P = 0.05. Tukey's honestly significant difference (HSD) post hoc test was applied for multiple comparisons among cultivation areas.
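The AIC-based selection step can be illustrated with a minimal sketch, written here in Python rather than the R environment used for the published analysis. It assumes a natural-log transformation (the base is not stated in the text) and uses made-up values for three hypothetical areas. The linear mixed model is not fitted, since that needs a dedicated mixed-model library; an intercept-only model stands in as the second candidate purely to show how the lower AIC is picked. All function names are illustrative.

```python
import math


def gaussian_aic(residuals, n_mean_params):
    """AIC of a Gaussian linear model fitted by least squares.

    k counts the mean parameters plus one for the residual variance.
    """
    n = len(residuals)
    sigma2 = sum(r * r for r in residuals) / n  # maximum-likelihood variance
    loglik = -0.5 * n * (math.log(2 * math.pi) + math.log(sigma2) + 1)
    k = n_mean_params + 1
    return 2 * k - 2 * loglik


def area_model_aic(values_by_area):
    """Fixed-effect model: one mean per cultivation area, on the log scale."""
    residuals = []
    for values in values_by_area.values():
        logs = [math.log(v) for v in values]
        group_mean = sum(logs) / len(logs)
        residuals.extend(x - group_mean for x in logs)
    return gaussian_aic(residuals, len(values_by_area))


def null_model_aic(values_by_area):
    """Stand-in comparison model (assumption): a single overall mean."""
    logs = [math.log(v) for vs in values_by_area.values() for v in vs]
    grand_mean = sum(logs) / len(logs)
    return gaussian_aic([x - grand_mean for x in logs], 1)


def select_lowest_aic(aic_by_model):
    """Return the name of the candidate model with the lowest AIC."""
    return min(aic_by_model, key=aic_by_model.get)


# Made-up concentrations for three hypothetical cultivation areas.
data = {
    "area_1": [10.0, 12.0, 11.0],
    "area_2": [100.0, 110.0, 105.0],
    "area_3": [1000.0, 1100.0, 1050.0],
}
aics = {"area": area_model_aic(data), "null": null_model_aic(data)}
best = select_lowest_aic(aics)
```

With real data, the AIC of the fitted mixed model would be added to `aics` under its own key, and `select_lowest_aic` would choose among the candidates in the same way.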

To improve the identification of sample origin from the results of the multi‐chemical approach, multivariate data analysis was performed using a supervised classification method, namely linear discriminant analysis (LDA). A first model was developed by analysing the multi‐chemical composition of 117 apple samples divided into four main groups according to their cultivation areas. A second model was then developed by restricting the analysis to the multi‐chemical composition of South Tyrolean PGI apples (51 apple samples). For this second model, the grouping factor was the cultivation district (three groups). Prior to LDA, data were centred and scaled. The discriminant models were validated by ‘leave‐one‐out’ cross‐validation. The confusion matrix was used to calculate the sensitivity, specificity, precision, false discovery rate and balanced accuracy of the developed model by applying the following equations:

\[ \text{Sensitivity} = \frac{TP}{TP + FN} \]

\[ \text{Specificity} = \frac{TN}{TN + FP} \]

\[ \text{Precision} = \frac{TP}{TP + FP} \]

\[ \text{False discovery rate} = \frac{FP}{FP + TP} \]

\[ \text{Balanced accuracy} = \frac{\text{Sensitivity} + \text{Specificity}}{2} \]

where TP stands for true positive, TN for true negative, FP for false positive, and FN for false negative.⁴¹
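The validation workflow can be sketched in a few lines of Python (not the R environment used for the published analysis). The sketch centres and scales the data, runs leave‐one‐out cross‐validation, builds a confusion matrix, and derives the five metrics for one group against all others. Two assumptions are worth stating: a simple nearest-centroid classifier stands in for LDA to keep the example dependency-free, and the data and group labels are made up. The metric formulas are the standard definitions in terms of TP, TN, FP and FN.

```python
from collections import defaultdict
from statistics import mean, stdev


def standardise(rows):
    """Centre and scale each column to mean 0 and sample SD 1."""
    columns = list(zip(*rows))
    mus = [mean(c) for c in columns]
    sds = [stdev(c) for c in columns]
    return [[(v - m) / s for v, m, s in zip(row, mus, sds)] for row in rows]


def nearest_centroid(train_x, train_y, query):
    """Stand-in classifier (NOT LDA): pick the group with the closest mean."""
    groups = defaultdict(list)
    for row, label in zip(train_x, train_y):
        groups[label].append(row)

    def squared_distance(label):
        centroid = [mean(col) for col in zip(*groups[label])]
        return sum((a - b) ** 2 for a, b in zip(query, centroid))

    return min(groups, key=squared_distance)


def loo_confusion(x, y):
    """Leave-one-out cross-validation; returns counts[(true, predicted)]."""
    counts = defaultdict(int)
    for i in range(len(x)):
        # Train on every sample except the i-th, then predict the i-th.
        predicted = nearest_centroid(x[:i] + x[i + 1:], y[:i] + y[i + 1:], x[i])
        counts[(y[i], predicted)] += 1
    return dict(counts)


def ratio(numerator, denominator):
    """Division that returns NaN instead of failing on an empty denominator."""
    return numerator / denominator if denominator else float("nan")


def class_metrics(counts, positive):
    """One-vs-rest metrics for the group named `positive`."""
    tp = fn = fp = tn = 0
    for (true, predicted), n in counts.items():
        if true == positive and predicted == positive:
            tp += n
        elif true == positive:
            fn += n
        elif predicted == positive:
            fp += n
        else:
            tn += n
    sensitivity = ratio(tp, tp + fn)
    specificity = ratio(tn, tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": ratio(tp, tp + fp),
        "false_discovery_rate": ratio(fp, tp + fp),
        "balanced_accuracy": (sensitivity + specificity) / 2,
    }


# Made-up two-variable data for three hypothetical groups.
features = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10],
            [20, 0], [20, 1], [21, 0]]
labels = ["A"] * 3 + ["B"] * 3 + ["C"] * 3
confusion = loo_confusion(standardise(features), labels)
metrics_a = class_metrics(confusion, "A")
```

Calling `class_metrics` once per group gives the per-group figures; swapping `nearest_centroid` for a real LDA implementation would leave the cross-validation loop and the metric calculations unchanged.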

The statistical analysis was performed using the computing environment R (R Core Team, Vienna, Austria, 2016).

This is an open access article under the terms of the Creative Commons Attribution‐NonCommercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits use, distribution and reproduction in any medium, provided the original work is properly cited and is not used for commercial purposes.
