2.6 Partial Least Squares Regression (PLSR) model comparison

R. Zeng; J. P. Zhang; K. Cai; W. C. Gao; W. J. Pan; C. Y. Jiang; P. Y. Zhang; B. W. Wu; C. H. Wang; X. Y. Jin; D. C. Li



This method is extracted from research article: PLoS One, Mar 2021

DOI: 10.1371/journal.pone.0247028


The optimum similarity indices, selected according to the agreement between spectral and compositional similarity, were then evaluated in terms of their predictive power. For each sample in the test dataset, all six properties were predicted by PLSR models built on the similar samples matched from the training dataset by each of the five similarity indices.
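The matching step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names (`select_similar`, `euclidean_distance`) are hypothetical, and Euclidean distance is used only as a stand-in for the five similarity indices, which are defined elsewhere in the article.

```python
import math
from typing import Callable, List, Sequence

Spectrum = Sequence[float]


def euclidean_distance(a: Spectrum, b: Spectrum) -> float:
    """Placeholder dissimilarity (an assumption): smaller means more similar."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def select_similar(
    test_spectrum: Spectrum,
    train_spectra: Sequence[Spectrum],
    k: int,
    dissimilarity: Callable[[Spectrum, Spectrum], float] = euclidean_distance,
) -> List[int]:
    """Return the indices of the k training spectra most similar to test_spectrum."""
    if k <= 0:
        raise ValueError("k must be positive")
    # Rank every training sample by its dissimilarity to the test sample.
    ranked = sorted(
        range(len(train_spectra)),
        key=lambda i: dissimilarity(test_spectrum, train_spectra[i]),
    )
    # If k exceeds the training set size, all samples are returned.
    return ranked[:k]


# Toy spectra with three wavelengths each (made-up values for illustration).
train = [
    [0.10, 0.20, 0.30],
    [0.90, 0.80, 0.70],
    [0.12, 0.19, 0.33],
    [0.50, 0.50, 0.50],
]
test_sample = [0.11, 0.20, 0.31]

neighbours = select_similar(test_sample, train, k=2)
print(neighbours)  # [0, 2]
```

The returned indices identify the training rows on which a local PLSR model would then be fitted for that one test sample. Passing a different `dissimilarity` callable is how another similarity index could be swapped in under this sketch.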

The number of similar samples selected from the training dataset strongly affects model performance [19]; although important, this was not the focus of the present research. Different sizes (n = 5, 10, 20, 30, 40, 50, 100, 150, 200, 250, 300, 400 and 500) were tested for the prediction of SOC. Predictive accuracy peaked and stabilized at about n = 250; thus, this size was used for all subsequent analyses of the other physicochemical properties (S1 Fig). PLSR model performance was evaluated by the ratio of percent deviation (RPD; Eq 7):

$$\mathrm{RPD} = \frac{\mathrm{SD}}{\mathrm{RMSEP}} \tag{7}$$

where SD is the standard deviation of the observed property values for the test dataset, and RMSEP is the root mean square error of prediction (see Eq 4). For each sample in the test dataset, the 250 most similar samples in spectral space (as evaluated by the different similarity indices) were selected from the training dataset to build the PLSR models for prediction of the soil properties. We followed the criteria proposed by Chang and Laird [20] to evaluate the performance of the PLSR models: (1) RPD < 1.4, the model is not able to predict the target property; (2) 1.4 ≤ RPD < 2.0, moderate model predictive performance; and (3) 2.0 ≤ RPD < 2.5, the model can predict the target property well.
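As a worked illustration of the RPD and the three categories above, the following sketch computes SD, RMSEP and their ratio for a small set of made-up values. Several details are assumptions rather than statements from the article: the function names are hypothetical, SD is taken as the sample standard deviation (n - 1 denominator), RMSEP is taken as the usual root mean square error with an n denominator (Eq 4 is not reproduced in this section), and values of 2.5 or more, which the listed ranges do not cover, are given their own label.

```python
import math
import statistics
from typing import Sequence


def rmsep(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean square error of prediction (assumed form: divide by n)."""
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("observed and predicted must be non-empty and equal in length")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(observed))


def rpd(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """RPD = SD of the observed values / RMSEP (SD assumed to be the sample SD)."""
    return statistics.stdev(observed) / rmsep(observed, predicted)


def rpd_category(value: float) -> str:
    """Map an RPD value onto the three ranges listed in the text."""
    if value < 1.4:
        return "not able to predict"
    if value < 2.0:
        return "moderate"
    if value < 2.5:
        return "predicts well"
    # Not covered by the ranges quoted in the text; labelled separately here.
    return "above the listed ranges"


# Made-up observed and predicted values; every prediction is off by exactly 1.
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [2.0, 1.0, 4.0, 3.0, 6.0]

error = rmsep(observed, predicted)   # 1.0
score = rpd(observed, predicted)     # sqrt(2.5), about 1.58
print(error, round(score, 2), rpd_category(score))
```

With these toy values the RPD falls in the second range (moderate predictive performance). Whether the sample or population standard deviation is used changes the RPD slightly for small test sets, so that choice should be checked against the original article before comparing numbers.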

This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
