NIR spectra in our study were pre-processed by standard normal variate (SNV) transformation followed by second-derivative calculation using Savitzky–Golay smoothing [43] with a window size of 17 data points. Partial least squares regression (PLSR) was used for model calibration with leave-one-out cross-validation. The coefficient of determination (R²) and root-mean-square error (RMSE) for both calibration and validation were used to track model performance. Model fitting was repeated 100 times, each time randomly assigning 80% of the data set to calibration and the remaining 20% to validation. These randomized analyses allowed the uncertainty of the prediction model and its overall stability to be assessed. R² and RMSE were collected for each random selection to assess the error of the 100 calibration and validation models. The variables in the NIR spectra that best explain the relationship between the spectra and the response chemical components were selected using a filter method, the significance Multivariate Correlation (sMC) algorithm, with a significance level of α = 0.05 [44]. This method first estimates the variation of the features from the PLSR model and then uses these estimates to identify the features that are significant for the PLSR model. The equations of the sMC algorithm are described in detail in other studies [44, 45].