Partial least squares discriminant analysis

Antonio Currà
Riccardo Gasbarrone
Alessandra Cardillo
Carlo Trompetto
Francesco Fattapposta
Francesco Pierelli
Paolo Missori
Giuseppe Bonifazi
Silvia Serranti

To classify and predict the two classes of muscle groups, we used PLS-DA, a statistical method that models the relationship between predictor and response variables [55,73]. It is essentially an inverse least squares approach to linear discriminant analysis, which is itself a multivariate inverse least squares discrimination method. In PLS-DA, PLS regression is used to build a model that predicts the class number of each sample under study [77,78]. To evaluate the performance of the classification model, we used the confusion matrix and the performance metrics commonly derived from it: Precision, Accuracy, Misclassification Error, Sensitivity, and Specificity [79,80].
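As an illustration of how the five metrics follow from a two-class confusion matrix, here is a minimal sketch in plain Python. The function names, the choice of "dorsal" as the positive class, and the toy labels are assumptions made for this example only; they are not taken from the study data or software.

```python
def _ratio(numerator, denominator):
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def confusion_counts(y_true, y_pred, positive):
    """Count true/false positives and negatives for a two-class problem."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def classification_metrics(y_true, y_pred, positive):
    """Compute the five metrics named in the text from the confusion matrix."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    accuracy = _ratio(tp + tn, tp + fp + tn + fn)
    return {
        "precision": _ratio(tp, tp + fp),
        "accuracy": accuracy,
        "misclassification_error": 1 - accuracy,
        "sensitivity": _ratio(tp, tp + fn),   # true positive rate
        "specificity": _ratio(tn, tn + fp),   # true negative rate
    }


# Toy labels, invented for illustration (not study results).
y_true = ["dorsal"] * 4 + ["ventral"] * 4
y_pred = ["dorsal", "dorsal", "dorsal", "ventral",
          "ventral", "ventral", "ventral", "dorsal"]

metrics = classification_metrics(y_true, y_pred, positive="dorsal")
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

With two classes, swapping the positive class exchanges Sensitivity and Specificity, so reports of these metrics should state which class was treated as positive.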

To discriminate individual spectra acquired from the dorsal and ventral arm, the calibration set was built using the KNN score distance, computed from the PCA performed on the anthropometric data of the individual subjects under investigation [81]. The KNN score distance is essentially the average distance from each sample to its k nearest neighbors in score space. This procedure yields a more reliable calibration set because it includes the tails of the sample distribution [i.e., samples with KNN score distance (k = 3) > 0.433]. Hence, spectra collected from 14 subjects (70% of total spectra, n = 1400) were selected by the algorithm and included in the calibration set, whereas spectra from the remaining 6 subjects (30% of total spectra, n = 600) were included in the validation set. This PLS-DA model was cross-validated with the Venetian Blinds method, and two latent variables (LVs) were chosen using the whole noise-cleaned instrument spectral range (450–2500 nm).
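The KNN score distance described above can be sketched as follows. This is a minimal illustration that assumes Euclidean distance on unscaled PCA scores; the software used in the study may scale the scores differently. The function names and the toy score values are hypothetical, while k = 3 and the 0.433 threshold are the values quoted in the text.

```python
import math


def knn_score_distance(scores, k=3):
    """Average Euclidean distance from each sample to its k nearest
    neighbours in score space (the sample itself is excluded)."""
    if k >= len(scores):
        raise ValueError("k must be smaller than the number of samples")
    result = []
    for i, point in enumerate(scores):
        others = sorted(
            math.dist(point, other)
            for j, other in enumerate(scores)
            if j != i
        )
        result.append(sum(others[:k]) / k)
    return result


def select_tails(scores, k=3, threshold=0.433):
    """Indices of samples whose KNN score distance exceeds the threshold."""
    distances = knn_score_distance(scores, k)
    return [i for i, d in enumerate(distances) if d > threshold]


# Hypothetical PCA scores (PC1, PC2): four clustered samples and one outlier.
toy_scores = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (2.0, 2.0)]

print(select_tails(toy_scores))  # [4]
```

Samples far from their neighbours get a large distance, which is why a threshold on this quantity picks out the tails of the distribution.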

Separate PLS-DA classification models were set up on spectral range subsets, one per detector. According to the detection ranges of the three sensing units embedded in the ASD FieldSpec® device, the raw dataset (450–2500 nm) was split into three parts: VNIR (450–1000 nm), SWIR 1 (1001–1800 nm), and SWIR 2 (1801–2500 nm). Each split dataset underwent the same procedure described above: for each spectral region (VNIR, SWIR 1, and SWIR 2), 70% of the preprocessed individual spectra were included in the calibration set and the remaining 30% in the validation set. These PLS-DA models were cross-validated with the Venetian Blinds method; 2, 3, and 8 LVs were selected to model the VNIR, SWIR 1, and SWIR 2 datasets, respectively.
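The detector-wise split and the Venetian Blinds fold assignment can be sketched as below. The wavelength limits are those given in the text; everything else is an assumption for illustration: the 1 nm sampling step, the function names, the number of cross-validation splits (not stated in this section), and a blind thickness of one sample.

```python
# Detector ranges in nm, as listed in the text.
DETECTOR_RANGES_NM = {
    "VNIR": (450, 1000),
    "SWIR 1": (1001, 1800),
    "SWIR 2": (1801, 2500),
}


def split_by_detector(wavelengths, spectrum):
    """Split one spectrum into {region: (wavelengths, values)}."""
    parts = {}
    for name, (low, high) in DETECTOR_RANGES_NM.items():
        idx = [i for i, w in enumerate(wavelengths) if low <= w <= high]
        parts[name] = ([wavelengths[i] for i in idx],
                       [spectrum[i] for i in idx])
    return parts


def venetian_blinds(n_samples, n_splits):
    """Venetian Blinds folds: fold s holds out samples s, s + n_splits,
    s + 2 * n_splits, ... and trains on all the others."""
    folds = []
    for s in range(n_splits):
        test = list(range(s, n_samples, n_splits))
        train = [i for i in range(n_samples) if i % n_splits != s]
        folds.append((train, test))
    return folds


# Assumed 1 nm sampling over 450-2500 nm; the flat spectrum is a placeholder.
wavelengths = list(range(450, 2501))
spectrum = [0.5] * len(wavelengths)

parts = split_by_detector(wavelengths, spectrum)
print({name: len(w) for name, (w, _) in parts.items()})
# {'VNIR': 551, 'SWIR 1': 800, 'SWIR 2': 700}

# Hypothetical choice of 10 splits over the 1400 calibration spectra.
folds = venetian_blinds(n_samples=1400, n_splits=10)
print(len(folds), len(folds[0][1]))  # 10 140
```

Because Venetian Blinds holds out every n-th sample, the order of the spectra matters: replicate spectra stored next to each other end up in different folds, which can make the cross-validation error optimistic.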

A PLS-DA model was also applied to the preprocessed dataset of individual spectra (n = 2000, 450–2500 nm spectral range) to discriminate the subjects' sex. The calibration set was built by selecting 70% of the preprocessed spectral samples with the Kennard-Stone algorithm. To assess the optimal complexity of the model and to select the number of LVs, we used Venetian Blinds cross-validation [76]. Two LVs were chosen, and the model was validated using the remaining 30% of the preprocessed dataset of individual spectra.
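For readers unfamiliar with Kennard-Stone selection, here is a minimal sketch of the standard algorithm: start from the two most distant samples, then repeatedly add the sample whose distance to its closest already-selected sample is largest. Euclidean distance, the function name, and the toy points are assumptions for illustration; whether the study applied the algorithm to raw spectra or to scores is not stated here.

```python
import math


def kennard_stone(samples, n_select):
    """Return indices of n_select samples chosen by Kennard-Stone."""
    n = len(samples)
    if not 2 <= n_select <= n:
        raise ValueError("n_select must be between 2 and len(samples)")

    # Full pairwise distance matrix (fine for a sketch, O(n^2) memory).
    dist = [[math.dist(a, b) for b in samples] for a in samples]

    # Step 1: seed with the two samples that are farthest apart.
    first, second = max(
        ((i, j) for i in range(n) for j in range(i + 1, n)),
        key=lambda pair: dist[pair[0]][pair[1]],
    )
    selected = [first, second]
    remaining = [i for i in range(n) if i not in (first, second)]

    # Distance from each remaining sample to its nearest selected sample.
    nearest = {i: min(dist[i][first], dist[i][second]) for i in remaining}

    # Step 2: repeatedly add the sample farthest from the selected set.
    while len(selected) < n_select:
        candidate = max(remaining, key=lambda i: nearest[i])
        selected.append(candidate)
        remaining.remove(candidate)
        for i in remaining:
            nearest[i] = min(nearest[i], dist[i][candidate])
    return selected


# Toy one-dimensional samples, invented for illustration.
points = [(0.0,), (1.0,), (2.0,), (3.0,), (10.0,)]
calibration_idx = kennard_stone(points, n_select=3)
validation_idx = [i for i in range(len(points)) if i not in calibration_idx]

print(calibration_idx)  # [0, 4, 3]
print(validation_idx)   # [1, 2]
```

The selection is deterministic: the same data always give the same calibration set, and the extremes of the data are always included. For the 70/30 split described above, n_select would be 1400 of the 2000 spectra.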
