Partial least squares—discriminant analysis (PLS-DA)

Gabrielle Nepomuceno; Carolina Victoria Cruz Junho; Marcela Sorelli Carneiro-Ramos; Herculano da Silva Martinho


This method is extracted from research article: Sci Rep, Jul 2021

Tyrosine and Tryptophan vibrational bands as markers of kidney injury: a renocardiac syndrome induced by renal ischemia and reperfusion study

DOI: 10.1038/s41598-021-93762-z


All spectra were pre-processed to make them comparable for statistical analysis. The baseline was corrected using the least-squares polynomial curve-fitting method described by Lieber and Mahadevan-Jansen^¹⁷. All spectra were then normalized to the mean and scaled using Pareto scaling^¹⁸.
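The normalization and scaling steps can be illustrated with a minimal Python sketch (the original analysis was done in R, so this is not the authors' code). It assumes that "normalized to the mean" means dividing each spectrum by its own mean intensity, and that Pareto scaling means mean-centring each variable and dividing by the square root of its standard deviation. The function names and the toy data are hypothetical.

```python
import math
from statistics import mean, stdev


def normalize_to_mean(spectrum):
    """Divide one spectrum by its mean intensity (assumed reading of the text)."""
    m = mean(spectrum)
    return [v / m for v in spectrum]


def pareto_scale(spectra):
    """Pareto-scale a list of spectra (rows = samples, columns = wavenumbers).

    Each column is mean-centred and divided by the square root of its
    sample standard deviation. Constant columns are only centred.
    """
    scaled_cols = []
    for col in zip(*spectra):
        mu = mean(col)
        sd = stdev(col)
        denom = math.sqrt(sd) if sd > 0 else 1.0
        scaled_cols.append([(v - mu) / denom for v in col])
    # Transpose back to rows = samples.
    return [list(row) for row in zip(*scaled_cols)]


# Toy data: three spectra, three wavenumber channels (illustrative only).
spectra = [[1.0, 2.0, 3.0], [2.0, 2.0, 8.0], [3.0, 2.0, 13.0]]
scaled = pareto_scale(spectra)
```

Compared with unit-variance scaling, dividing by the square root of the standard deviation shrinks intense bands less aggressively, so the scaled data stay closer to the original spectral profile.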

PLS-DA was then performed. PLS is a supervised multivariate method that uses linear regression on the original variables to predict class membership (Sham, 8D, and 15D for the heart and kidney groups). Here, the PLS regression was performed with the plsr function from the R pls package^¹⁶,¹⁹. Classification and cross-validation were performed with the corresponding wrapper function in the caret package^¹⁹. A permutation test was performed to assess the performance of class discrimination. In each permutation, a PLS-DA model was built between the data and the permuted class labels, using the optimal number of components determined by leave-one-out cross-validation for the model based on the original class assignment.

Class discrimination performance was measured using classification accuracy, R², and Q². The first is based on prediction accuracy. R² is the “goodness of fit”, or explained variation, and is based on the ratio of the between-group sum of squares to the within-group sum of squares. Q² is the “goodness of prediction”, or predicted variation, and is calculated from cross-validation. In each round, the predicted data are compared with the original data, and the squared errors are summed over all samples to give the Predicted Residual Sum of Squares (PRESS). For convenience, the PRESS is divided by the initial sum of squares and subtracted from 1 so that it is on the same scale as R². Good predictions have a low PRESS and therefore a high Q², whereas a negative Q² means that the model is not predictive at all or is overfitted^²⁰–²².
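The Q² calculation and the permutation test described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the R workflow used in the study: the "initial sum of squares" is taken to be the total sum of squares of the response about its mean, and `score_fn` is a hypothetical callable that, in a real analysis, would refit and cross-validate the PLS-DA model on the labels it receives.

```python
import random


def q_squared(y_true, y_cv_pred):
    """Q2 = 1 - PRESS / TSS, using cross-validated predictions."""
    y_bar = sum(y_true) / len(y_true)
    press = sum((y - p) ** 2 for y, p in zip(y_true, y_cv_pred))
    tss = sum((y - y_bar) ** 2 for y in y_true)
    return 1.0 - press / tss


def permutation_p_value(labels, score_fn, n_perm=1000, seed=0):
    """Empirical p-value: share of label permutations that score at least
    as well as the original labels (with the usual +1 correction)."""
    rng = random.Random(seed)
    observed = score_fn(labels)
    permuted = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(permuted)
        if score_fn(permuted) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Predictions close to the true class codes give a Q2 near 1.
q2_good = q_squared([0, 0, 1, 1], [0.1, 0.2, 0.8, 0.9])
# Predictions that are systematically wrong give a negative Q2.
q2_bad = q_squared([0, 0, 1, 1], [1, 1, 0, 0])
```

A small p-value indicates that the discrimination achieved with the original class assignment is unlikely to arise from randomly assigned labels.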

Two quantifiers were used to measure the importance of each vibrational band frequency in the PLS-DA model. The first, Variable Importance in Projection (VIP), is a weighted sum of squares of the PLS loadings that takes into account the amount of explained spectral intensity variation in each dimension. The second is based on the weighted sum of the PLS regression coefficients, where the weights are a function of the reduction of the sums of squares across the number of PLS components. For multiple-group analysis, the same number of predictors is built for each group, and the average of the feature coefficients was used to indicate the overall coefficient-based importance.
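As a rough illustration of the VIP measure, the Python sketch below uses the commonly cited formula, in which the squared, normalized PLS weight of each variable is weighted by the sum of squares explained by each component. This is an assumption about the exact definition, since the text describes it only in words and refers to loadings. The function name, the argument layout, and the toy numbers are hypothetical.

```python
import math


def vip_scores(weights, ss_explained):
    """Variable Importance in Projection.

    weights[a][j]   : PLS weight of variable j on component a
    ss_explained[a] : sum of squares of the response explained by component a
    """
    n_vars = len(weights[0])
    total_ss = sum(ss_explained)
    scores = []
    for j in range(n_vars):
        acc = 0.0
        for w_a, ss_a in zip(weights, ss_explained):
            norm_sq = sum(w * w for w in w_a)  # normalize each weight vector
            acc += ss_a * (w_a[j] ** 2) / norm_sq
        scores.append(math.sqrt(n_vars * acc / total_ss))
    return scores


# Toy example: two components, three variables (illustrative numbers only).
vip = vip_scores([[0.6, 0.8, 0.0], [0.0, 0.0, 1.0]], [3.0, 1.0])
```

With this definition the squared VIP values average to 1 across variables, which is why values above 1 are often read as above-average importance.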

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
