The FTIR spectra were recorded on a Nicolet iS10 FT-IR spectrometer (Thermo Fisher Scientific, Waltham, MA, USA) equipped with a diamond crystal cell for attenuated total reflection (ATR) operation. The spectra were acquired (32 scans per sample or background) in the range of 4000–500 cm⁻¹ at a nominal resolution of 4 cm⁻¹ and were corrected using the background spectrum of air. The analysis was carried out at room temperature. For a measurement, a lyophilized sample was placed on the surface of the ATR crystal. Before each spectrum was acquired, the ATR crystal was carefully cleaned with wet cellulose tissue and dried using a flow of nitrogen gas. The cleaned crystal was checked spectrally to ensure that no residue was retained from the previous sample. For each sample, ten spectra were recorded to check the reproducibility and to allow a statistical analysis. In addition to FTIR, the samples were analyzed conventionally to determine the fatty acid and phenolic compound profiles in order to aid the interpretation of the spectra; see the next sub-sections for details.
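The reproducibility check over the ten replicate spectra could be sketched as follows. This is only an illustration under stated assumptions: the function name `replicate_statistics`, the 2 cm⁻¹ wavenumber grid, and the synthetic band at 1745 cm⁻¹ are hypothetical stand-ins, not the authors' data or code.

```python
import numpy as np

def replicate_statistics(spectra):
    """Mean spectrum and per-wavenumber sample standard deviation.

    spectra: array of shape (n_replicates, n_wavenumbers).
    """
    spectra = np.asarray(spectra, dtype=float)
    mean_spectrum = spectra.mean(axis=0)
    # ddof=1 gives the sample standard deviation across the replicates
    std_spectrum = spectra.std(axis=0, ddof=1)
    return mean_spectrum, std_spectrum

# Synthetic stand-in for ten replicate spectra (assumption, not measured data)
rng = np.random.default_rng(0)
wavenumbers = np.arange(4000, 499, -2.0)   # cm^-1, hypothetical grid from 4000 to 500
band = np.exp(-((wavenumbers - 1745.0) / 15.0) ** 2)
replicates = band + rng.normal(scale=0.01, size=(10, wavenumbers.size))

mean_spectrum, std_spectrum = replicate_statistics(replicates)
```

A small standard deviation relative to the band intensities would indicate that the replicate measurements agree well.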
The FTIR spectra were evaluated in two ways: a qualitative analysis of the spectra and a discrimination analysis.
As a first step, the spectra were analyzed with respect to the spectral band positions in order to identify the signatures of the major functional groups. The main bands were assigned by analyzing the acquired spectra and comparing them with the literature.
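Locating band positions before comparing them with the literature can be illustrated with a simple local-maximum search. The sketch below is an assumption-laden example: the helper `find_band_positions`, the height threshold, and the two synthetic bands are hypothetical and are not taken from the study. The assignment itself would still be made by comparison with literature values.

```python
import numpy as np

def find_band_positions(wavenumbers, absorbance, min_height):
    """Return the wavenumbers of strict local maxima with absorbance >= min_height."""
    wavenumbers = np.asarray(wavenumbers, dtype=float)
    a = np.asarray(absorbance, dtype=float)
    interior = a[1:-1]
    # A point is a peak if it exceeds both neighbours and the height threshold
    is_peak = (interior > a[:-2]) & (interior > a[2:]) & (interior >= min_height)
    return wavenumbers[1:-1][is_peak]

# Noise-free synthetic spectrum with two bands (illustrative positions only)
wavenumbers = np.arange(4000, 499, -2.0)   # cm^-1, hypothetical grid
spectrum = (0.8 * np.exp(-((wavenumbers - 2920.0) / 20.0) ** 2)
            + 1.0 * np.exp(-((wavenumbers - 1744.0) / 15.0) ** 2))

positions = find_band_positions(wavenumbers, spectrum, min_height=0.1)
```

For measured spectra, some smoothing or a higher threshold would probably be needed so that noise is not mistaken for bands.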
In the second step, principal component analysis (PCA) was applied to the dataset. PCA is a statistical method that reduces the dimensionality of a dataset by calculating the eigenvalue decomposition of the covariance matrix [47,48,49,50]. In other words, it identifies the spectral signatures that represent the variance of the dataset. The results of a PCA are commonly discussed in terms of scores and loadings. The scores are the transformed variable values corresponding to a particular data point, and the loadings are the weights by which each original variable is multiplied to obtain the score. For a practical analysis, scores and loadings plots are produced. The scores plot visualizes the scores with respect to the different principal components (PCs). A clustering of the data points in such a plot suggests that they exhibit spectral similarities and hence that the corresponding samples can be assigned to a common category. The loadings of the individual PCs, on the other hand, can be plotted as a function of wavenumber. The resulting spectra show characteristic signatures that allow a discrimination between the different categories. However, care must be taken when deciding how many PCs to consider. If a dataset's variance is mainly represented by two PCs, the higher components represent predominantly noise, and including them may lead to over-interpretation of the results. The signal-to-noise ratio of the loadings plot is a good indicator of whether or not a PC should be included in the analysis.
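The relationship between the covariance matrix, the loadings, and the scores can be made concrete with a short NumPy sketch. It follows the textbook covariance-based formulation, which implies mean-centering of the data. The function name `pca_covariance` and the two synthetic sample categories are assumptions made for illustration only.

```python
import numpy as np

def pca_covariance(X):
    """PCA via eigendecomposition of the covariance matrix.

    X: array of shape (n_samples, n_variables); each row is one spectrum.
    Returns scores, loadings (columns are PCs), and explained variance ratios.
    """
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                  # centering is implied by the covariance
    cov = np.cov(Xc, rowvar=False)           # shape (n_variables, n_variables)
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]        # sort PCs by decreasing variance
    eigvals, loadings = eigvals[order], eigvecs[:, order]
    scores = Xc @ loadings                   # project each spectrum onto the PCs
    explained = eigvals / eigvals.sum()
    return scores, loadings, explained

# Two synthetic categories with ten replicates each (assumption, not measured data)
rng = np.random.default_rng(1)
w = np.linspace(1800.0, 1600.0, 201)         # cm^-1, hypothetical window
g1 = np.exp(-((w - 1745.0) / 15.0) ** 2)
g2 = np.exp(-((w - 1650.0) / 15.0) ** 2)
group_a = 1.0 * g1 + rng.normal(scale=0.01, size=(10, w.size))
group_b = 0.6 * g1 + 0.5 * g2 + rng.normal(scale=0.01, size=(10, w.size))
X = np.vstack([group_a, group_b])

scores, loadings, explained = pca_covariance(X)
```

Plotting `scores[:, 0]` against `scores[:, 1]` would give the scores plot, and plotting a column of `loadings` against `w` would give the corresponding loadings spectrum. In this synthetic case the two categories separate along the first PC.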
In the present work, the PCA algorithm implemented in Matlab R2012 was used without initial data centering in order to keep the method as simple as possible.
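A PCA without initial centering can be sketched as a singular value decomposition of the raw data matrix. The example below is a minimal NumPy illustration, not the MATLAB routine used in this work. The function name `pca_uncentered` and the synthetic data are assumptions.

```python
import numpy as np

def pca_uncentered(X, n_components=2):
    """PCA without mean-centering, computed via SVD of the raw data matrix.

    X: array of shape (n_samples, n_variables); each row is one spectrum.
    """
    X = np.asarray(X, dtype=float)
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    loadings = Vt[:n_components].T                     # columns are PCs
    scores = U[:, :n_components] * s[:n_components]    # equals X @ loadings
    # Share of the total sum of squares, not of the variance about the mean
    share = (s ** 2 / np.sum(s ** 2))[:n_components]
    return scores, loadings, share

# Two synthetic categories with ten replicates each (assumption, not measured data)
rng = np.random.default_rng(2)
w = np.linspace(1800.0, 1600.0, 201)                   # cm^-1, hypothetical window
g1 = np.exp(-((w - 1745.0) / 15.0) ** 2)
g2 = np.exp(-((w - 1650.0) / 15.0) ** 2)
X = np.vstack([
    1.0 * g1 + rng.normal(scale=0.01, size=(10, w.size)),
    0.6 * g1 + 0.5 * g2 + rng.normal(scale=0.01, size=(10, w.size)),
])

scores, loadings, share = pca_uncentered(X)

# Without centering, the first loading tends to resemble the mean spectrum
mean_spectrum = X.mean(axis=0)
cosine = abs(loadings[:, 0] @ mean_spectrum) / np.linalg.norm(mean_spectrum)
```

One consequence of skipping the centering step is that the first PC largely describes the average spectrum, so differences between categories may show up mainly in the second and higher PCs.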