The raw data acquired by HILIC UHPLC-Q-TOF MS were converted to the common .mzXML format using ProteoWizard software (Palo Alto, CA, USA). The XCMS program was then used for peak alignment, retention time correction, and peak area extraction. Metabolite structures were identified by accurate mass matching (<25 ppm) and secondary (MS/MS) spectrum matching against the laboratory's self-built database.
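The accurate mass criterion can be illustrated with a minimal sketch. It assumes the mass error is expressed as the signed deviation from the theoretical m/z in parts per million; the function names, compound names, and m/z values below are hypothetical and are not taken from the laboratory's database.

```python
def ppm_error(observed_mz, theoretical_mz):
    """Signed mass error in parts per million relative to the theoretical m/z."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def match_by_mass(observed_mz, library, tol_ppm=25.0):
    """Return names of library entries whose m/z lies within tol_ppm of the observed m/z."""
    return [
        name
        for name, mz in library.items()
        if abs(ppm_error(observed_mz, mz)) < tol_ppm
    ]


# Hypothetical library entries (illustrative values only).
library = {
    "compound_A": 180.0634,
    "compound_B": 180.0700,
    "compound_C": 181.0707,
}

# An observed ion at m/z 180.0650 is about 8.9 ppm from compound_A
# and about 27.8 ppm from compound_B, so only compound_A passes.
hits = match_by_mass(180.0650, library)
```

In practice a mass match alone only yields candidates; the secondary spectrum comparison is what confirms the structure.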
For the dataset extracted by XCMS, ion peaks with >50% missing values within a group were removed. Multivariate statistical analyses, including unsupervised principal component analysis (PCA), partial least squares discriminant analysis (PLS-DA), and orthogonal partial least squares discriminant analysis (OPLS-DA), were performed with SIMCA-P 14.1 (Umetrics, Umeå, Sweden). The model parameters R2Y and Q2Y were inspected to assess the goodness of fit and the predictive ability of the models. In addition, variable importance in projection (VIP) analysis was conducted on the processed data using the standard algorithms in the PLS-DA toolbox. Metabolites with a VIP value >1 were regarded as the most influential variables in the PLS-DA models. A volcano plot combining fold change analysis (FC > 2) and the t-test (p < 0.05) was constructed from the UHPLC-Q-TOF MS data to show the up/down accumulation trends of the discriminant markers identified by multivariate statistics. Finally, the related metabolic pathways were analyzed based on the Kyoto Encyclopedia of Genes and Genomes (KEGG) database. Basic significance testing was carried out by one-way analysis of variance (ANOVA) with SPSS software (Version 19.0; Chicago, IL, USA). Differences were considered statistically significant at p < 0.05.
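The missing-value filter and the volcano-plot thresholds can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: a feature is dropped here if any single group exceeds 50% missing values (the text does not say whether "within the group" means any group or every group), down-accumulation is taken as FC < 1/2 (the text states only FC > 2), and the p-value is assumed to come from a t-test run elsewhere. All function names are hypothetical.

```python
def keep_feature(values_by_group, max_missing=0.5):
    """Return False if any group has more than max_missing of its values missing (None)."""
    for values in values_by_group.values():
        missing = sum(v is None for v in values)
        if missing / len(values) > max_missing:
            return False
    return True


def classify(mean_case, mean_control, p_value, fc_cutoff=2.0, alpha=0.05):
    """Label a feature 'up', 'down', or 'ns' from its fold change and p-value.

    Assumption: the down-accumulated side mirrors the FC > 2 cutoff as FC < 1/2.
    """
    fold_change = mean_case / mean_control
    if p_value < alpha and fold_change > fc_cutoff:
        return "up"
    if p_value < alpha and fold_change < 1.0 / fc_cutoff:
        return "down"
    return "ns"


# Exactly 50% missing in group "a" is not more than 50%, so the feature is kept.
kept = keep_feature({"a": [1.0, None, None, 2.0], "b": [1.0, 2.0, 3.0, 4.0]})

# Fold change 10 / 4 = 2.5 with p = 0.01 passes both thresholds.
label = classify(mean_case=10.0, mean_control=4.0, p_value=0.01)
```

Features passing these univariate thresholds would then be intersected with those having VIP > 1 from the multivariate model, which is the combination the text describes.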