Data analysis was performed using R version 4.0.2 (2020-06-22). All Luminex data were log-transformed, and all features were scaled. Comparisons between vaccination arms were performed using a Mann-Whitney U test followed by a Benjamini-Hochberg (BH) correction. Antigen responses (such as D614G versus Alpha) were compared using the Wilcoxon signed-rank test followed by BH correction. For RBD-specific antibody depletion, all data were Z-scored to visualize and compare differences in pre- and post-depletion functional results, and comparisons between samples were performed using a paired t-test.
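As a hedged illustration of the BH correction step, the sketch below implements the standard step-up adjustment in plain Python rather than R. The function name and the example p-values are hypothetical and are not taken from the study; in R, the same adjustment is available as `p.adjust(p, method = "BH")`.

```python
def benjamini_hochberg(p_values):
    """Return BH-adjusted p-values in the same order as the input."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values
    # never increase as the raw p-value decreases (monotonicity).
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values from four feature-wise comparisons.
raw = [0.01, 0.04, 0.03, 0.005]
adjusted = benjamini_hochberg(raw)
```

Each raw p-value is multiplied by the number of tests divided by its rank, and the running minimum keeps the adjusted values consistent with the ordering of the raw ones.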
Prior to any multivariate analysis, all data were normalized using Z-scoring. Multivariate classification models were trained to discriminate between individuals vaccinated with BNT162b2 and individuals vaccinated with mRNA-1273 using all the measured antibody responses. A PCA model was constructed using all antibody variables, including antibody titer, FcR measurements, and effector functions. Maximum separation was achieved in the two-dimensional space of PC1 (37.8%) versus PC2 (11.6%). A LASSO-PLS-DA model was built by combining the least absolute shrinkage and selection operator (LASSO) for feature selection with partial least squares discriminant analysis (PLS-DA) for classification on the LASSO-selected features. Models were calculated and figures were generated using the R packages “ropls” version 1.20.0 (64) and “glmnet” version 4.0.2. Specifically, because antibody features are highly correlated (for example, IgG titers typically correlate with antibody effector functions), LASSO first captures the overall correlational structure of the data and identifies clusters of highly correlated features. LASSO then selects a single feature, or a minimal number of features, from each cluster that best captures variation in that cluster. The algorithm penalizes the selection of any additional features, aiming to use as few features as possible to define whether multivariate profiles differ across the groups. This reduced feature selection avoids statistical anomalies due to the over-representation of features that track together. The resulting minimal set of features, which best explains variation in the overall antibody profiles in the sample set, is then used to determine whether groups exhibit similar or different profiles using PLS-DA classification. LASSO was repeated 100 times, and features selected in at least 90 of the 100 repetitions were retained as the selected features.
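The repeat-and-threshold rule described above (keep features chosen in at least 90 of 100 LASSO repetitions) can be sketched as follows. This is a minimal illustration in plain Python that covers only the counting step: it assumes each repetition has already produced a set of selected feature names (for example, from a glmnet fit), and the function name, feature names, and simulated selections are hypothetical.

```python
from collections import Counter


def stable_features(selections, min_count):
    """Return features selected in at least `min_count` repetitions.

    `selections` is a list with one collection of feature names per
    LASSO repetition; each feature is counted at most once per repetition.
    """
    counts = Counter(f for run in selections for f in set(run))
    return sorted(f for f, c in counts.items() if c >= min_count)


# Hypothetical selections from 100 repetitions (not study data):
# "IgG1_RBD" is chosen every time, "FcR2A_S" in 90 runs, "ADCP" in 50.
runs = []
for i in range(100):
    chosen = {"IgG1_RBD"}
    if i % 10 != 0:
        chosen.add("FcR2A_S")
    if i % 2 == 0:
        chosen.add("ADCP")
    runs.append(chosen)

selected = stable_features(runs, min_count=90)
```

Thresholding on selection frequency keeps only features that LASSO picks consistently, so a feature that enters the model in only half of the repetitions is dropped.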
A PLS-DA classifier was then applied to the training set using the selected features, and prediction accuracy was recorded. Model accuracy was then further assessed using ten-fold cross-validation. For each test fold, LASSO-based feature selection was performed using logistic regression on the training set for that fold. Selected features were ordered according to their Variable Importance in Projection (VIP) score, and the first two latent variables (LVs) of the PLS-DA model were used to visualize the samples. A co-correlate network analysis was carried out to identify features that correlate highly with the LASSO-selected features and are thus potentially equally important for discriminating between samples from individuals with each vaccination type. Correlations for the co-correlate network were computed using the Spearman method followed by a BH correction for multiple comparisons (65). The co-correlate network was generated using the R package “network” version 1.16.0 (66). All other figures were generated using ggplot2 (67).
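To make the co-correlate step concrete, the sketch below computes Spearman coefficients in plain Python and lists the non-selected features that correlate strongly with a selected feature. It is an illustration under stated assumptions, not the study's pipeline: the function names, feature names, example values, and the `min_abs_rho` cutoff are all hypothetical, the source does not state a correlation cutoff, and the p-value calculation and BH correction are omitted here.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # 1-based average rank of the tie block
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the ranks (non-constant inputs)."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


def co_correlate_edges(data, selected, min_abs_rho):
    """Link each selected feature to non-selected features with |rho| >= cutoff."""
    edges = []
    for s in selected:
        for name, values in data.items():
            if name in selected:
                continue
            rho = spearman(data[s], values)
            if abs(rho) >= min_abs_rho:
                edges.append((s, name, rho))
    return edges


# Hypothetical measurements for five samples (not study data).
data = {
    "IgG1_RBD": [1.0, 2.0, 3.0, 4.0, 5.0],
    "ADCP": [2.1, 2.9, 4.2, 8.0, 9.5],
    "IgA_N": [3.0, 1.0, 4.0, 2.0, 5.0],
}
edges = co_correlate_edges(data, selected=["IgG1_RBD"], min_abs_rho=0.7)
```

In this toy data, "ADCP" rises monotonically with "IgG1_RBD" and is linked to it, while "IgA_N" (rho of 0.5) falls below the assumed cutoff and is left out of the network.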