Statistics

Paulina Kaplonek; Deniz Cizmeci; Stephanie Fischinger; Todd Suscovich; Caitlyn Linde; Thomas Broge; Colin Mann; Fatima Amanat; Diana Dayal; Justin Rhee; Michael de St. Aubin; Eric J. Nilles; Elon R. Musk; Erica Ollmann Saphire; Florian Krammer; Douglas A. Lauffenburger; Dan H. Barouch

This method is extracted from research article: Sci Transl Med, Mar 2022

mRNA-1273 and BNT162b2 COVID-19 vaccines elicit antibodies with differences in Fc-mediated effector functions

DOI: 10.1126/scitranslmed.abm2311

Data analysis was performed using R version 4.0.2 (2020-06-22). All Luminex data were log-transformed, and all features were scaled. Comparisons between vaccination arms were performed using a Mann-Whitney U test followed by Benjamini-Hochberg (BH) correction. Antigen responses (such as D614G to Alpha) were compared using the Wilcoxon signed-rank test followed by BH correction. For RBD-specific antibody depletion, all data were Z-scored to visualize and compare differences in pre- and post-depletion functional results, and comparisons between samples were performed using a paired t test.
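As a rough illustration of the preprocessing and multiple-testing steps described above, the sketch below log-transforms and scales one feature and applies a BH adjustment to a list of raw p values. It is written in plain Python rather than R and is not the authors' code: the base-10 logarithm, the function names, and the example values are assumptions made for illustration. The BH step follows the standard step-up procedure, which is the quantity R reports from p.adjust with method = "BH".

```python
import math
from statistics import mean, stdev


def log_zscore(values):
    """Log-transform (base 10 is an assumption), then center and scale.

    Scaling uses the sample standard deviation (n - 1 denominator).
    """
    logged = [math.log10(v) for v in values]
    mu, sd = mean(logged), stdev(logged)
    return [(v - mu) / sd for v in logged]


def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p values, returned in the input order."""
    m = len(pvalues)
    # Walk from the largest p value to the smallest, keeping a running minimum
    # so the adjusted values stay monotone in the raw p values.
    order = sorted(range(m), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for position, idx in enumerate(order):
        rank = m - position  # 1-based rank in ascending order
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p values, one per antibody feature compared between arms.
raw_p = [0.01, 0.04, 0.03, 0.005]
print(bh_adjust(raw_p))
print(log_zscore([10, 100, 1000]))
```

The raw p values themselves would come from the Mann-Whitney U or Wilcoxon signed-rank tests named in the text; those tests are not reimplemented here.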

Prior to any multivariate analysis, all data were normalized using Z-scoring. Multivariate classification models were trained to discriminate between individuals vaccinated with BNT162b2 and individuals vaccinated with mRNA-1273 using all the measured antibody responses. A PCA model was constructed using all antibody variables, including antibody titer, FcR measurements, and effector functions. Maximum separation was achieved in the two-dimensional space of PC1 (37.8%) versus PC2 (11.6%).

A LASSO-PLS-DA model was built by combining the least absolute shrinkage and selection operator (LASSO) for feature selection with partial least squares discriminant analysis (PLS-DA) for classification on the LASSO-selected features. Models were calculated and figures were generated using the R packages “ropls” version 1.20.0 (64) and “glmnet” version 4.0.2.

Specifically, because antibody features are highly correlated (for example, IgG titers typically correlate with antibody effector functions), LASSO first captures the overall correlational structure of the data and identifies clusters of highly correlated features. LASSO then selects a single feature, or a minimal number of features, from each cluster that best captures variation in that data group. The algorithm penalizes the selection of any additional features, aiming to use as few features as possible to define whether multivariate profiles differ across the groups. This reduced feature selection avoids statistical anomalies due to the over-representation of features that track together. The resulting minimal set of features, which best explains variation in the overall antibody profiles in the sample set, is then used to determine whether groups exhibit similar or different profiles using PLS-DA classification. LASSO was repeated 100 times, and features selected at least 90 times out of 100 were identified as selected features.

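The repeat-and-threshold rule in the last sentence can be sketched as a simple tally. The example below is a minimal illustration in plain Python, not the authors' implementation: the LASSO fit itself is replaced by a hypothetical stand-in (toy_selector), the feature names are invented, and the source does not state what varies between the 100 repeats, so the run index here is only a placeholder for that source of variation.

```python
from collections import Counter


def stable_features(select_once, n_repeats=100, min_count=90):
    """Keep features chosen in at least min_count of n_repeats selection runs.

    select_once(run) stands in for one LASSO fit and must return the names
    of the features with nonzero coefficients in that run.
    """
    counts = Counter()
    for run in range(n_repeats):
        counts.update(set(select_once(run)))  # count each feature once per run
    return sorted(name for name, n in counts.items() if n >= min_count)


def toy_selector(run):
    """Hypothetical stand-in for a LASSO fit; feature names are made up."""
    chosen = ["IgG1_S"]               # selected in every run (100 of 100)
    if run % 20 != 0:
        chosen.append("FcgR2a_RBD")   # selected in 95 of 100 runs
    if run % 2 == 0:
        chosen.append("IgA_NTD")      # selected in 50 of 100 runs
    return chosen


print(stable_features(toy_selector))  # ['FcgR2a_RBD', 'IgG1_S']
```

In a real analysis, select_once would wrap a penalized regression fit (glmnet in the source) and return the features with nonzero coefficients.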
A PLS-DA classifier was then applied to the training set using the selected features, and prediction accuracy was recorded. Model accuracy was then further assessed using ten-fold cross-validation. For each test fold, LASSO-based feature selection was performed with logistic regression on the training set for that fold. Selected features were ordered according to their Variable Importance in Projection (VIP) score, and the first two latent variables (LVs) of the PLS-DA model were used to visualize the samples. A co-correlate network analysis was carried out to identify features that correlate highly with the LASSO-selected features and are thus potentially equally important for discriminating the samples from individuals with each vaccination type. Correlations for the co-correlate network were computed using the Spearman method followed by BH correction for multiple comparisons (65). The co-correlate network was generated using the R package “network” version 1.16.0 (66). All other figures were generated using ggplot2 (67).
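To make the co-correlate step more concrete, the sketch below computes Spearman correlations between selected features and all remaining features and keeps the strong pairs as network edges. It is a plain-Python illustration under stated assumptions, not the authors' pipeline: the feature names and values are invented, and the cutoff min_abs_rho is a hypothetical simplification standing in for the BH-corrected significance filter described in the text, which is not reproduced here.

```python
from statistics import mean


def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def co_correlates(selected, data, min_abs_rho=0.7):
    """Edges (selected feature, other feature, rho) above a hypothetical cutoff."""
    edges = []
    for s in selected:
        for name, values in data.items():
            if name in selected:
                continue  # only link selected features to non-selected ones
            rho = spearman(data[s], values)
            if abs(rho) >= min_abs_rho:
                edges.append((s, name, rho))
    return edges


# Invented measurements for six samples.
data = {
    "IgG1_S": [1, 2, 3, 4, 5, 6],
    "ADCP_S": [2, 1, 4, 3, 6, 5],
    "IgA_NTD": [3, 1, 6, 2, 5, 4],
}
print(co_correlates(["IgG1_S"], data))
```

With these toy values only ADCP_S passes the cutoff, so the network would contain a single edge from IgG1_S to ADCP_S.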

This is an open-access article distributed under the terms of the Creative Commons Attribution license, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
