Statistical Analysis

Maria J. Iglesias
Larissa D. Kruse
Laura Sanchez-Rivera
Linnea Enge
Philip Dusart
Mun-Gwan Hong
Mathias Uhlén
Thomas Renné
Jochen M. Schwenk
Göran Bergstrom
Jacob Odeberg
Lynn M. Butler

Median fluorescence intensity, obtained as the readout from the FlexMap 3D instrument (Luminex Corp), was processed and visualized in the R statistical computing software (v 3.1.2 and 3.5.1, respectively) unless stated otherwise. At least 32 beads per antibody/bead region were required for inclusion in the analysis. Outlier samples were identified by robust principal component analysis and excluded from further analysis.41 Median fluorescence intensity data were normalized by (1) probabilistic quotient normalization, to account for potential sample dilution effects,42 and (2) multidimensional MA (M=log ratio; A=mean average, scales) normalization, to minimize differences among sample subgroups arising from experimental factors such as multiple batches.43 Log-transformation was applied to reduce right-skewness in the proteomic data distribution. To identify differences in protein profiles and their association with CVD risk factors, we fitted a linear regression model for each antibody, adjusted for age and sex, to test its association with each variable of interest (eg, BMI, hypertension, diabetes). Analysis was performed on the whole cohort (N=1005) and on sex-stratified subgroups (female n=507, male n=498). Protein candidates were denoted as associated with a CVD risk factor in all when (1) the association passed the Bonferroni-corrected threshold (P<0.05/216=2.31×10⁻⁴) in the full-cohort analysis and (2) P<0.05 in both sex-stratified subgroups. Protein candidates were denoted as predominantly associated with a CVD risk factor in females or males when (1) the sex-risk factor interaction was significant (P<0.05; see Table I, Tab_3, Table B in the Data Supplement), (2) there was an association in one sex (P<0.01), and (3) there was no association with the same risk factor in the other sex (P>0.05; see Table I, Tab_3, Table A in the Data Supplement). All associations were tested using linear regression.
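Two of the steps described above can be illustrated with a minimal Python sketch. It is not the authors' R code. The function names `pqn_normalize` and `classify_association` are hypothetical. The normalization shown is a simplified form of probabilistic quotient normalization that assumes the reference profile is the per-feature median across samples, a choice the text does not specify. The classification function encodes only the P value thresholds stated in the paragraph.

```python
from statistics import median

# Bonferroni-corrected threshold quoted in the text: 0.05 / 216 tests.
BONFERRONI_ALPHA = 0.05 / 216


def pqn_normalize(samples):
    """Simplified probabilistic quotient normalization.

    `samples` is a list of equal-length lists of positive intensities,
    one inner list per sample. Assumption: the reference profile is the
    per-feature median across all samples.
    """
    n_features = len(samples[0])
    reference = [median(s[j] for s in samples) for j in range(n_features)]
    normalized = []
    for s in samples:
        # Quotient of each feature against the reference profile.
        quotients = [s[j] / reference[j] for j in range(n_features)]
        # The median quotient is taken as the sample's dilution factor.
        factor = median(quotients)
        normalized.append([x / factor for x in s])
    return normalized


def classify_association(p_all, p_female, p_male, p_interaction):
    """Apply the decision rules from the text to one protein/risk-factor pair.

    Returns "all", "female", "male", or "none".
    """
    if p_all < BONFERRONI_ALPHA and p_female < 0.05 and p_male < 0.05:
        return "all"
    if p_interaction < 0.05:
        if p_female < 0.01 and p_male > 0.05:
            return "female"
        if p_male < 0.01 and p_female > 0.05:
            return "male"
    return "none"
```

For instance, a sample whose intensities are uniformly twice those of the reference profile is divided by a factor of 2, which is how the method compensates for a dilution difference between samples.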

CVD risk factors were analyzed as continuous variables or categorized according to the Framingham study risk score tables. The Framingham risk score (FRS) for each subject was calculated using the previously described formula,44 which requires the following information: age (years), sex (male/female), total cholesterol (CHO; mg/dL), HDL (high-density lipoprotein) cholesterol (mg/dL), current smoking (yes/no), antihypertensive treatment (yes/no), diabetes (yes/no), and physician-acquired (clinic) systolic blood pressure (mm Hg). Associations between individual proteins and FRS were determined by linear regression analysis. To investigate the relationship between FRS and the plasma protein profile, model selection was performed by a bidirectional stepwise algorithm based on P values. Analysis was done in R, version 4.1.0 (R Core Team [2021]. R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria). Model selection was performed with the “olsrr” package (Aravind Hebbali [2020]. olsrr: Tools for Building OLS Regression Models. R package version 0.5.3).
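The control flow of a bidirectional stepwise selection driven by P values can be sketched as follows. This is an illustrative Python outline, not the olsrr implementation used in the study. The function name `stepwise_select`, the callback `pvalue_fn`, and the entry and removal thresholds (`p_enter`, `p_remove`) are all assumptions, since the text does not report the thresholds used. The callback is expected to fit a regression on the given predictors and return a P value for each of them; here it is left to the caller so that the sketch stays self-contained.

```python
def stepwise_select(candidates, pvalue_fn, p_enter=0.1, p_remove=0.3,
                    max_steps=1000):
    """Bidirectional stepwise selection based on P values.

    candidates : list of predictor names.
    pvalue_fn  : hypothetical callback; takes a list of predictor names,
                 fits a model with them, and returns {name: p_value}.
    p_enter, p_remove : illustrative thresholds (assumptions).
    """
    if p_enter > p_remove:
        raise ValueError("p_enter must not exceed p_remove")
    selected = []
    for _ in range(max_steps):  # guard against endless cycling
        changed = False

        # Forward step: add the candidate with the smallest P value,
        # provided it is below the entry threshold.
        best_name, best_p = None, None
        for name in candidates:
            if name in selected:
                continue
            p = pvalue_fn(selected + [name])[name]
            if p < p_enter and (best_p is None or p < best_p):
                best_name, best_p = name, p
        if best_name is not None:
            selected.append(best_name)
            changed = True

        # Backward step: drop the selected predictor with the largest
        # P value if it exceeds the removal threshold.
        if selected:
            pvals = pvalue_fn(selected)
            worst = max(selected, key=lambda v: pvals[v])
            if pvals[worst] > p_remove:
                selected.remove(worst)
                changed = True

        if not changed:
            break
    return selected
```

In practice `pvalue_fn` would wrap an ordinary least squares fit of FRS on the chosen proteins; keeping it as a parameter separates the selection logic from the model-fitting library.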
