As a first step in the analyses, we examined the item score distributions to verify whether all the values of the answer scale had been endorsed at least once, and we assessed the extent of the missing data. The Shapiro–Wilk test was performed to verify the normality of the distributions.
We then tested on the total sample, through confirmatory factor analysis (CFA) using the weighted least squares with mean and variance adjustment (WLSMV) estimator (theta parameterization), whether the hypothesized 3-factor structure (body awareness, supradiaphragmatic reactivity, and subdiaphragmatic reactivity, with item 41, “I feel like vomiting”, loading on both of the last two factors) was supported by the data at hand. Goodness-of-fit was evaluated using the comparative fit index (CFI), the Tucker–Lewis index (TLI), and the root-mean-square error of approximation (RMSEA) with its 90% confidence interval (CI). We used the following criteria for model fit [54]: for the TLI and CFI, values ≥ 0.90 indicated acceptable fit and values ≥ 0.95 excellent fit; for the RMSEA, values ≤ 0.08 indicated acceptable fit and values ≤ 0.06 excellent fit. Missing values were handled by the full information method implemented in Mplus 7 [55], with which we performed the analyses.
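Mplus reports these fit indices directly; purely for illustration, a minimal sketch of how the CFI, TLI, and (point-estimate) RMSEA are derived from the model and baseline chi-square statistics is given below. The chi-square values, degrees of freedom, and sample size in the example are hypothetical, not results from this study.

```python
import math

def fit_indices(chisq_m, df_m, chisq_b, df_b, n):
    """Approximate CFI, TLI, and RMSEA from the model and baseline
    (independence) chi-square statistics. Illustrative only: dedicated
    SEM software reports these indices directly."""
    d_m = max(chisq_m - df_m, 0.0)  # model noncentrality
    d_b = max(chisq_b - df_b, 0.0)  # baseline noncentrality
    cfi = 1.0 - d_m / max(d_b, d_m, 1e-12)
    tli = ((chisq_b / df_b) - (chisq_m / df_m)) / ((chisq_b / df_b) - 1.0)
    rmsea = math.sqrt(d_m / (df_m * (n - 1)))
    return cfi, tli, rmsea

# Hypothetical values: chi-square(206) = 412 for the model,
# chi-square(231) = 5000 for the baseline model, N = 600.
cfi, tli, rmsea = fit_indices(412, 206, 5000, 231, 600)
print(f"CFI={cfi:.3f} TLI={tli:.3f} RMSEA={rmsea:.3f}")
# → CFI=0.957 TLI=0.952 RMSEA=0.041 (acceptable fit by the criteria above)
```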
In case of inadequate fit of this model, since we would have had to investigate the most suitable measurement model for the Italian BPQ-SF without the support of prior knowledge, we decided on a cross-validation approach: performing an exploratory factor analysis (EFA) on one random split of the sample in order to find a factor structure that could meet the requirements of an approximate simple structure [56] (i.e., each item should load substantially, >|0.32| [57], on one factor while loading negligibly on the others), and a CFA on the other random split.
Before performing these analyses, however, we searched the total dataset for redundancies and for items with low squared multiple correlations (SMC). The redundancy analysis searches for pairs of items whose intercorrelation is too strong. In factor analysis, such items are likely to yield so-called “bloated specifics” ([58], p. 288), i.e., factors of little substantive interest that result from very highly correlated items that usually share very similar content and/or wording. We considered as redundant those items whose intercorrelation was larger than |0.707| (i.e., more than 50% of shared variance). We then computed the SMC for all the remaining items. The SMC is the proportion of variance an item shares with all the others, and it is routinely used by EFA software as an estimate of initial communality, i.e., an estimate of the proportion of variance of an item accounted for by the common factors. Items with an SMC smaller than 0.10 are unlikely to contribute substantially to the measurement model and can be removed from the item pool [57].
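Both screening steps can be computed directly from the item correlation matrix. The following sketch uses simulated data (not the BPQ data) in which one item is a near-duplicate of another and two items are unrelated to the rest; the |0.707| redundancy cutoff and the 0.10 SMC cutoff are those stated above.

```python
import numpy as np

rng = np.random.default_rng(0)
# Simulated item scores standing in for real data (hypothetical):
# three independent items plus one near-duplicate of item 0.
base = rng.normal(size=(500, 3))
items = np.column_stack([base, base[:, 0] + 0.1 * rng.normal(size=500)])

r = np.corrcoef(items, rowvar=False)

# Redundant pairs: |r| > 0.707 (more than 50% shared variance).
p = r.shape[0]
redundant = [(i, j) for i in range(p) for j in range(i + 1, p)
             if abs(r[i, j]) > 0.707]

# Squared multiple correlations from the inverse correlation matrix:
# SMC_i = 1 - 1 / [R^-1]_ii.
smc = 1.0 - 1.0 / np.diag(np.linalg.inv(r))

print("redundant pairs:", redundant)           # item 3 duplicates item 0
print("low-SMC items:", np.where(smc < 0.10)[0])  # items unrelated to the rest
```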
In order to perform the EFA on the first random subsample, we first investigated the optimal number of factors to be extracted through dimensionality analyses: the scree test [59], parallel analysis (PA, [60]), and the minimum average partial (MAP) correlation statistic [61]. The scree test [59] suggests that the optimal number of factors corresponds to the number of eigenvalues before the point at which their downward curve flattens out. Parallel analysis [60] compares the observed eigenvalues to eigenvalues generated from simulated matrices of random data of the same size. Based on the recommendations of Buja & Eyuboglu [62], we performed PA on 1000 random correlation matrices obtained through permutation of the raw data, and, following Longman and colleagues [63], we considered the 95th percentile of the randomly generated eigenvalues as the threshold values. Velicer [61] proposed that the optimal number of factors is the one at which the average partial correlation among the variables (i.e., the MAP statistic) reaches its minimum after the factors have been partialled out.
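A minimal numpy sketch of the permutation-based PA described above (the scree test and the MAP statistic are not shown) follows. The data are simulated, with six items driven by a single common factor, so a one-factor solution should be recovered; the permutation scheme and the 95th-percentile threshold mirror the procedure described in the text.

```python
import numpy as np

def parallel_analysis(data, n_perm=1000, percentile=95, seed=0):
    """Permutation-based parallel analysis: retain factors whose observed
    eigenvalues exceed the chosen percentile of the eigenvalues obtained
    from column-wise permutations of the raw data."""
    rng = np.random.default_rng(seed)
    n, p = data.shape
    obs = np.sort(np.linalg.eigvalsh(np.corrcoef(data, rowvar=False)))[::-1]
    perm = np.empty((n_perm, p))
    for b in range(n_perm):
        shuffled = np.column_stack(
            [rng.permutation(data[:, j]) for j in range(p)])
        perm[b] = np.sort(
            np.linalg.eigvalsh(np.corrcoef(shuffled, rowvar=False)))[::-1]
    thresh = np.percentile(perm, percentile, axis=0)
    return int(np.sum(obs > thresh)), obs, thresh

# Hypothetical data: 6 items loading on a single common factor.
rng = np.random.default_rng(1)
f = rng.normal(size=(400, 1))
x = 0.7 * f + rng.normal(size=(400, 6))
k, obs, thresh = parallel_analysis(x, n_perm=200)
print("factors suggested:", k)  # the single simulated factor is recovered
```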
Once the optimal number of factors was determined, we could perform the exploratory analyses, again on the first random subsample. We used exploratory structural equation modeling (ESEM, [64]) with WLSMV estimation, theta parameterization, and GEOMIN rotation. ESEM allows for the estimation of all factor loadings (subject to the constraints necessary for identification) and, in general, for an exploration of complex factor structures (similarly to EFA), while giving access to parameter estimates, standard errors, goodness-of-fit (GOF) statistics, and modeling flexibility (e.g., correlating error variances, obtaining factor scores corrected for measurement error, etc.), features that are otherwise commonly associated with CFA. The choice of the final model relied on the GOF indices (using the same criteria described above for the CFA) and on the best approximation of a simple structure. As ESEM allows the estimation of the standard errors of the loadings, we considered as substantial those loadings whose 95% confidence interval lay entirely above the |0.32| threshold.
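The confidence-interval criterion for a substantial loading can be stated compactly: the lower bound of the 95% Wald interval for |loading|, i.e., |estimate| − 1.96 × SE, must exceed 0.32. A sketch with hypothetical loadings and standard errors (not estimates from this study):

```python
import numpy as np

# Hypothetical GEOMIN-rotated loadings and standard errors for one factor.
loadings = np.array([0.55, 0.38, 0.70, 0.30, -0.45])
se = np.array([0.05, 0.06, 0.04, 0.05, 0.05])

# A loading is "substantial" if its 95% CI lies entirely beyond |0.32|.
lower = np.abs(loadings) - 1.96 * se
substantial = lower > 0.32
print("substantial items:", np.where(substantial)[0])
# Note: item 1 (0.38) exceeds 0.32 as a point estimate, but its CI
# includes values below the threshold, so it is not flagged.
```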
Once a measurement model was determined through ESEM, we used the data from the other random subsample to test its fit using CFA. Together with the obtained factor model, we also tested alternative models: two parsimonious models, namely a single-factor model and an independent-factors model, and a bifactor model, i.e., a model in which the items loaded both on a general body awareness and reactivity factor and on specific factors. The bifactor model allowed us to examine the reliability of the total score of the BPQ-22. Besides Cronbach’s alpha, we computed the indices suggested by Rodriguez and colleagues [65] to test whether the single factor score could be considered sufficiently reliable to be used along with the subscale scores. We thus calculated the omega hierarchical coefficient, the explained common variance (ECV), the proportion of items with a relative bias (i.e., the absolute difference between an item’s loading in the unidimensional solution and its general factor loading in the bifactor model, divided by the general factor loading), and the percentage of uncontaminated correlations (PUC, i.e., the number of correlations between items from different group factors divided by the total number of correlations). Use of the total score despite the presence of a multidimensional factor structure is supported if thresholds of 0.80 for omega hierarchical and 0.70 for the ECV and the PUC are met [66], and if the proportion of items with a relative bias does not exceed 15% [67].
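The bifactor indices above are simple functions of the standardized loadings. The following sketch computes omega hierarchical, the ECV, and the PUC from a hypothetical bifactor solution with six items and two group factors; the loadings are invented for illustration and are not results from this study.

```python
import numpy as np

# Hypothetical standardized bifactor loadings: a general factor plus
# two group factors (items 0-2 on group 0, items 3-5 on group 1).
lam_g = np.array([0.60, 0.55, 0.65, 0.50, 0.58, 0.62])  # general factor
lam_s = np.array([0.35, 0.40, 0.30, 0.45, 0.38, 0.33])  # specific factors
group = np.array([0, 0, 0, 1, 1, 1])
theta = 1.0 - lam_g**2 - lam_s**2                        # uniquenesses

# Omega hierarchical: general-factor variance over total score variance.
num = lam_g.sum() ** 2
denom = num + sum(lam_s[group == g].sum() ** 2
                  for g in np.unique(group)) + theta.sum()
omega_h = num / denom

# ECV: share of the common variance explained by the general factor.
ecv = (lam_g**2).sum() / ((lam_g**2).sum() + (lam_s**2).sum())

# PUC: share of item pairs whose members belong to different group factors.
n_items = len(lam_g)
pairs = [(i, j) for i in range(n_items) for j in range(i + 1, n_items)]
puc = np.mean([group[i] != group[j] for i, j in pairs])

print(f"omega_h={omega_h:.3f} ECV={ecv:.3f} PUC={puc:.3f}")
# → omega_h=0.688 ECV=0.713 PUC=0.600 (below the 0.80 omega_h threshold,
#   so this hypothetical total score would not be supported)
```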
Construct validity was investigated by computing Spearman correlation coefficients between the observed scores of the BPQ-22 and the other measures administered to the first sample of participants.
The association of the BPQ-22 scores with background variables was tested by specifying a general linear model that included as predictors sex, age, years of education, relationship status, and occupational status. Dunn’s post hoc comparisons with adjustment for false discovery rate were used to test differences between groups in significant categorical predictors.
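The Dunn tests themselves are not sketched here; the false discovery rate adjustment applied to their p-values is the Benjamini–Hochberg step-up procedure, which can be written in a few lines. The p-values in the example are hypothetical.

```python
import numpy as np

def fdr_bh(pvals):
    """Benjamini-Hochberg adjusted p-values (step-up procedure), as used
    to correct pairwise post hoc comparisons for the false discovery rate."""
    p = np.asarray(pvals, dtype=float)
    m = p.size
    order = np.argsort(p)
    ranked = p[order] * m / np.arange(1, m + 1)
    # Enforce monotonicity from the largest p-value downward.
    adj = np.minimum.accumulate(ranked[::-1])[::-1]
    out = np.empty_like(adj)
    out[order] = np.minimum(adj, 1.0)
    return out

# Hypothetical p-values from pairwise Dunn comparisons between groups.
print(fdr_bh([0.001, 0.02, 0.04, 0.30]))
# → [0.004      0.04       0.05333333 0.3       ]
```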
Finally, we tested the test–retest reliability of the scales on the second sample of participants. The retest coefficient was computed as the Spearman correlation of observed scores at time 1 and time 2, while the stability of scores was evaluated through a Wilcoxon signed-rank test. In order to find evidence for adequate stability of scores, we expected retest coefficients larger than 0.70 (i.e., at least 50% of shared variance) and negligible or low effect sizes (i.e., r < 0.30) for the Wilcoxon signed-rank test.
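The retest analysis can be sketched with scipy on simulated scores; the data below are hypothetical stand-ins for the two measurement occasions, and the effect size r for the Wilcoxon test is recovered here from the p-value via the normal quantile (an approximation).

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
# Hypothetical time-1 and time-2 scale scores for 50 participants,
# simulated to be stable across occasions.
t1 = rng.normal(50, 10, size=50)
t2 = t1 + rng.normal(0, 4, size=50)

rho, _ = stats.spearmanr(t1, t2)   # retest coefficient (> 0.70 expected)
w = stats.wilcoxon(t1, t2)         # stability of scores across occasions

# Effect size r = |Z| / sqrt(N), with Z recovered from the two-sided p-value.
z = stats.norm.isf(w.pvalue / 2.0)
r = abs(z) / np.sqrt(len(t1))

print(f"Spearman rho = {rho:.2f}, Wilcoxon p = {w.pvalue:.3f}, r = {r:.2f}")
```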
Wherever possible, we computed and reported 95% confidence intervals and measures of effect size. For the data analyses we used IBM® SPSS® 27 software.