
To analyze the psychometric quality of the Spanish adaptation of the SSKJ 3-8, we examined measurement invariance in the context of Item Response Theory (IRT). The IRT framework makes it possible to test for the highest level of measurement equivalence, which van de Vijver and Tanzer (2004) term scalar equivalence or full scale equivalence. This level of equivalence includes structural equivalence, which in cross-cultural psychology is in turn associated with an “etic” position (Triandis and Marín, 1983). Given the rather small sample sizes available for these analyses, we chose the package pairwise (Heine, 2019) for the open source statistical environment R (R Core Team, 2019) for all IRT scaling tasks. The package pairwise offers a non-iterative method for item parameter calibration named PAIR (see e.g., Choppin, 1968; Fischer, 1970; Fischer and Scheiblechner, 1970a; Wright and Masters, 1982), which according to Heine and Tarnai (2015) returns stable item parameter estimates even with rather small sample sizes (see also Heine et al., 2018 for a practical application). After item calibration, person parameter estimation can be based on the weighted likelihood approach introduced by Warm (1989).
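The idea behind a non-iterative pairwise calibration can be illustrated with a minimal sketch. It is written in Python rather than R, covers only dichotomous (0/1) items for brevity, and is not the API or the exact algorithm of the pairwise package. The function names and the smoothing constant `eps` are assumptions made for this illustration.

```python
import math


def pairwise_counts(responses):
    """Count, for every item pair (i, j), the persons who solved
    item i (score 1) but not item j (score 0)."""
    k = len(responses[0])
    counts = [[0] * k for _ in range(k)]
    for row in responses:
        for i in range(k):
            for j in range(k):
                if row[i] == 1 and row[j] == 0:
                    counts[i][j] += 1
    return counts


def difficulties_from_counts(counts, eps=0.5):
    """Closed-form, sum-zero item difficulties from pairwise counts.

    Under the Rasch model, counts[j][i] / counts[i][j] estimates
    exp(b_i - b_j), so averaging the log ratios over j gives b_i.
    eps is an assumed smoothing constant that avoids log(0) when a
    pair has empty cells; it is not taken from the source.
    """
    k = len(counts)
    difficulties = []
    for i in range(k):
        total = 0.0
        for j in range(k):
            if i != j:
                total += math.log((counts[j][i] + eps) / (counts[i][j] + eps))
        difficulties.append(total / k)
    return difficulties
```

Because each difficulty is a simple average of log ratios, no iterative likelihood maximization is needed, which is the property the paragraph above refers to.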

In order to test hypothesis 1, we applied several one-dimensional scaling approaches based on the Partial Credit Model (PCM; Masters, 1982) for polytomous item answer scales, analyzing each of the five scales (i.e., five emotion-regulation strategies) separately. In the first approach, the items of each scale were included across both stressful situations (i.e., academic and social situation), so that each scale comprised 12 items. In the second approach, we divided the item sets according to the two situations (i.e., six items for each of the five scales per situation) and analyzed the resulting scales separately, again with a one-dimensional scaling approach.
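As a reminder of what the PCM assumes for a single polytomous item, the following minimal Python sketch computes the category probabilities for one person. The function name and argument layout are illustrative assumptions and do not come from the pairwise package.

```python
import math


def pcm_probabilities(theta, thresholds):
    """Category probabilities for one item under the Partial Credit Model.

    theta      : person parameter
    thresholds : step difficulties delta_1..delta_m of the item
    Returns the probabilities of categories 0..m.
    """
    # Cumulative sums of (theta - delta_j); category 0 has an empty sum.
    cumulative = [0.0]
    for delta in thresholds:
        cumulative.append(cumulative[-1] + (theta - delta))
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(cumulative)
    weights = [math.exp(c - peak) for c in cumulative]
    total = sum(weights)
    return [w / total for w in weights]
```

With a single threshold the expression reduces to the dichotomous Rasch model, which is a convenient sanity check.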

In the first scaling approach, the global model fit was evaluated by applying the Andersen Likelihood Ratio Test (Andersen, 1973) to the 12 items of each of the five scales across both stressful situations, using the cultural context (i.e., Germany vs. Chile) as the splitting criterion. Additionally, we report the residual-based model fit statistic Q3 (see Yen, 1984), based on the separate scaling approaches for each of the five scales and situations, respectively. To analyze local model violations at the item level, we calculated root-mean-square statistics (INFIT and OUTFIT; e.g., Wright and Masters, 1982). Based on the fixed item parameter estimates resulting from the concurrent calibration approach across both cultural contexts (i.e., based on the total sample), the INFIT and OUTFIT statistics were calculated and evaluated separately for each subsample (i.e., for each cultural context). To detect any sub-dimensionality of the five scales, possibly resulting from the two stressful situations (social situation vs. academic situation), we performed a Rasch-Residual-Factor-Analysis (RRFA; Wright, 1996; Linacre, 1998). To this end, we calculated the Rasch residual matrices resulting from the one-dimensional scaling across the 12 items of each of the five scales and analyzed them with a principal component analysis (Wright, 1996). We plotted the item difficulties against the loadings on the first principal component to examine whether the item residuals could be clearly assigned to the two stressful situations.
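The item fit statistics mentioned above can be sketched as follows, in their common mean-square form. This is a minimal Python illustration, assuming that the model-implied category probabilities for each response are already available; the function names are hypothetical and not those of the pairwise package.

```python
def category_moments(probs):
    """Expected score and score variance of one response, given the
    model probabilities of categories 0..m."""
    expected = sum(k * p for k, p in enumerate(probs))
    variance = sum((k - expected) ** 2 * p for k, p in enumerate(probs))
    return expected, variance


def item_fit(observed, expected, variance):
    """Mean-square INFIT and OUTFIT for one item.

    observed, expected, variance hold one entry per person.
    OUTFIT is the unweighted mean of squared standardized residuals;
    INFIT weights the squared residuals by the model variance.
    """
    squared_residuals = [(x - e) ** 2 for x, e in zip(observed, expected)]
    outfit = sum(r / w for r, w in zip(squared_residuals, variance)) / len(observed)
    infit = sum(squared_residuals) / sum(variance)
    return infit, outfit
```

Evaluating the statistics separately per subsample, as described above, amounts to calling `item_fit` once for the persons of each cultural context while keeping the item parameters fixed.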

In the second scaling approach, the items were scaled separately for the two situations (social situation vs. academic situation, A vs. B) and the five dimensions. Thus, we applied a one-dimensional scaling model (PCM) to each of the ten resulting scales. For each of the ten scales (six items per scale), we computed the residual-based model fit statistic Q3 (see Yen, 1984) to evaluate the global model fit. To detect any local model deviations, we computed INFIT and OUTFIT statistics for each of the ten scales.
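The Q3 statistic is based on correlations between the residuals of item pairs. The following minimal Python sketch assumes a complete persons-by-items matrix of residuals (observed minus expected scores) with no missing values; the function names are illustrative and not taken from the pairwise package.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equally long numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def q3_matrix(residuals):
    """Residual correlation for every item pair (i, j) with i < j.

    residuals: list of rows, one row per person, one column per item.
    """
    k = len(residuals[0])
    columns = [[row[j] for row in residuals] for j in range(k)]
    return {
        (i, j): pearson(columns[i], columns[j])
        for i in range(k)
        for j in range(i + 1, k)
    }
```

Large absolute values for a pair suggest that the two items share variance beyond the scaled dimension, that is, local dependence.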

To examine measurement invariance across both cultural contexts, the Fischer-Scheiblechner Z-test (Fischer and Scheiblechner, 1970b; van den Wollenberg, 1982) was performed for all ten scales to test for differential item functioning (DIF) between the subsamples (Germany vs. Chile).
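In its usual form, this test compares the two subsample estimates of an item parameter relative to their combined standard error. The sketch below is a minimal Python illustration under that assumption; the function names and the two-sided normal p-value are choices made for this example and are not taken from the source.

```python
import math


def fs_z(estimate_a, estimate_b, se_a, se_b):
    """Z statistic for the difference between two subsample estimates
    of the same item parameter, given their standard errors."""
    return (estimate_a - estimate_b) / math.sqrt(se_a ** 2 + se_b ** 2)


def two_sided_p(z):
    """Two-sided p-value under a standard normal reference distribution."""
    return math.erfc(abs(z) / math.sqrt(2.0))
```

For example, `fs_z(0.5, -0.5, 0.3, 0.4)` compares a difference of 1.0 logit against a combined standard error of 0.5. Applying such a test item by item across several scales involves many comparisons, so the resulting p-values are best read with that in mind.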
