Statistical analysis

Participants, missing data, and item distributions were described using descriptive statistics. The analysis of item difficulty was based on the proportion of responses endorsing the best and worst ratings (i.e., ceiling and floor effects). A corresponding effect was conservatively assumed whenever the mean value of an item fell into the lower or upper 20% of the respective item range.
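As an illustration of this rule, a small helper along the following lines could flag such items. The function name and the data frame `items` are hypothetical and not part of the original analysis.

```r
# Minimal sketch (not the authors' code): flag potential floor/ceiling effects
# by checking whether an item's mean lies in the lower or upper 20% of its range.
# 'items' is a hypothetical data frame of polytomous item scores.
flag_floor_ceiling <- function(items, min_score, max_score, prop = 0.20) {
  cutoff_low  <- min_score + prop * (max_score - min_score)
  cutoff_high <- max_score - prop * (max_score - min_score)
  means <- colMeans(items, na.rm = TRUE)
  data.frame(
    item    = names(means),
    mean    = means,
    floor   = means <= cutoff_low,   # mean in lower 20% of the range
    ceiling = means >= cutoff_high   # mean in upper 20% of the range
  )
}

# Example call for items scored 1-4 (scoring range assumed for illustration):
# flag_floor_ceiling(items, min_score = 1, max_score = 4)
```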

For the analysis of the structural validity of the DEMQOL and DEMQOL-Proxy, as part of the construct validity, we used Mokken scale analysis (MSA). MSA is a useful tool for researchers who wish to construct unidimensional tests or work with questionnaires comprising multiple binary or polytomous items, and it enables the examination of reliability without recourse to Cronbach's alpha [29]. MSA is a method from non-parametric item response theory based on the assumptions of unidimensionality of the test or scale, local independence, and monotonicity [30]. The method is established in the context of scale development and has been widely used in QoL research [31, 32].

The MSA provides additional information about the relationships between items. As an indicator of the internal correlation of each subscale, the MSA uses Loevinger's H coefficient (HS). Following Sijtsma and Molenaar [33], HS scores were interpreted as follows: HS ≥ 0.5 = "strong", 0.4 ≤ HS < 0.5 = "medium", and 0.3 ≤ HS < 0.4 = "weak". The scalability of a single item with respect to the other items of the scale is expressed by the coefficient Hi, which must be non-negative for the Mokken model to hold. Depending on the source, a minimum Hi-value ranging from 0 to 0.55 is recommended. We fixed the minimum Hi-value at the typically used threshold of 0.3; items falling below this level have weak discriminatory power and are not useful for the scale. Hij denotes the scalability coefficient for a pair of items.
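The three types of scalability coefficients can be obtained with the "mokken" package along the following lines. The data object `items` is an assumption for illustration, not an object from the original analysis.

```r
# Minimal sketch: Loevinger's scalability coefficients with the mokken package.
library(mokken)

H <- coefH(items)
H$H    # HS: scalability of the whole (sub)scale
H$Hi   # Hi: scalability of each item relative to the remaining items
H$Hij  # Hij: scalability coefficients for all item pairs
```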

The criterion (Crit) of the MSA by Molenaar and Sijtsma [34] was used to identify items that only partially satisfy the assumptions of monotone homogeneity or double monotonicity. For each item, this diagnostic value combines the H coefficient, the frequency and size of the violations, and their significance. Every item should have a Crit value below 40, and ideally a Crit value of 0; a Crit value above 80 is a strong indication that an item violates the assumptions of the MSA in this subscale. The Crit values were calculated separately for each of the ten imputed data sets (see below) so that individual violations of double monotonicity would not be systematically inflated by a factor of ten.
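For illustration, per-item Crit values for both assumptions can be inspected as sketched below, again assuming a hypothetical data frame `items`.

```r
# Minimal sketch: inspecting violations and Crit values for monotonicity and
# double monotonicity (here via the P-matrix method) with the mokken package.
library(mokken)

mono <- check.monotonicity(items)
summary(mono)      # per-item summary, including number of violations and Crit

pmat <- check.pmatrix(items)
summary(pmat)      # Crit values for the double monotonicity assumption

# Items with Crit > 40 deserve closer inspection; Crit > 80 indicates a
# serious violation (Molenaar & Sijtsma).
```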

For the exploratory investigation of the instruments, a parallel iterative procedure consisting of two steps was used. In the first step, cores were determined. A core is a dyad of items from the item pool that correlate strongly with each other (Hij) but correlate less strongly with the other cores. If other items from the pool could improve the HS value in a similar way to a second core, the weaker second core was returned to the item pool as single items. As the threshold for a strong correlation within a core, we used a minimum Hij-value of 0.45, following Müller-Schneider [35]. Analogous to the procedure in a factor analysis, the number of subscales is thereby predefined. In the second step, an iterative MSA was performed in parallel for each of the determined cores: all remaining items were tested against every core, and each item was assigned to the core for which it showed the highest Hi-value, provided that the assignment did not lead to a violation of monotonicity (Crit > 40). If an item could plausibly be allocated to two different cores (cross-loading), the assignment was decided on content grounds. The search procedure stopped when no remaining item fulfilled the requirements (Hi ≥ 0.3 and Crit ≤ 40) or when all items had been incorporated into a scale. Compared with the automated item selection procedure (AISP) for MSA [36], this parallel iterative analysis allowed smaller subscales with higher HS values to be identified.
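For orientation, the sketch below shows how the standard AISP that this procedure is contrasted with can be run in the "mokken" package; the parallel iterative core-based selection described above was a manual, stepwise procedure and is not a built-in function. The data object `items` is again an assumption.

```r
# Minimal sketch: the automated item selection procedure (AISP) used here only
# as a point of comparison with the authors' parallel iterative approach.
library(mokken)

# Partition the item pool into Mokken scales using the common lower bound c = 0.3
scales <- aisp(items, lowerbound = 0.3)[, 1]
table(scales)   # 0 = unscalable item; 1, 2, ... = assigned (sub)scale

# HS for each identified subscale
for (s in setdiff(unique(scales), 0)) {
  print(coefH(items[, scales == s, drop = FALSE])$H)
}
```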

As a precondition, the MSA requires complete cases and integer item scores. Therefore, missing values may have to be imputed. In the context of instrument testing, however, missing data should be imputed with caution. We used two-way imputation, a Bayesian method for estimating missing data in tests and questionnaires [37].
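Purely as an illustration, the following sketch implements the basic two-way imputation idea (person mean + item mean minus overall mean, plus a normally distributed error, rounded to an admissible integer score). It is a simplified, non-Bayesian stand-in for the method of [37], which the study applied via the "miceadds" package; the function and data names are hypothetical.

```r
# Simplified sketch of two-way imputation (not the Bayesian variant used in
# the study). 'items' is an assumed data frame of integer item scores.
two_way_impute <- function(items, min_score, max_score) {
  x  <- as.matrix(items)
  pm <- rowMeans(x, na.rm = TRUE)          # person means
  im <- colMeans(x, na.rm = TRUE)          # item means
  om <- mean(x, na.rm = TRUE)              # overall mean
  res_sd <- sd(x - outer(pm, im, "+") + om, na.rm = TRUE)  # residual spread
  for (i in seq_len(nrow(x))) {
    for (j in seq_len(ncol(x))) {
      if (is.na(x[i, j])) {
        est <- pm[i] + im[j] - om + rnorm(1, 0, res_sd)
        x[i, j] <- min(max(round(est), min_score), max_score)
      }
    }
  }
  as.data.frame(x)
}

# Ten imputed data sets, mirroring the study design (scoring range assumed):
# imputed <- replicate(10, two_way_impute(items, 1, 4), simplify = FALSE)
```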

Internal consistency was assessed with the coefficient rho (ρ) of the Molenaar-Sijtsma statistic. The ρ coefficient is less prone to bias than Cronbach's α and should therefore be preferred [38, 39]. For comparison purposes, we also calculated Cronbach's alpha (α). Values of ρ or α between 0.70 and 0.95 indicated "good" internal consistency [40]. Finally, we computed part-whole-corrected item-total correlations (rit), i.e., the correlation of each item with the scale score excluding that item. Items with rit coefficients > 0.5 reflected a "high" correlation, and those with rit > 0.3 a "moderate" correlation [41]. An rit of 0.3 or below indicated that the item did not correlate well with the scale and may not measure the same construct as the other items.
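The reliability coefficients and part-whole-corrected item-total correlations described here can be obtained roughly as follows; the sketch assumes a data frame `items` and is not the authors' code.

```r
# Minimal sketch: reliability estimates and part-whole-corrected
# item-total correlations.
library(mokken)

rel <- check.reliability(items, MS = TRUE, alpha = TRUE)
rel$MS      # Molenaar-Sijtsma rho
rel$alpha   # Cronbach's alpha, for comparison

# Part-whole-corrected item-total correlation: each item against the sum of
# the remaining items.
rit <- sapply(seq_len(ncol(items)), function(j) {
  cor(items[, j], rowSums(items[, -j, drop = FALSE]),
      use = "pairwise.complete.obs")
})
names(rit) <- colnames(items)
rit
```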

All analyses were performed using the R environment for statistical computing, version 3.4.1 (30 June 2017) [42]. The following packages were used for the MSA and the imputation, respectively: "mokken" version 2.8.6 [43, 44] and "miceadds" version 2.5-9 [45].
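If one wished to reproduce this environment, the reported package versions could be pinned roughly as sketched below; the use of the "remotes" package is an assumption and is not something reported by the authors.

```r
# Sketch: installing the package versions named above via the CRAN archive.
# install.packages("remotes")
remotes::install_version("mokken",   version = "2.8.6")
remotes::install_version("miceadds", version = "2.5-9")

packageVersion("mokken")    # confirm installed version
packageVersion("miceadds")  # confirm installed version
```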
