The teacher data were imported from the TIMSS international database, prepared using the IDB Analyzer Version 4.0, and further analyzed with the statistical software Mplus 7.4 (Muthén and Muthén, 1998-2018). The rate of missing data ranged from 9.7% to 15.8% at the level of item responses, and full information maximum-likelihood estimation was used to handle the missingness (Enders, 2010). To correct standard errors in the presence of missing data and possible deviations from normality, the robust maximum-likelihood estimator was used. All model comparisons involving chi-square statistics were therefore corrected according to the Satorra and Bentler (2010) procedure. Furthermore, we used the TYPE = COMPLEX option to take into account the nesting of the teacher data in schools (Muthén and Muthén, 1998-2018).
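To illustrate what a corrected chi-square comparison involves, the following is a minimal Python sketch of the commonly documented scaled chi-square difference computation for nested models estimated with a robust maximum-likelihood estimator. It is not the authors' code. The function name and the input values in the usage note are hypothetical, and the strictly positive variant described by Satorra and Bentler (2010) requires an additional model run that is not reproduced here.

```python
def sb_scaled_chi2_diff(t0, c0, d0, t1, c1, d1):
    """Scaled chi-square difference statistic for two nested models.

    t0, c0, d0: robust chi-square, scaling correction factor, and degrees
                of freedom of the nested (more constrained) model.
    t1, c1, d1: the same quantities for the comparison (less constrained)
                model.
    Returns (scaled difference statistic, difference in degrees of freedom).
    """
    df_diff = d0 - d1
    if df_diff <= 0:
        raise ValueError("The nested model must have more degrees of freedom.")

    # Scaling correction for the difference test.
    cd = (d0 * c0 - d1 * c1) / df_diff
    if cd <= 0:
        # The standard formula can break down here; the strictly positive
        # variant would be needed, which this sketch does not implement.
        raise ValueError("Non-positive difference-test scaling correction.")

    # Rescale to uncorrected chi-squares, take the difference, and correct it.
    trd = (t0 * c0 - t1 * c1) / cd
    return trd, df_diff
```

For example, `sb_scaled_chi2_diff(120.0, 1.2, 50, 100.0, 1.1, 45)` compares a hypothetical constrained model with a hypothetical freer model; the returned statistic is referred to a chi-square distribution with the returned degrees of freedom.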
The data analysis focused on (a) establishing measurement models to represent general and specific CAS in science teaching, teacher self-efficacy, and perceived time constraints; (b) examining the relations among these constructs for the full sample; and (c) examining the relations among these constructs across grade levels. To accomplish (a), we performed exploratory factor analysis (EFA) and confirmatory factor analysis (CFA). For each construct, we employed EFA to examine the items that were related to the construct and inspected their underlying dimensions. Next, we conducted CFA to verify the underlying dimensions of the construct and, ultimately, obtain information about the model fit to the data. For each construct, we specified a measurement model that reflected our theoretical assumptions on the constructs, first for the total sample and then for the samples of students in Grades 4, 5, 8, and 9. The second step was taken to ensure that each measurement model formed an appropriate baseline and construct representation in each grade. We evaluated the model fit using common goodness-of-fit indices and their guidelines for an acceptable fit [root mean square error of approximation (RMSEA) ≤ 0.08, comparative fit index (CFI) ≥ 0.95, Tucker-Lewis index (TLI) ≥ 0.95, and standardized root mean square residual (SRMR) ≤ 0.10; Marsh et al., 2005]. Note that these guidelines do not represent “golden rules” as they depend on the specific features of the measurement models, such as the number of factors, the type of factor structure, and the sample size (Marsh et al., 2004).
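The fit guidelines listed above can be written as a simple screening rule. The sketch below is illustrative only: the function name and the example values are hypothetical, and, as noted above, the cutoffs are guidelines rather than strict decision rules.

```python
def check_fit(rmsea, cfi, tli, srmr):
    """Compare fit indices with the guideline values quoted in the text.

    Returns a dict mapping each index name to True when the guideline for
    an acceptable fit is met and False otherwise.
    """
    return {
        "RMSEA": rmsea <= 0.08,  # lower is better
        "CFI": cfi >= 0.95,      # higher is better
        "TLI": tli >= 0.95,      # higher is better
        "SRMR": srmr <= 0.10,    # lower is better
    }


# Hypothetical fit indices for one measurement model.
result = check_fit(rmsea=0.05, cfi=0.96, tli=0.94, srmr=0.04)
```

Reporting each index separately, rather than a single pass/fail verdict, keeps the judgment about overall fit with the analyst.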
Based on the measurement models established in the previous steps, we performed structural equation modeling to examine the relations among the latent variables, both for the full sample and across grades. Further, we controlled for teachers’ gender, years of teaching experience, and educational level, as these variables have been shown to be significantly related to teachers’ self-efficacy (e.g., Klassen and Chiu, 2010; Tuchman and Isaacs, 2011), by adding them as covariates of the teacher self-efficacy construct. For the full sample, we began by specifying the relations between teacher self-efficacy and CAS in science teaching and then added perceived time constraints to the structural model. Prior to investigating the differential relations of the constructs between grades, it was essential to assess the invariance of the measurement models across grade levels, which we did by applying multi-group CFA (Sass and Schmitt, 2013; Greiff and Scherer, 2018). We started with the model that assumed the same factor structure across grade levels, yet without equality constraints on the model parameters (configural invariance), and then constrained the factor loadings to be equal across grades (metric invariance). If at least metric invariance was obtained (i.e., teachers interpreted the constructs similarly across grade levels), we tested whether the relations between the constructs were equal across grades (structural or relational invariance). For comparing the freely estimated with the constrained models in the measurement and structural invariance testing, we used the Satorra–Bentler corrected chi-square difference test (SB-χ², Satorra and Bentler, 2010) and/or the differences in fit indices (ΔCFI ≥ −0.01, ΔRMSEA ≥ 0.014, and ΔSRMR ≥ 0.015 as evidence of non-invariance; Chen, 2007).
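The fit-index comparison between a freely estimated and a constrained model can be sketched as follows. This is a hypothetical helper, not the authors' procedure. It assumes that the CFI criterion refers to a decrease in CFI of 0.01 or more and that the RMSEA and SRMR criteria refer to increases of at least 0.014 and 0.015, respectively. The fit values in the example are made up.

```python
def flag_noninvariance(free, constrained):
    """Flag evidence of non-invariance from changes in fit indices.

    free, constrained: dicts with keys "CFI", "RMSEA", and "SRMR" for the
    freely estimated and the constrained model.
    Returns a dict of booleans, one per index; True means the change in
    that index meets the criterion for non-invariance.
    """
    d_cfi = constrained["CFI"] - free["CFI"]
    d_rmsea = constrained["RMSEA"] - free["RMSEA"]
    d_srmr = constrained["SRMR"] - free["SRMR"]
    return {
        "CFI": d_cfi <= -0.01,      # assumed reading: CFI drops by 0.01 or more
        "RMSEA": d_rmsea >= 0.014,  # RMSEA rises by 0.014 or more
        "SRMR": d_srmr >= 0.015,    # SRMR rises by 0.015 or more
    }


# Hypothetical configural (free) and metric (constrained) model fit.
configural = {"CFI": 0.970, "RMSEA": 0.040, "SRMR": 0.030}
metric = {"CFI": 0.950, "RMSEA": 0.045, "SRMR": 0.050}
flags = flag_noninvariance(configural, metric)
```

The flags are returned per index and deliberately not combined, since the text allows the chi-square difference test and the fit-index differences to be used together or separately.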
If the structural relations were unequal across grades, we further performed the Wald test of parameter constraints to test the specific differences in the relations between pairs of grade levels (Brown, 2015).
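For a single equality constraint between two grade levels, the Wald test reduces to a simple computation, sketched below in Python. This is a simplified illustration rather than the authors' implementation: it assumes that the two estimates come from different groups and are uncorrelated, so the variance of their difference is the sum of the squared standard errors. The function name and the example values are hypothetical.

```python
import math


def wald_equal(b1, se1, b2, se2):
    """Wald test of the constraint b1 = b2 (one degree of freedom).

    b1, b2:   parameter estimates in two groups (e.g., two grade levels).
    se1, se2: their standard errors.
    Assumes the two estimates are uncorrelated.
    Returns (Wald statistic, p-value).
    """
    var_diff = se1 ** 2 + se2 ** 2
    w = (b1 - b2) ** 2 / var_diff
    # Upper-tail probability of a chi-square variate with 1 df.
    p = math.erfc(math.sqrt(w / 2.0))
    return w, p


# Hypothetical path coefficients and standard errors in two grades.
w, p = wald_equal(b1=0.50, se1=0.10, b2=0.20, se2=0.10)
```

When several constraints are tested jointly, or when the estimates are correlated, the full covariance matrix of the estimates is needed instead of this shortcut.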