Sampling weights were constructed as the inverse of the predicted probability of a child being included in the study, so that weighted estimates from the sample are estimates for the whole population. Predicted probabilities of inclusion were estimated via two logistic models. The first logistic model was fitted in the entire population recruited to Phase 1 and included covariates predictive of inclusion due to study design. These were the total number of pupils assessed per school and whether the child was identified as having high risk of language impairment based on CCC‐S teacher ratings (86th centile or above for sex and age group). The second logistic model was fitted only to children completing the second phase of the study, with covariates tested in a stepwise elimination process. These covariates were factors predictive of inclusion due to individual characteristics of the participants, such as sex, age group, IDACI rank score, English as an additional language and CCC‐S total raw score; and school‐level factors such as number of pupils on roll, percentage of girls, percentage with identified special educational needs and percentage receiving free school meals (a measure of school‐level deprivation). The final weights were the product of the inverses of the predicted probabilities from the two models.
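As a minimal sketch of the final step, the weight for each child can be written as the product of the two inverse probabilities, and a weighted mean then gives a population-level estimate. The function names, the dictionary keys and the numbers below are hypothetical and for illustration only; they are not study data, and the logistic models that would produce the probabilities are not shown.

```python
def combined_weight(p_phase1, p_phase2):
    """Inverse-probability weight from two stage-wise inclusion probabilities.

    p_phase1: predicted probability of inclusion from the first (design) model.
    p_phase2: predicted probability of inclusion from the second model.
    Both are assumed to come from previously fitted logistic models.
    """
    if not (0 < p_phase1 <= 1 and 0 < p_phase2 <= 1):
        raise ValueError("probabilities must lie in (0, 1]")
    # Product of the two inverse probabilities.
    return (1.0 / p_phase1) * (1.0 / p_phase2)


def weighted_mean(values, weights):
    """Weighted mean: each child counts in proportion to their weight."""
    total = sum(weights)
    return sum(v * w for v, w in zip(values, weights)) / total


# Hypothetical children with made-up scores and predicted probabilities.
children = [
    {"score": 10.0, "p1": 0.5, "p2": 0.8},
    {"score": 20.0, "p1": 0.25, "p2": 0.5},
]

weights = [combined_weight(c["p1"], c["p2"]) for c in children]
estimate = weighted_mean([c["score"] for c in children], weights)
```

In this toy example the second child was less likely to be included at both stages, so that child receives the larger weight and pulls the weighted mean towards their score.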
Given that many core language tests did not have current or valid UK standardisations, all language and nonverbal composites were standardised using the LMS method (Cole & Green, 1992). Z‐scores were calculated using a Box‐Cox type of transformation (Box & Cox, 1964), with parameters estimated via penalised maximum likelihood. Moreover, the mathematical relationship between z‐scores and percentiles allows for the construction of smoothed centile curves across the entire distribution of a measure, similar to centile curves used in paediatric height and weight charts (G. Vamvakas, C.F. Norbury, S. Vitoratou, D. Gooch, & A. Pickles, under review).
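To illustrate the relationship between raw scores, z‐scores and centiles under the LMS method, the sketch below applies the standard LMS formulas for a single age, given the Box‐Cox power L, the median M and the coefficient of variation S. It assumes these three parameters have already been estimated; the penalised likelihood fitting and the smoothing across age are not shown, and the function names and parameter values are hypothetical.

```python
import math
from statistics import NormalDist


def lms_z(y, L, M, S):
    """Z-score of a positive measurement y given LMS parameters at one age."""
    if y <= 0 or M <= 0 or S <= 0:
        raise ValueError("y, M and S must be positive")
    if abs(L) < 1e-12:
        # Limiting case of the Box-Cox transformation as L approaches 0.
        return math.log(y / M) / S
    return ((y / M) ** L - 1.0) / (L * S)


def lms_centile_value(percentile, L, M, S):
    """Measurement value lying on the given centile (inverse of lms_z)."""
    z = NormalDist().inv_cdf(percentile / 100.0)
    if abs(L) < 1e-12:
        return M * math.exp(S * z)
    return M * (1.0 + L * S * z) ** (1.0 / L)


def z_to_percentile(z):
    """Percentile corresponding to a z-score under the standard normal."""
    return 100.0 * NormalDist().cdf(z)


# Hypothetical LMS parameters for one age group (not estimates from the study).
L_hyp, M_hyp, S_hyp = 0.5, 50.0, 0.1

value_at_90th = lms_centile_value(90.0, L_hyp, M_hyp, S_hyp)
z_back = lms_z(value_at_90th, L_hyp, M_hyp, S_hyp)
```

Evaluating `lms_centile_value` over a grid of ages, each with its own L, M and S, is how a centile curve would be traced out in this sketch.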
Complete data on the language composites were available for 529/636 children in Year 1 and 499/529 in Year 3. No imputation was performed; instead, the sampling weights take these missing observations into account. All available covariates that influenced the ‘missingness’ indicator were used, in order to make the assumption that data were missing at random as plausible as possible.