The type of each ToM task was identified according to the concept it employed and its method of administration. The final 117 records were analyzed qualitatively, and the following variables were extracted: embedded ToM concept, employed construct, number of included studies, task content, presentation modality, answer mode, inclusion of control questions or items, scoring, and psychometric properties. The psychometric properties were then evaluated against previously published criteria. Specifically, internal consistency, test–retest reliability, unidimensionality, convergent validity, criterion validity, and ecological validity were evaluated using the criteria proposed by the COnsensus-based Standards for the selection of health Measurement INstruments (COSMIN) [37]. Known-group validity was evaluated using the criteria proposed by Brown and Subel [38], and internal responsiveness using the criteria proposed by Husted et al. [39]. Table 1 summarizes the psychometric properties tested and the criteria applied to each in the present systematic review. To gauge the trustworthiness of the results, the methodological quality of the included studies was assessed using the COSMIN Risk of Bias Checklist [40].

Table 1. Criteria for evaluating the quality of the psychometric properties of current ToM tasks.

Note: CFA = confirmatory factor analysis; criteria marked with * follow COSMIN [37].

