In contrast to content validity, which concerns whether items have the breadth and accuracy needed to measure a construct, face validity assesses the degree to which respondents judge the instrument and its items to be appropriate for the targeted assessment (45). For an experience measure to provide useful and valuable information, it must first be considered acceptable by service users within the context in which it is implemented. We used completion rates to assess how acceptable the measure was within the online community, and compared drop-out at each stage of the three-part measure. Demographic differences in the assessment of structure scores were analysed, and these scores were compared across different outcomes and mechanisms. During a 10-week pilot we collected qualitative and quantitative data from users in the online community (11–25-year-old service users) completing the measure inside the platform.
The measure was iteratively released onto the service's platform. Online service users who either contributed to a forum or submitted an article during the testing period were presented with the contributors' measure after submission. Service users who read an article or forum post were presented with the readers' measure at the end of the post. Users who did not provide research consent during sign-up to the service were excluded from analysis. Data were collected between 13 November 2019 and 22 January 2020.
The clickable prototype of the POCEM was implemented as a feature for service improvement in the online community on the service's platform; changes from the previous Phase 2 were incorporated into the measure for the pilot. For a period of 10 weeks the measure was tested within the platform, and data were collected on users engaging with the community at the digital mental health service. Routinely collected monitoring information was used alongside peer support data to investigate the measure's performance. All users accessing the platform community were able to see and complete the measure during the 10-week period.
Frequency and descriptive analyses were carried out on completion rates for users who accessed the online community and those who completed the measure. Descriptive statistics and selection frequencies for the three steps of the measure were calculated to establish whether items were being selected sufficiently. As the POCEM is divided into three interrelated assessments, a different analytical approach was taken for each section of the measure. For the assessment of structure (the helpfulness score), a Kruskal-Wallis non-parametric test with Dwass-Steel-Critchlow-Fligner pairwise comparisons as post-hoc tests was used to ascertain differences in the score across demographic variables (age, gender, ethnicity). Further analysis then explored the type of community interaction (whether the respondent was a reader or a contributor) using a two-sample t-test.
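A minimal sketch of this analysis plan is shown below, assuming the pilot data sit in a pandas DataFrame with hypothetical columns "helpfulness" (the 1–5 score), "age_group", and "role" ("reader" or "contributor"); the Dwass-Steel-Critchlow-Fligner comparisons use the scikit-posthocs package. This is an illustration of the named tests, not the study's own code.

```python
# Illustrative sketch only: the DataFrame `df` and its column names
# ("helpfulness", "age_group", "role") are assumptions.
import pandas as pd
from scipy import stats
import scikit_posthocs as sp  # provides Dwass-Steel-Critchlow-Fligner (DSCF)

def structure_assessment(df: pd.DataFrame) -> None:
    # Kruskal-Wallis non-parametric test for an effect of age group on the score
    groups = [g.to_numpy() for _, g in df.groupby("age_group")["helpfulness"]]
    h, p = stats.kruskal(*groups)
    print(f"Kruskal-Wallis (age): H = {h:.2f}, p = {p:.3f}")

    # DSCF pairwise comparisons as the post-hoc test
    print(sp.posthoc_dscf(df, val_col="helpfulness", group_col="age_group"))

    # Two-sample t-test comparing readers' and contributors' helpfulness scores
    readers = df.loc[df["role"] == "reader", "helpfulness"]
    contributors = df.loc[df["role"] == "contributor", "helpfulness"]
    t, p_t = stats.ttest_ind(readers, contributors)
    print(f"t-test (reader vs contributor): t = {t:.2f}, p = {p_t:.3f}")
```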
For the POCEM process assessment, we explored the effect of domain selection on the score with a Kruskal-Wallis test; post-hoc analyses using Dwass-Steel-Critchlow-Fligner pairwise comparisons examined the average helpfulness score for each of the four domains of support, and for respondents who dropped out at this step.
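The same approach extends to the process assessment; the sketch below assumes a hypothetical "domain" column in which respondents who dropped out at this step are labelled "No response", so that they form their own comparison group.

```python
# Illustrative sketch only; `df` and the "domain" labels are assumptions.
from scipy import stats
import scikit_posthocs as sp

def process_assessment(df) -> None:
    by_domain = df.groupby("domain")["helpfulness"]
    print(by_domain.mean().round(2))  # average helpfulness score per domain
    h, p = stats.kruskal(*[g.to_numpy() for _, g in by_domain])
    print(f"Kruskal-Wallis (domain): H = {h:.2f}, p = {p:.3g}")
    # DSCF pairwise comparisons, with "No response" as its own group
    print(sp.posthoc_dscf(df, val_col="helpfulness", group_col="domain"))
```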
The POCEM outcome assessment was explored by examining differences in scores based on the outcome selected in the measure. The aim of the pilot analysis was to assess the POCEM's acceptability to service users via completion rates, and to establish whether the phased design resulted in a drop-off of respondents. We also explored which outcomes and processes were most frequently selected by users of the measure while in the community.
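Completion rates and stage-by-stage drop-off can be derived from per-administration records; the sketch below assumes hypothetical boolean columns marking whether each stage of the measure was answered.

```python
# Illustrative sketch only; the three boolean stage columns are assumptions.
import pandas as pd

def stage_completion(df: pd.DataFrame) -> None:
    n = len(df)
    for stage in ("answered_structure", "answered_process", "answered_outcome"):
        rate = df[stage].mean()  # share of administrations completing this stage
        print(f"{stage}: {rate:.1%} of {n} administrations")
```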
The measure was tested between 11 December 2019 and 20 January 2020, with 2,140 unique service users completing a total of 4,897 POCEM administrations. There were a total of 68,439 views of community content on the site by service users who gave research consent during this time, and a total of 2,425 contributions in the form of articles or discussion posts in the online forum community. Completion rates were divided between readers and contributors to better understand overall completion of the instrument across community members (Table 6).
Unique users, POCEM completions, and proportion of completions within the community.
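As a rough check on the scale of engagement, taking views plus contributions as the pool of completion opportunities (an assumption made for illustration only; Table 6 breaks the rates down by reader and contributor) gives an overall proportion of:

```python
# Counts reported above; the denominator choice is an assumption.
views, contributions, completions = 68_439, 2_425, 4_897
print(f"{completions / (views + contributions):.1%}")  # roughly 6.9%
```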
Ages in the community and the service are restricted to 10–25. However, five respondents reported an age over 25 and were removed from the dataset, as these fall outside the service's range. The remaining respondents' ages ranged from 10 to 25, with a mean of 13.47 (SD = 2.09). Most service users completing the POCEM were female, white, and aged between 10 and 14 years (Table 7).
Demographic characteristics and frequencies of unique users completing POCEMs.
The most frequently selected helpfulness score was 5: “Loads!”, indicating that the content helped the service user considerably. The rating of 1: “Not really” was selected least frequently. The mean helpfulness score was 3.77 (SD = 1.14).
Demographic differences were analysed to investigate whether the POCEM revealed differing experiences of structure between service users.
There were no significant differences between genders [H(3) = 2.4, p = .40] or between ethnicities [H(4) = 8.4, p = .07]. Age had a small but significant effect on perceived helpfulness scores [H(2) = 7.89, p = .02]. Post-hoc Dwass-Steel-Critchlow-Fligner pairwise comparisons across the three age groups showed that service users aged 10–14 gave a significantly higher helpfulness score (M = 3.8, SD = 1.13) than service users aged 15–19 (M = 3.68, SD = 1.15; p = .03). There was no difference between service users aged 10–14 and those aged 20–25 (M = 3.4, SD = 1.43; p = .40), or between service users aged 15–19 and those aged 20–25 (p = .80).
Regarding role as a member of the community, a two-sample t-test showed a statistically significant difference in the mean helpfulness score between readers and contributors [t(247) = 8.8, p < .001]. Service users who completed the POCEM after contributing to community content selected the helpfulness score of 5: “Loads!” substantially more frequently than service users who read the community content (Figure 5).
Frequency of selection across the five helpfulness scores for each type of engagement.
Out of the 4,897 completions of the measure, 14.2% of responses gave 3: “Don't know” as the helpfulness score. For this score, the rest of the measure was not shown, and these responses (n = 619) were removed.
As seen in Figure 6, the most frequently selected high-level domain of support was “Help me relate to others”, chosen by 55.1% of respondents. Among respondents who gave positive feedback, more than half (58.2%) selected “Help me relate to others” in the process assessment; among respondents who gave negative feedback, 32.3% selected the “Help me relate to others” (Emotional-Interpersonal) domain of support.
POCEM selection frequency of outcomes for each domain. Each panel shows the proportion of selections for each outcome, split by high-level domain of support: (A) “Important to me”; (B) “Relate to others”; (C) “Understand myself”; (D) “Learn skills”.
Frequency of selection of high-level support domains in the process assessment of the measure, for each type of feedback.
Out of the 4,278 responses with a positive or negative score (1, 2, 4, or 5), 10.05% of respondents stopped answering the measure after providing a helpfulness score. When splitting the responses by positive or negative feedback, 34.5% of those giving negative scores did not answer the next part of the measure, the process assessment, and dropped out. Comparatively, of all respondents who gave a positive response, only 6.8% dropped out of the measure without selecting a process domain.
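This split can be reproduced directly from the administration records; the sketch below assumes the same hypothetical columns as above, with scores of 4–5 treated as positive feedback and 1–2 as negative.

```python
# Illustrative sketch only; column names are assumptions.
scored = df[df["helpfulness"] != 3]  # exclude 3: "Don't know" responses
valence = scored["helpfulness"].map(lambda s: "positive" if s >= 4 else "negative")
dropout = 1 - scored.groupby(valence)["answered_process"].mean()
print(dropout.round(3))  # pilot values: ~0.345 (negative), ~0.068 (positive)
```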
An analysis of the helpfulness scores for the process support domains tells the same story as the frequency findings, with respondents who dropped out of the measure giving a lower average score. A Kruskal-Wallis test showed a significant effect of process domain selection on helpfulness score [H(4) = 207.45, p < .001]. Post-hoc tests revealed no significant differences in helpfulness scores between the four domains of the POCEM process assessment themselves; instead, service users who did not give a response (“No response”) gave a significantly lower helpfulness score (M = 3.28) compared to those who selected “Important to me” (M = 4.29, p < .001), “Learn skills” (M = 4.09, p < .001), “Relate to others” (M = 4.31, p < .001), or “Understand myself” (M = 4.28, p < .001) (Table 8).
Dwass-Steel-Critchlow-Fligner pairwise comparisons for domain helpfulness scores.
The analysis was run after removing cases where respondents did not answer the process domain selection phase (n = 430), as no scores were recorded for these administrations after drop-out. The frequency of outcome selection was analysed for each of the process domains, as the outcome items shown to respondents were dependent on the earlier selection. For all domains except “Relate to others”, the most frequent action from respondents was to drop out of the measure, accounting for 20% or more of the responses in each domain (see Figure 7, panels A–D). Across all outcome responses, dropping out of the measure after the process question accounted for 25.38% of the sample who reached this stage of the POCEM, a higher drop-out rate than the 10.05% of respondents who dropped out at the previous stage.
For the domain “Important to me”, the outcome item selected most frequently was “Others have the same experience” (18.9%); for the domain “Understand myself”, the item “Felt accepted” was most selected (19.5%); and for the “Learn skills” domain, the most selected item was “Skills to help others” (17.9%). The process domain “Relate to others” had the item “Felt connection” selected most frequently (20.7%) (Figure 7, panel B). The item “I now feel able to ask for support outside of Kooth” was selected the least frequently of all items, chosen by only 4.27% of respondents who selected the “Emotional-Intrapersonal” domain (Figure 7, panel C).
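The per-domain outcome frequencies reported above amount to a normalised cross-tabulation; a sketch under the same assumed columns, with drop-outs labelled “No response” in a hypothetical "outcome" column:

```python
# Illustrative sketch only; "domain" and "outcome" columns are assumptions.
freq = (df.groupby("domain")["outcome"]
          .value_counts(normalize=True)  # proportion of each outcome per domain
          .rename("proportion"))
print(freq.round(3))
```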
The pattern of lower helpfulness scores among respondents who dropped out of the measure before completion continued at the outcome item stage. A Kruskal-Wallis test showed a statistically significant difference in helpfulness score between outcomes [H(22) = 407, p < .001], and the post-hoc test showed a significantly lower helpfulness score for respondents who dropped out before answering the outcome assessment stage compared to those who selected an outcome in the instrument. There were no significant differences between other outcome selections (Supplementary Table SA2).