All analyses were conducted using STATA 16 (StataCorp, 2019). We used previously recommended cut-offs for the validated scales (Kroenke et al., 2001; Spitzer et al., 2006): a PHQ-9 or GAD-7 score of 10 or more was used to indicate moderate/severe depression or anxiety symptoms, respectively. For self-harm/suicidal ideation, self-harm, and abuse, a response that indicated at least one occasion in the previous week was recorded as an experience of these thoughts/events. For each day of the study period (21/03/2020 to 21/08/2020), we generated the proportion of responses indicating depression, anxiety, self-harm/suicidal ideation, self-harm, and abuse. The mean score of the UCLA-3 scale (loneliness measure) was calculated for each day. The questions related to self-harm and abuse were only asked from 30/03/2020 onwards.
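As an illustration of the daily aggregation described above, the following is a minimal Python sketch rather than the authors' STATA code. It applies the cut-off of 10 to a set of scores and computes the proportion at or above the cut-off for each day. The function name `daily_proportion`, the record layout, and the example responses are all hypothetical and serve only to show the calculation.

```python
from collections import defaultdict
from datetime import date

CUTOFF = 10  # PHQ-9 or GAD-7 score of 10 or more, as stated in the text


def daily_proportion(records, cutoff=CUTOFF):
    """Return {day: proportion of responses with score >= cutoff}.

    `records` is an iterable of (day, score) pairs (hypothetical layout).
    """
    counts = defaultdict(lambda: [0, 0])  # day -> [cases, total responses]
    for day, score in records:
        counts[day][1] += 1
        if score >= cutoff:
            counts[day][0] += 1
    return {day: cases / total for day, (cases, total) in sorted(counts.items())}


# Made-up responses, for illustration only (not study data)
responses = [
    (date(2020, 3, 21), 4),
    (date(2020, 3, 21), 12),
    (date(2020, 3, 21), 10),
    (date(2020, 3, 21), 9),
    (date(2020, 3, 22), 15),
    (date(2020, 3, 22), 3),
]

proportions = daily_proportion(responses)
for day, proportion in proportions.items():
    print(day.isoformat(), proportion)
```

The same pattern would apply to the binary items (self-harm/suicidal ideation, self-harm, and abuse) by treating "at least one occasion in the previous week" as the case definition instead of a score cut-off.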
We provide graphical presentations of Google Trends topic RSVs and self-reported measures of mental and social distress. Our analysis aimed to estimate the temporal association of one time series with another. As we did not have data on suicide deaths and attempts during this period, we investigated associations between Google searches for suicide and self-reported self-harm and self-harm/suicidal ideation. Given the low likelihood that an individual's Google searching for a mental or social distress term will result in them developing or experiencing distress, we assumed that the development of symptoms or experience of abuse would predate Google search behaviour rather than occur simultaneously. The only exceptions to this might be Google searches for terms related to suicidal ideation, self-harm and suicide, which may precede self-reported self-harm. We used vector autoregressive (VAR) models to test whether there was evidence that one time series temporally preceded another. These models account for autocorrelation and allow for lags in effect (Becketti, 2013). We observed a day-of-the-week effect in Google searches for topics, and therefore included day-of-the-week dummy variables as exogenous variables to account for this. We estimated two-variable VARs, each pairing one of the seven self-reported mental/social distress time series with the corresponding Google Trends time series. For each VAR, we compared models across a range of lag lengths using the varsoc command in STATA and selected the number of lags using Akaike's information criterion (see Table 1). VAR models were fitted using the var command. All models were checked for stationarity. We used the Granger causality test to assess whether the self-reported time series predicted Google Trends values for the corresponding mental/social distress topics.
In addition, given the possibility that Google searches for topics related to suicidal ideation might precede self-harming behaviour, we also tested for this reverse direction using a Granger causality test.
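For readers less familiar with VAR models, the two-variable specification described above can be written out as follows. This is a generic textbook form under assumed notation, not a formula taken from the original analysis: $g_t$ denotes the Google Trends RSV on day $t$, $y_t$ the corresponding self-reported measure, $p$ the selected number of lags, and $D_{k,t}$ six day-of-the-week dummy variables (one day omitted as the reference category).

```latex
\begin{align}
g_t &= \alpha_1 + \sum_{i=1}^{p} a_i\, g_{t-i} + \sum_{i=1}^{p} b_i\, y_{t-i}
       + \sum_{k=1}^{6} \delta_{1k}\, D_{k,t} + \varepsilon_{1t}, \\
y_t &= \alpha_2 + \sum_{i=1}^{p} c_i\, g_{t-i} + \sum_{i=1}^{p} d_i\, y_{t-i}
       + \sum_{k=1}^{6} \delta_{2k}\, D_{k,t} + \varepsilon_{2t}.
\end{align}
% Granger causality null hypotheses (joint tests on the cross-lag coefficients):
%   self-report does not Granger-cause Google searches:  H_0 : b_1 = \dots = b_p = 0
%   Google searches do not Granger-cause self-report:    H_0 : c_1 = \dots = c_p = 0
```

In this notation, the main test asks whether the lagged self-reported values ($b_1, \dots, b_p$) jointly improve prediction of Google Trends values, and the additional test for suicide-related searches asks the same of the lagged Google Trends values ($c_1, \dots, c_p$) in the self-report equation.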
Granger causality test results for the association between self-reported and Google searching time trends data
Given the age patterning of mental/social distress and internet use, we also provide graphical presentations of Google Trends topic RSVs and self-reported measures of mental and social distress stratified by age group (18–29; 30–59; 60+).
All self-report prevalence estimates and scores were calculated without weighting for response probability in the primary analysis. As a sensitivity analysis we repeated all analyses with weighted data to check the robustness of our findings. The sample was weighted by the proportion of gender, age, ethnicity, and education obtained from the Office for National Statistics, UK (Fancourt et al., 2021).
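To make the difference between the primary (unweighted) and sensitivity (weighted) estimates concrete, here is a minimal Python sketch of a weighted prevalence calculation. It assumes each respondent already carries a weight; the weights and responses below are hypothetical, and the derivation of the actual weights from Office for National Statistics proportions is not reproduced here.

```python
def weighted_prevalence(cases, weights):
    """Weighted proportion of cases: sum of case weights / sum of all weights."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(w for c, w in zip(cases, weights) if c) / total


# Made-up data for illustration only: 1 = case, 0 = non-case
cases = [1, 0, 0, 1]
weights = [0.5, 1.5, 1.0, 1.0]  # hypothetical weights, not the study's

unweighted = sum(cases) / len(cases)          # primary analysis style
weighted = weighted_prevalence(cases, weights)  # sensitivity analysis style

print(unweighted, weighted)
```

With equal weights the two estimates coincide, so the sensitivity analysis only changes results to the extent that the sample composition differs from the reference population.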