Our ultimate objective was to determine an overall patient-level classification of housing status. Following the classification of a set of documents for a Veteran, the next step was to aggregate the extracted information to the patient level. Determining a patient-level classification was complicated by the fact that Veterans could have several documents within a short window of time with conflicting information about their housing status. This variation may be due to inconsistent documentation, actual change in housing between visits, or NLP errors.
While there may be some variation between notes, we hypothesized that Veterans who were in stable housing during a period of time would have a higher proportion of notes documenting that they were stably housed. This statistic could then act as a measurement of overall housing stability and be used for patient- and population-level analysis.
We defined a patient-level measurement of housing stability as the proportion of documents classified by our NLP system as “Stably Housed” over a 30-day window (Eq. 1). We refer to this measurement as Relative Housing Stability in Electronic Documentation (ReHouSED). We limited the notes used in this measure to those classified as either “Stably Housed” or “Unstably Housed”, excluding notes that were classified as “Unknown” or did not contain any relevant housing keywords.
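$$\text{ReHouSED} = \frac{N_{\text{stable}}}{N_{\text{stable}} + N_{\text{unstable}}}$$
Eq 1. Formula for calculating a Veteran’s ReHouSED score over a 30-day time window, where $N_{\text{stable}}$ and $N_{\text{unstable}}$ are the numbers of that Veteran’s notes in the window classified as “Stably Housed” and “Unstably Housed”, respectively.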
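The definition of ReHouSED given above can be illustrated with a minimal Python sketch. The function name `rehoused_score`, the label constants, and the choice to return `None` when a window contains no relevant notes are assumptions made for illustration, not details taken from the authors' implementation.

```python
from typing import Iterable, Optional

# Label strings as they appear in the text; the constant names are hypothetical.
STABLE = "Stably Housed"
UNSTABLE = "Unstably Housed"


def rehoused_score(labels: Iterable[str]) -> Optional[float]:
    """Proportion of 'Stably Housed' notes among stable + unstable notes.

    `labels` holds the document classifications for one Veteran within one
    30-day window. "Unknown" (or any other label) is ignored.
    """
    n_stable = 0
    n_unstable = 0
    for label in labels:
        if label == STABLE:
            n_stable += 1
        elif label == UNSTABLE:
            n_unstable += 1
    total = n_stable + n_unstable
    if total == 0:
        # Assumption: the score is treated as undefined without relevant notes.
        return None
    return n_stable / total


# Illustrative window: 3 stable notes, 1 unstable note, 1 excluded "Unknown" note.
example = [STABLE, "Unknown", UNSTABLE, STABLE, STABLE]
score = rehoused_score(example)
```

In this example the "Unknown" note does not enter the denominator, so the score is computed from four notes rather than five.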
To evaluate this method, we compared ReHouSED scores for a sample of Veterans before and after treatment in SSVF. As a reference standard, we used the housing destination variable recorded in HMIS data upon a Veteran’s exit from the SSVF program. We assumed that a Veteran maintained the same housing for at least 30 days after discharge from SSVF.
To construct a sample of Veterans and documents for evaluating ReHouSED, we first identified a group of 10,328 Veterans from our testing set. Each Veteran had exactly one SSVF episode and had received assistance for rapid rehousing. We then retrieved notes for these Veterans from VA CDW using the keywords described previously. To compare text documents before and after SSVF treatment, we retrieved notes that were authored in one of two 30-day time intervals: 60–90 days before exiting SSVF (“pre-SSVF”) and 0–30 days after exiting SSVF (“post-SSVF”). Each of these notes was processed by the NLP system and assigned a document classification using the steps described above. Any document classified as “Unknown” was excluded from further analysis. Finally, we excluded any Veteran who did not have at least one “Stably Housed” or “Unstably Housed” document in both the pre-SSVF and post-SSVF time intervals.
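The interval assignment and the final exclusion rule described above can be sketched as follows. This is a hedged illustration only: the function names `assign_interval` and `is_eligible` are hypothetical, and treating both interval boundaries as inclusive is an assumption, since the text does not specify boundary handling.

```python
from datetime import date
from typing import Iterable, Optional, Tuple

# Only these labels count toward eligibility; "Unknown" notes are excluded.
RELEVANT_LABELS = {"Stably Housed", "Unstably Housed"}


def assign_interval(note_date: date, exit_date: date) -> Optional[str]:
    """Map a note date to 'pre-SSVF', 'post-SSVF', or None (outside both)."""
    offset = (note_date - exit_date).days  # negative means before SSVF exit
    if -90 <= offset <= -60:  # assumption: boundaries are inclusive
        return "pre-SSVF"
    if 0 <= offset <= 30:     # assumption: boundaries are inclusive
        return "post-SSVF"
    return None


def is_eligible(notes: Iterable[Tuple[date, str]], exit_date: date) -> bool:
    """True if the Veteran has a relevant note in both intervals."""
    seen = set()
    for note_date, label in notes:
        if label not in RELEVANT_LABELS:
            continue
        interval = assign_interval(note_date, exit_date)
        if interval is not None:
            seen.add(interval)
    return seen == {"pre-SSVF", "post-SSVF"}


# Illustrative (made-up) notes for one Veteran who exited SSVF on 2020-06-30.
exit_date = date(2020, 6, 30)
notes = [
    (date(2020, 4, 15), "Unstably Housed"),  # 76 days before exit
    (date(2020, 7, 10), "Stably Housed"),    # 10 days after exit
]
eligible = is_eligible(notes, exit_date)
```

A Veteran whose only post-SSVF note was classified as “Unknown” would fail this check, matching the exclusion described in the text.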
ReHouSED scores calculated before and after SSVF were compared between the Veterans who exited the SSVF program to stable housing (N=3,279) and those who exited to unstable housing (N=415) to evaluate whether this metric could be used to distinguish between these two housing situations. The median and interquartile range (IQR) of ReHouSED scores were calculated within each group. Mann-Whitney rank tests were performed for each cross-group comparison to assess the statistical significance of differences between the stably and unstably housed groups, both before and after SSVF.
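As a rough illustration of the group comparison described above, the sketch below computes the median, the IQR, and the Mann-Whitney U statistic using only the Python standard library. The score lists are made-up toy values, the function names are hypothetical, and the quartile method ("inclusive") is an assumption. The sketch stops at the U statistic; in practice a p-value would typically be obtained from a statistics library such as SciPy's `scipy.stats.mannwhitneyu`.

```python
from statistics import median, quantiles
from typing import Sequence, Tuple


def median_iqr(scores: Sequence[float]) -> Tuple[float, Tuple[float, float]]:
    """Return the median and the (Q1, Q3) bounds of the interquartile range."""
    q1, _, q3 = quantiles(scores, n=4, method="inclusive")
    return median(scores), (q1, q3)


def mann_whitney_u(x: Sequence[float], y: Sequence[float]) -> float:
    """U statistic for sample x: pairs with x > y count 1, ties count 0.5."""
    u = 0.0
    for xi in x:
        for yi in y:
            if xi > yi:
                u += 1.0
            elif xi == yi:
                u += 0.5
    return u


# Toy post-SSVF ReHouSED scores (not data from the study).
stable_exit = [1.0, 0.8, 0.75, 1.0, 0.5]
unstable_exit = [0.0, 0.25, 0.5, 0.2]

stable_summary = median_iqr(stable_exit)
u_stat = mann_whitney_u(stable_exit, unstable_exit)
# U ranges from 0 to len(x) * len(y); values near either extreme indicate
# that one group's scores tend to be larger than the other's.
u_max = len(stable_exit) * len(unstable_exit)
```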
To further validate the utility of the ReHouSED measure in characterizing housing stability, we compared its performance against that of ICD diagnostic codes in classifying episodes of unstable housing. Any post-SSVF episode with ReHouSED ≥ 0.4 was classified as “Stable”, while any episode below that threshold was classified as “Unstable”. This threshold was found empirically by calculating the true positive rate and false positive rate at various classification thresholds and choosing the value that maximized their geometric mean. For the ICD-10 code comparison, any Veteran who had at least one diagnosis code representing homelessness or housing instability during the 30-day episode window was considered “Unstably Housed”. Otherwise, they were considered “Stably Housed”. The list of ICD-9/10 codes is shown in Table A.3 in the appendix.
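The empirical threshold search described above might look like the following sketch. Several points are assumptions rather than details from the text: “Stable” is treated as the positive class, the geometric mean is taken in its common form sqrt(TPR × (1 − FPR)), the candidate thresholds are an arbitrary grid, and the scores and reference labels are toy values invented for illustration. All function names are hypothetical.

```python
from math import sqrt
from typing import Sequence, Tuple


def rates(scores: Sequence[float], is_stable: Sequence[bool],
          threshold: float) -> Tuple[float, float]:
    """True and false positive rates when score >= threshold predicts Stable."""
    tp = fp = fn = tn = 0
    for score, truth in zip(scores, is_stable):
        predicted_stable = score >= threshold
        if predicted_stable and truth:
            tp += 1
        elif predicted_stable and not truth:
            fp += 1
        elif truth:
            fn += 1
        else:
            tn += 1
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return tpr, fpr


def best_threshold(scores: Sequence[float], is_stable: Sequence[bool],
                   candidates: Sequence[float]) -> float:
    """Pick the candidate maximizing sqrt(TPR * (1 - FPR)) (assumed G-mean)."""
    def g_mean(threshold: float) -> float:
        tpr, fpr = rates(scores, is_stable, threshold)
        return sqrt(tpr * (1.0 - fpr))
    return max(candidates, key=g_mean)


# Toy scores and reference labels (True = exited to stable housing).
scores = [0.9, 0.8, 0.5, 0.45, 0.3, 0.2, 0.6, 0.1]
is_stable = [True, True, True, True, False, False, False, False]
candidates = [i / 10 for i in range(11)]  # 0.0, 0.1, ..., 1.0

chosen = best_threshold(scores, is_stable, candidates)
predictions = ["Stable" if s >= chosen else "Unstable" for s in scores]
```

The selected value depends entirely on the data supplied, so the result on this toy input says nothing about the threshold reported in the study.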