4.3. Log Disparity Metric and Statistics

Karan Bhanot; Miao Qi; John S. Erickson; Isabelle Guyon; Kristin P. Bennett

This method is extracted from research article: Entropy (Basel), Sep 2021

The Problem of Fairness in Synthetic Healthcare Data

DOI: 10.3390/e23091165

ML fairness metrics are developed to evaluate and mitigate potential inequities towards protected subgroups by comparing the predicted outcomes of trained classification models for protected and non-protected subgroups. In the context of ML fairness, disparate impact measures the discrimination between the outcome distributions of the unprotected and protected groups [39]:

$$DI = \frac{P (y^{'} = 1 | g (x) = 1)}{P (y^{'} = 1 | g (x) = 0)} \quad (8)$$

Here, $x$ is the independent variable or attribute. Both $y$ and $y^{'} \in \{0, 1\}$, where 1 indicates true and 0 indicates false. $y$ is the true label of a subject, $y^{'}$ is the label predicted by the classification function, and $g (x)$ is a defined function that determines whether a subject is in the subgroup ($g (x) = 1$) or not ($g (x) = 0$).

We use disparate impact to create a disparity metric for synthetic data by modeling how the real and synthetic data are sampled, using an approach first applied to measure disparities in RCTs [34]. We assume that $y$ is the outcome of a classification function that indicates whether point $x$ occurs in the sample of real data, $R$. Similarly, $y^{'}$ is the outcome of a classification function that indicates whether point $x$ occurs in the sample of synthetic data, $S$. In the context of synthetic data, we assume that (1) samples from the real distribution and samples generated from the synthetic distribution are independent (i.e., $y \perp y^{'}$); and (2) the real data subjects are sampled independently of subgroups (i.e., $y \perp g (x)$).

It is not obvious how to estimate the probabilities necessary for the disparate impact definition (8), $P (y^{'} = 1 | g (x) = 1)$ and $P (y^{'} = 1 | g (x) = 0)$, from the real and synthetic data. Thus, we show that when applied to synthetic data generation, the ML disparate impact metric reduces to an intuitive quantity: the ratio of the odds of generating a subject from the protected subgroup in the synthetic data to the odds of sampling a subject from the subgroup in the real distribution. This transformation is essential for estimating the probabilities in definition (8) from the real and synthetic data. The necessary probabilities, $P (g (x) = 1 | y = 1) = P (g (x) = 1 | x \in R)$ and $P (g (x) = 1 | y^{'} = 1) = P (g (x) = 1 | x \in S)$, are easily estimated from the actual real and synthetic data.
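As a minimal sketch of that estimation step, the two subgroup probabilities can be approximated by the fraction of records in each sample that fall in the subgroup. The record layout (a list of dictionaries), the helper name `subgroup_proportion`, and the toy counts are hypothetical and chosen only for illustration; they are not taken from the article.

```python
from typing import Callable, Mapping, Sequence

Record = Mapping[str, str]


def subgroup_proportion(sample: Sequence[Record], g: Callable[[Record], bool]) -> float:
    """Estimate P(g(x) = 1 | x in sample) as the share of records with g(x) true."""
    if len(sample) == 0:
        raise ValueError("sample must be non-empty")
    return sum(1 for x in sample if g(x)) / len(sample)


def is_female(x: Record) -> bool:
    """Hypothetical subgroup indicator g(x) for one protected attribute value."""
    return x["gender"] == "female"


# Hypothetical toy samples: 100 real records and 100 synthetic records.
real = [{"gender": "female"}] * 50 + [{"gender": "male"}] * 50
synthetic = [{"gender": "female"}] * 40 + [{"gender": "male"}] * 60

p_R = subgroup_proportion(real, is_female)       # estimate of P(g(x) = 1 | x in R)
p_S = subgroup_proportion(synthetic, is_female)  # estimate of P(g(x) = 1 | x in S)
```

Changing the indicator function is all that is needed to move to another subgroup, which mirrors how the text varies $g (x)$.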

Synthetic Data Equity Version of ML-Fairness Disparate Impact Metric. Based on the assumptions described above, the disparate impact metric applied to synthetic data is equivalent to the ratio of the odds of subjects of the protected group in the synthetic data to the odds of subjects of the protected group in the real data. Recall that $odds (p) = \frac{p}{1 - p}$.
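One way to see this is sketched below. It is a reconstruction from the stated assumptions and not necessarily the authors' own argument. It takes disparate impact, written $DI$ here, to be the ratio of $P (y^{'} = 1 | g (x) = 1)$ to $P (y^{'} = 1 | g (x) = 0)$, and it uses only Bayes' rule together with assumption (2), $y \perp g (x)$, which gives $P (g (x) = 1) = P (g (x) = 1 | y = 1)$.

```latex
\begin{align*}
DI &= \frac{P(y' = 1 \mid g(x) = 1)}{P(y' = 1 \mid g(x) = 0)} \\
   &= \frac{P(g(x) = 1 \mid y' = 1)\, P(y' = 1) / P(g(x) = 1)}
           {P(g(x) = 0 \mid y' = 1)\, P(y' = 1) / P(g(x) = 0)}
      && \text{(Bayes' rule)} \\
   &= \frac{P(g(x) = 1 \mid y' = 1)}{1 - P(g(x) = 1 \mid y' = 1)}
      \cdot \frac{1 - P(g(x) = 1)}{P(g(x) = 1)} \\
   &= \frac{odds\big(P(g(x) = 1 \mid y' = 1)\big)}{odds\big(P(g(x) = 1 \mid y = 1)\big)}
      && \text{(since } y \perp g(x)\text{)} \\
   &= \frac{odds\big(P(g(x) = 1 \mid x \in S)\big)}{odds\big(P(g(x) = 1 \mid x \in R)\big)}
\end{align*}
```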

□

Taking the natural logarithm makes the equity measurements easier to interpret because under-representation and over-representation become symmetric around zero. For example, if gender takes the values male and female, then the log disparity of the male subgroup is the negative of the log disparity of the female subgroup. Thus, the log disparity metric is defined as:

$$\text{log disparity} = \log \left( \frac{odds (P (g (x) = 1 | x \in S))}{odds (P (g (x) = 1 | x \in R))} \right)$$
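The symmetry can be checked numerically. The sketch below assumes the log disparity is the natural logarithm of the odds ratio described above (odds in the synthetic data over odds in the real data); the function names and the proportions 0.40, 0.50 and 0.60 are hypothetical.

```python
import math


def odds(p: float) -> float:
    """odds(p) = p / (1 - p); undefined at p = 0 or p = 1, so those are rejected."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return p / (1.0 - p)


def log_disparity(p_s: float, p_r: float) -> float:
    """Natural log of the ratio of synthetic-data odds to real-data odds.

    Negative values mean the subgroup is under-represented in the synthetic
    data, positive values mean it is over-represented, and 0 means parity.
    """
    return math.log(odds(p_s)) - math.log(odds(p_r))


# Hypothetical binary attribute: the real data are 50% female, the synthetic
# data are 40% female, so the male shares are 50% and 60% respectively.
female = log_disparity(p_s=0.40, p_r=0.50)
male = log_disparity(p_s=0.60, p_r=0.50)
```

Because the two subgroups are complementary, their odds are reciprocals of each other, so `female` and `male` have equal magnitude and opposite sign.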

If we consider the synthetic and real data to be generated by sampling, any observed differences may be due to chance. Thus, we develop a statistical test with the null hypothesis $P (g (x) = 1 | x \in R) = P (g (x) = 1 | x \in S)$. Let $p_{S} = P (g (x) = 1 | x \in S)$ with size $n_{S}$ for the subgroup in the synthetic data and $p_{R} = P (g (x) = 1 | x \in R)$ with size $n_{R}$ for the subgroup in the real data, and let $p$ be the overall proportion across both samples. Then the two-proportion z-test of Equation (10) is applied:

$$z = \frac{p_{S} - p_{R}}{\sqrt{p (1 - p) \left( \frac{1}{n_{S}} + \frac{1}{n_{R}} \right)}} \quad (10)$$

The z-test is applied when the sample size is large enough; otherwise, a test that is more accurate for small samples, such as the Fisher exact probability test, is used. We note that this z-statistic could also provide an excellent basis for another novel fairness metric for synthetic data.

We calculate the log disparity metrics and their statistical significance for all possible subgroups defined by the protected attributes by changing $g (x)$. Here, we assume that the protected attributes are all categorical. Since multiple subgroups are compared, we use the Benjamini–Hochberg procedure to adjust the p-values by controlling the false discovery rate. If the adjusted p-value is greater than or equal to the significance level (0.05), no statistically significant difference between real and synthetic subgroup representation exists. We then apply a metric such as log disparity for further evaluation.
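The adjustment step can be sketched as follows. This is the standard Benjamini–Hochberg step-up adjustment written from its textbook definition, not code from the article; the subgroup labels and raw p-values are hypothetical.

```python
def benjamini_hochberg(p_values: list[float]) -> list[float]:
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity of the result.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values, one per subgroup test.
subgroups = ["gender=female", "gender=male", "race=A", "race=B"]
raw = [0.01, 0.04, 0.03, 0.20]
adj = benjamini_hochberg(raw)

ALPHA = 0.05  # significance level used in the text
# Following the text, an adjusted p-value >= ALPHA means no significant difference.
significant = [name for name, q in zip(subgroups, adj) if q < ALPHA]
```

In this toy run only the first subgroup stays below the 0.05 level after adjustment, even though three raw p-values were below it.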

Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
