Fairness metrics

Zifan Jiang; Salman Seyedi; Emily Griner; Ahmed Abbasi; Ali Bahrami Rad; Hyeokhyen Kwon; Robert O. Cotes; Gari D. Clifford


This method is extracted from research article: PLOS Digit Health, Jul 2024

Evaluating and mitigating unfairness in multimodal remote mental health assessments

DOI: 10.1371/journal.pdig.0000413


We evaluated the fairness of both the dataset and the classifications. We analyzed the fairness level of the self-rated and clinically rated labels, and focused the algorithmic fairness analysis on classifying the clinically rated labels.

For each demographic group, we calculated the number of subjects and the selection rate (SR). The selection rate was defined as the percentage of samples labeled “positive”, meaning clinically rated MHC, self-rated depression, or self-rated anxiety.
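The group counts and selection rates described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `selection_rates`, the 0/1 label encoding, and the toy data are all assumptions made for the example.

```python
from collections import Counter

def selection_rates(labels, groups):
    """Return {group: (n_samples, selection_rate)} for binary labels (1 = positive)."""
    totals = Counter(groups)
    positives = Counter(g for y, g in zip(labels, groups) if y == 1)
    # Counter returns 0 for groups with no positive samples.
    return {g: (n, positives[g] / n) for g, n in totals.items()}

# Toy labels invented for illustration only (not data from the study).
labels = [1, 1, 1, 0, 1, 0, 0, 0]
groups = ["female"] * 4 + ["male"] * 4

rates = selection_rates(labels, groups)
for group, (n, sr) in rates.items():
    print(f"{group}: n={n}, SR={sr:.0%}")
```

The same function could be applied separately to each label type (clinically rated or self-rated) and to each sensitive variable.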

Following the DP defined in [29], we used two definitions of the demographic parity ratio to measure the fairness level of the classifications. For a sensitive demographic variable k with G different groups and a socioeconomically privileged group g*, the first demographic parity ratio (DPR) captured overall parity between any pair of groups and was defined as:

and g ∈ G_k; The second demographic parity ratio focused on the parity compared to the privileged group and was defined as:

where g ≠ g* and S is the selection rate of the utilized classifier, i.e., the proportion of samples classified as positive.

As privileged groups, we defined “male” for gender parity analysis, “white” for race parity analysis, “Older (≥40)” for age parity analysis, and “College or below (≤16 years of education)” for education parity analysis. DPRs of classifiers trained with features from different modalities were calculated from the classification results on the test folds of 100 repetitions of cross-validation (detailed descriptions in Section Multimodal assessment of mental health conditions and [19]). A DPR further from one indicates a larger disparity between the privileged and unprivileged groups.
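The two demographic parity ratios described above can be sketched in code. The exact formulas are not reproduced in this text, so the sketch rests on stated assumptions: the overall DPR is taken as the smallest group selection rate divided by the largest, and the privileged-group DPR as S(g)/S(g*) for each g ≠ g*. The function names and the toy predictions are hypothetical.

```python
def selection_rate_by_group(preds, groups):
    """Fraction of positive predictions (1 = positive) within each group."""
    totals, positives = {}, {}
    for p, g in zip(preds, groups):
        totals[g] = totals.get(g, 0) + 1
        positives[g] = positives.get(g, 0) + int(p == 1)
    return {g: positives[g] / totals[g] for g in totals}

def dpr_overall(rates):
    # Assumed form: min over groups of S(g) divided by max over groups of S(g).
    # Requires at least one group with a nonzero selection rate.
    return min(rates.values()) / max(rates.values())

def dpr_vs_privileged(rates, privileged):
    # Assumed form: S(g) / S(g*) for every group g other than the privileged g*.
    # Requires a nonzero selection rate in the privileged group.
    return {g: s / rates[privileged] for g, s in rates.items() if g != privileged}

# Toy predictions invented for illustration; "A" plays the privileged group.
preds = [1, 0, 1, 0,  1, 0, 0, 0,  1, 1, 1, 0]
groups = ["A"] * 4 + ["B"] * 4 + ["C"] * 4

rates = selection_rate_by_group(preds, groups)
overall = dpr_overall(rates)
relative = dpr_vs_privileged(rates, privileged="A")
print(overall, relative)
```

Under these assumptions the overall ratio never exceeds one, while the privileged-group ratio can fall on either side of one, which matches the reading that values further from one indicate larger disparity.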

As with the DPR metrics defined above, we followed the EO definition proposed in [30] and defined the overall and privileged-group equalized odds ratios (EOR) based on the false positive rate (FPR) and true positive rate (TPR). The first EOR was defined as:

The second equalized odds ratio was defined as:

where g ≠ g* and δ was set to 0.001 to avoid division by zero. As with the DPRs, EORs were calculated for each classifier. An EOR of one means that all groups have the same TPR and FPR.
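A minimal sketch of the equalized odds ratios follows. Because the formulas are not reproduced in this text, several choices here are assumptions rather than the authors' definitions: δ is added to both numerator and denominator of each ratio, the overall EOR is taken as the smaller of the TPR ratio and the FPR ratio (each computed as min over max across groups), and the privileged-group version returns the TPR and FPR ratios separately. All function names and the toy data are hypothetical.

```python
DELTA = 0.001  # value stated in the text

def tpr_fpr_by_group(y_true, y_pred, groups):
    """Return {group: (TPR, FPR)}; assumes each group has positives and negatives."""
    counts = {}  # group -> [TP, FN, FP, TN]
    for t, p, g in zip(y_true, y_pred, groups):
        c = counts.setdefault(g, [0, 0, 0, 0])
        if t == 1:
            c[0 if p == 1 else 1] += 1
        else:
            c[2 if p == 1 else 3] += 1
    return {g: (tp / (tp + fn), fp / (fp + tn))
            for g, (tp, fn, fp, tn) in counts.items()}

def smoothed_ratio(a, b, delta=DELTA):
    # Assumed placement of delta: added to numerator and denominator.
    return (a + delta) / (b + delta)

def eor_overall(rates, delta=DELTA):
    tprs = [tpr for tpr, _ in rates.values()]
    fprs = [fpr for _, fpr in rates.values()]
    # Assumed aggregation: the worse (smaller) of the TPR and FPR ratios.
    return min(smoothed_ratio(min(tprs), max(tprs), delta),
               smoothed_ratio(min(fprs), max(fprs), delta))

def eor_vs_privileged(rates, privileged, delta=DELTA):
    p_tpr, p_fpr = rates[privileged]
    return {g: (smoothed_ratio(tpr, p_tpr, delta), smoothed_ratio(fpr, p_fpr, delta))
            for g, (tpr, fpr) in rates.items() if g != privileged}

# Toy data invented for illustration; "male" plays the privileged group.
y_true = [1, 1, 0, 0,  1, 1, 0, 0]
y_pred = [1, 1, 0, 0,  1, 0, 1, 0]
groups = ["male"] * 4 + ["female"] * 4

rates = tpr_fpr_by_group(y_true, y_pred, groups)
overall = eor_overall(rates)
relative = eor_vs_privileged(rates, privileged="male")
print(overall, relative)
```

In the toy data the privileged group has an FPR of zero, so without δ the privileged-group FPR ratio would divide by zero; with δ = 0.001 it becomes a large finite number instead.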

This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
