3.3 Model-agnostic explainable AI methods

Athira Nambiar, Harikrishnaa S, Sharanprasath S

SHAP (SHapley Additive exPlanations) is a popular model-agnostic technique for explaining the output of any machine learning model (Mangalathu et al., 2020). It builds on Shapley values from cooperative game theory, which quantify the contribution of each player to a coalition's payoff. Specifically, SHAP attributes the prediction to the input features by calculating each feature's contribution to the difference between the predicted value and a baseline reference value, assigning credit or blame to a feature according to how much it shifts the prediction away from the baseline.

In the context of feature attribution in machine learning, the Shapley value can be used to allocate the contribution of each input feature to the prediction of the model. In other words, the SHAP value of a feature represents the contribution of that feature to the difference between the actual output and the expected output of the model. Formally, the SHAP value of a feature for a specific instance x can be defined as shown in Equation (10).
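
In standard Shapley notation, using the symbols defined below, this attribution takes the form

$$\phi_i(x) \;=\; \sum_{S \subseteq N \setminus \{i\}} \frac{|S|!\,\big(K - |S| - 1\big)!}{K!}\,\Big[ f\big(x_{S \cup \{i\}}\big) - f\big(x_S\big) \Big]$$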

where ϕi(x) represents the SHAP value of feature i for instance x. Note that K and N are the total number of input features and the set of all input features, respectively, and S ranges over the subsets of N that do not contain feature i. The model prediction function is denoted by f. Further, xS is the instance in which only the features in S keep their actual values, with the remaining features set to their expected (baseline) values, and xS∪{i} is the same instance with feature i additionally set to its actual value.

The SHAP values assign the contribution of each feature toward the model prediction and can be visualized with summary plots, in which the mean absolute SHAP scores rank the features by their importance. In addition to such global explanations, SHAP values also provide a local explanation for a given instance: they show how each feature contributes to that prediction and can be used to explain why a particular prediction was made. SHAP-based explanations can help in diagnosing issues with the model, assessing the fairness of the model, and comparing the feature importance of different models.
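
As a concrete illustration, the following Python sketch uses the open-source shap package together with a scikit-learn classifier; the dataset, model, and sample sizes are illustrative assumptions rather than choices from the original study. It computes SHAP values through a model-agnostic explainer and produces both a global importance plot and a local explanation for a single instance.

```python
# A minimal sketch (not from the original study): global and local SHAP
# explanations for an arbitrary classifier. Dataset, model, and sample
# sizes below are illustrative assumptions.
import shap
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

X, y = load_breast_cancer(return_X_y=True, as_frame=True)
model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

def predict_positive(data):
    # Probability of the positive class; SHAP explains this scalar output.
    return model.predict_proba(data)[:, 1]

# Model-agnostic usage: only the prediction function and background data are
# passed in, so any model exposing a prediction interface can be explained.
explainer = shap.Explainer(predict_positive, X)
shap_values = explainer(X.iloc[:100])  # explain a subset of instances

# Global explanation: features ranked by mean absolute SHAP value.
shap.plots.bar(shap_values)

# Local explanation: how each feature shifts one prediction away from the
# baseline (expected) model output.
shap.plots.waterfall(shap_values[0])
```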

LIME (Local Interpretable Model-Agnostic Explanations) is another post-hoc technique for explaining ML models (Mishra et al., 2017). LIME justifications can increase user confidence in an AI system. The goal of LIME is to provide explanations that are both locally faithful to the model and interpretable to humans.

LIME generates a simpler, interpretable model, called the “local surrogate model,” around the prediction to be explained. This local surrogate model is trained on a set of perturbed instances around x and is used to generate explanations by examining the feature importance values of the simpler model. In other words, LIME approximates the model locally with an interpretable model, such as a linear regression or a decision tree, and generates explanations by perturbing the input instance and observing the effect on the output of the model. The mathematical formulation of local surrogate models with an interpretability constraint is expressed in Equation (11).
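
In the notation defined below, this objective takes the standard LIME form (here ξ(x) denotes the explanation selected for instance x):

$$\xi(x) \;=\; \underset{g \,\in\, G}{\arg\min} \; L\big(f, g, \pi_x\big) + \Omega(g)$$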

To explain a model's prediction for a particular instance x, LIME generates an explanation model, represented by g, that minimizes a loss function L. This loss function evaluates how accurately the explanation model g approximates the prediction of the original model f. Note that G refers to the family of possible explanation models (e.g., all possible linear regression models), and the proximity measure πx defines the size of the neighborhood around instance x that is considered for the explanation. The regularization term Ω(g) penalizes the complexity of the explanation model and is kept low to prefer explanations that use fewer features.

LIME can visualize the feature importance values in various ways, e.g., as a bar chart, to help users understand how different features contribute to the prediction. LIME-based explanations make the reasoning behind the model's predictions easier to follow and can be useful for debugging and improving the model.
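
A minimal Python sketch of this workflow, using the open-source lime package with an illustrative scikit-learn classifier (the dataset, model, and parameter values are assumptions, not taken from the original text), is shown below. It explains a single tabular prediction and reports the local feature weights of the surrogate model.

```python
# A minimal sketch (not from the original text): a LIME explanation for one
# tabular prediction. Dataset, model, and parameters are illustrative.
from lime.lime_tabular import LimeTabularExplainer
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

data = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, random_state=0)
model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)

# The explainer perturbs samples around an instance and fits a weighted,
# sparse linear surrogate model to the black-box predictions.
explainer = LimeTabularExplainer(
    X_train,
    feature_names=list(data.feature_names),
    class_names=list(data.target_names),
    mode="classification",
)

# Explain one test instance; num_features limits the surrogate to the most
# influential features (the sparsity encouraged by the Omega(g) term).
explanation = explainer.explain_instance(
    X_test[0], model.predict_proba, num_features=5)

print(explanation.as_list())       # (feature condition, local weight) pairs
explanation.as_pyplot_figure()     # bar chart of the local feature weights
```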
