Latent-CF: A simple baseline for reverse counterfactual explanations
A simple and effective baseline method for generating counterfactual explanations using latent space search with autoencoders.
Under fair lending laws and the General Data Protection Regulation (GDPR), the ability to explain a model's prediction is of paramount importance. High-quality explanations are the first step in assessing fairness. Counterfactuals are valuable tools for explainability: they provide actionable, comprehensible explanations for the individual who is subject to decisions made from the prediction. A baseline for producing them is therefore important. We propose a simple method that generates counterfactuals by using gradient descent to search the latent space of an autoencoder, and we benchmark it against approaches that search for counterfactuals in feature space. Additionally, we implement metrics to concretely evaluate the quality of the counterfactuals. We show that latent space counterfactual generation strikes a balance between the speed of basic feature gradient descent methods and the sparseness and authenticity of counterfactuals generated by more complex feature-space-oriented techniques.
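The idea of searching an autoencoder's latent space with gradient descent can be illustrated with a toy sketch. Everything in it is an assumption made for illustration and not taken from the paper: the linear decoder `D`, the logistic classifier weights `W`, the squared-error loss toward a target probability, and the function name `latent_counterfactual`. In practice the decoder and classifier would be trained models, and the starting latent point would come from encoding the original input.

```python
import math

# Hypothetical stand-ins for trained models (illustrative values only):
# a linear decoder x = D z + B and a logistic classifier p = sigmoid(W.x + C).
D = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]  # 3 features, 2 latent dimensions
B = [0.0, 0.0, 0.0]
W = [1.0, -1.0, 0.5]
C = 0.0


def decode(z):
    """Map a latent point z back to feature space."""
    return [sum(D[i][j] * z[j] for j in range(len(z))) + B[i]
            for i in range(len(D))]


def predict(x):
    """Probability of the positive class for feature vector x."""
    logit = sum(w * xi for w, xi in zip(W, x)) + C
    return 1.0 / (1.0 + math.exp(-logit))


def latent_counterfactual(z0, target=0.5, lr=0.5, tol=1e-3, max_steps=10000):
    """Gradient descent on z to minimise (predict(decode(z)) - target)^2."""
    z = list(z0)
    # For a linear decoder, d(logit)/dz = D^T W is constant.
    dlogit_dz = [sum(D[i][j] * W[i] for i in range(len(D)))
                 for j in range(len(z))]
    for _ in range(max_steps):
        p = predict(decode(z))
        if abs(p - target) < tol:
            break
        # Chain rule: dL/dz = 2 (p - target) * p (1 - p) * d(logit)/dz
        scale = 2.0 * (p - target) * p * (1.0 - p)
        z = [zj - lr * scale * g for zj, g in zip(z, dlogit_dz)]
    x_cf = decode(z)
    return z, x_cf, predict(x_cf)


# Start from a latent point that the classifier assigns to the negative class.
z_start = [-1.0, 1.0]
p_start = predict(decode(z_start))
z_cf, x_cf, p_cf = latent_counterfactual(z_start)
print("original probability:", p_start)
print("counterfactual features:", x_cf, "probability:", p_cf)
```

Because the search moves in latent space and decodes at every step, the candidate counterfactual always lies in the decoder's output range. A feature-space search would instead update `x` directly, with no such constraint.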
Latest publications
Searching for efficient linear layers over a continuous space of structured matrices
Searching for efficient linear operators with optimal scaling laws leading to the development of the BTT-MoE architecture.
NeurIPS
Scaling-laws for large time-series models
Discovering power-law scaling relationships in large time-series transformer models, analogous to those found in language models.
NeurIPS
Simplifying neural network training under class imbalance
Improving neural network performance on imbalanced datasets by tuning standard training components, without specialized methods.
NeurIPS