Global explanations of Neural Networks: Mapping the landscape of predictions
A new approach for generating global attributions that explain neural network predictions across different subpopulations.
A barrier to the wider adoption of neural networks is their lack of interpretability. While local explanation methods can explain a single prediction, most global attribution methods still reduce neural network decisions to a single set of features. In response, we present an approach for generating global attributions called GAM, which explains the landscape of neural network predictions across subpopulations. GAM augments each global explanation with the proportion of samples that the attribution best explains, and it specifies which samples each attribution describes. The granularity of the global explanations is tunable, so that more or fewer subpopulations can be detected. We demonstrate that GAM's global explanations 1) yield the known feature importances of simulated data, 2) match the feature weights of interpretable statistical models on real data, and 3) are intuitive to practitioners, as shown in user studies. With more transparent predictions, GAM can help ensure neural network decisions are generated for the right reasons.
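To make the idea concrete, here is a minimal sketch of grouping local attributions into global ones. It is not the authors' implementation. The abstract does not say how attributions are compared or grouped, so the L1 distance, the simple k-medoids loop, and every name here (`summarize_attributions`, `normalize`, `l1_distance`, `assign`) are assumptions made for illustration. The parameter `k` stands in for the tunable granularity, and each returned group carries the proportion of samples and the sample indices it best explains.

```python
def normalize(vec):
    """Scale absolute attribution values so they sum to 1 (assumed preprocessing)."""
    total = sum(abs(v) for v in vec)
    if total == 0:
        return [0.0 for _ in vec]
    return [abs(v) / total for v in vec]


def l1_distance(a, b):
    """L1 distance between two attribution vectors (an assumed choice of distance)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def assign(points, medoids):
    """Put each sample index into the cluster of its nearest medoid."""
    clusters = [[] for _ in medoids]
    for i, p in enumerate(points):
        nearest = min(range(len(medoids)),
                      key=lambda c: l1_distance(p, points[medoids[c]]))
        clusters[nearest].append(i)
    return clusters


def summarize_attributions(local_attributions, k, max_iter=50):
    """Group local attributions into k global attributions (hypothetical helper).

    Returns one dict per group: the medoid attribution, the proportion of
    samples it best explains, and the indices of those samples.
    """
    points = [normalize(v) for v in local_attributions]
    n = len(points)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the number of samples")

    # Deterministic farthest-point initialisation of the medoids.
    medoids = [0]
    while len(medoids) < k:
        farthest = max(range(n), key=lambda i: min(
            l1_distance(points[i], points[m]) for m in medoids))
        medoids.append(farthest)

    for _ in range(max_iter):
        clusters = assign(points, medoids)
        # New medoid: the member with the smallest total distance to its cluster.
        new_medoids = [
            min(members, key=lambda cand: sum(
                l1_distance(points[cand], points[j]) for j in members))
            if members else medoids[c]
            for c, members in enumerate(clusters)
        ]
        if new_medoids == medoids:
            break
        medoids = new_medoids

    clusters = assign(points, medoids)
    return [
        {"attribution": points[m], "proportion": len(members) / n, "samples": members}
        for m, members in zip(medoids, clusters)
    ]


if __name__ == "__main__":
    # Toy local attributions over three features (made-up numbers).
    toy = [
        [0.9, 0.1, 0.0],
        [0.8, 0.2, 0.0],
        [0.7, 0.2, 0.1],
        [0.1, 0.1, 0.8],
        [0.0, 0.2, 0.8],
    ]
    for group in summarize_attributions(toy, k=2):
        print(group)
```

On the toy data, `k=2` separates the three samples dominated by the first feature from the two dominated by the third. Raising or lowering `k` changes how many subpopulations are reported.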