The following papers attempt to clarify the definition, intension, intuition, and motivation of Interpretable Machine Learning.
Together, they try to lay the foundation of Interpretable Machine Learning, which I summarize here:
- Interpretability is
  - to present abstract concepts in terms that humans can understand
  - ...
- Interpretability for
  - the prediction of a specific instance
  - the mechanism of the whole model
- Benefits
  - applications in critical fields
  - model performance (removing useless parts of a complex model)
  - the study of complex models
  - ...
- Categories
  - Self-interpretable Model
  - Post-hoc Interpretations
- Evaluations
  - Proxy metrics
  - Human experiments
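
To make the difference between explaining the whole model and explaining a specific instance concrete, here is a minimal sketch using a linear model, a common example of a self-interpretable model. The feature names, weights, and bias are invented for illustration and do not come from any of the papers.

```python
# Hypothetical linear model: feature names, weights, and bias are made up.
FEATURES = ["age", "income", "debt"]
WEIGHTS = [0.5, 2.0, -1.5]
BIAS = 0.1


def predict(x):
    """Return the model's score for a single instance x."""
    return BIAS + sum(w * v for w, v in zip(WEIGHTS, x))


def global_explanation():
    """Model-level view: the weights describe the mechanism of the whole model."""
    return dict(zip(FEATURES, WEIGHTS))


def local_explanation(x):
    """Instance-level view: each feature's contribution to this one prediction."""
    return {name: w * v for name, w, v in zip(FEATURES, WEIGHTS, x)}


instance = [1.0, 0.5, 2.0]
contributions = local_explanation(instance)

# For a linear model, the contributions plus the bias add up to the prediction.
assert abs(sum(contributions.values()) + BIAS - predict(instance)) < 1e-9
```

Here the weights serve as the global explanation, while the per-feature products explain one prediction. Post-hoc methods aim to recover comparable explanations for models that do not expose them directly.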