LIME: Explaining the predictions of any machine learning classifier
LIME is short for Local Interpretable Model-Agnostic Explanations. Each part of the name reflects something we want from an explanation.
Local means that the explanation should reflect the behaviour of the classifier "around" the instance being predicted. An explanation is useless unless it is also interpretable, that is, unless a human can make sense of it. And because LIME can explain any model without needing to 'peek' into it, it is model-agnostic.
First, interpretability. Some classifiers use representations that are not intuitive to users at all. LIME explains those classifiers in terms of interpretable representations (such as words), even if that is not the representation the classifier actually uses. LIME also takes human limitations into account: the explanations are kept short. Work is being done on other representations, but for now the package supports explanations that are sparse linear models.
Next, model-agnosticism. To stay model-agnostic, LIME cannot peek into the model. Instead, to figure out which parts of the interpretable input contribute to the prediction, it perturbs the input around its neighbourhood and observes how the model's predictions behave. The perturbed data points are then weighted by their proximity to the original example, and an interpretable model is learned on them and their associated predictions.
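The perturb, weight and fit steps described above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the lime package's implementation: `black_box`, `explain_locally`, the Gaussian perturbation and the values of `scale` and `kernel_width` are all hypothetical choices made for this example.

```python
import numpy as np

def black_box(X):
    """Hypothetical opaque model: returns P(class 1) for each row of X."""
    # Deliberately nonlinear; the explainer only sees inputs and outputs.
    return 1.0 / (1.0 + np.exp(-(X[:, 0] ** 2 - 2.0 * X[:, 1])))

def explain_locally(predict, x, num_samples=2000, scale=0.3,
                    kernel_width=0.75, seed=0):
    rng = np.random.default_rng(seed)
    # 1. Perturb the input around its neighbourhood (assumed Gaussian noise).
    Z = x + rng.normal(0.0, scale, size=(num_samples, x.size))
    # 2. Observe the model's predictions on the perturbed points.
    y = predict(Z)
    # 3. Weight each perturbed point by its proximity to x (assumed kernel).
    dist = np.linalg.norm(Z - x, axis=1)
    w = np.exp(-(dist ** 2) / kernel_width ** 2)
    # 4. Fit a weighted linear model: intercept plus one coefficient per feature.
    A = np.column_stack([np.ones(num_samples), Z - x])
    sw = np.sqrt(w)
    coef, *_ = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)
    return coef[1:]  # local effect of each feature, intercept dropped

x0 = np.array([1.0, 0.5])
effects = explain_locally(black_box, x0)
print(effects)  # first feature pushes the prediction up, second pushes it down
```

The signs of the returned coefficients are the explanation: near `x0`, increasing the first feature raises the predicted probability and increasing the second lowers it, even though the explainer never looked inside `black_box`.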
What is the intuition behind LIME?
Because the goal is to be model-agnostic, the way to learn the behaviour of the underlying model is to perturb the input and see how the predictions change. This turns out to be a benefit for interpretability, because we can perturb the input by changing components that make sense to humans, even if the model itself uses much more complicated components as features, such as word embeddings.
An explanation is generated by approximating the underlying model with an interpretable one (such as a linear model with only a few non-zero coefficients) that is learned on perturbations of the original instance (e.g., removing words or hiding parts of the image).
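For text, "removing words" can be pictured as follows. The sketch below is a toy under stated assumptions: `toy_sentiment` is a made-up black box, the proximity kernel on the fraction of removed words is an assumption, and keeping the `top_k` largest coefficients is a simple stand-in for a proper sparse regression rather than what the lime package does.

```python
import numpy as np

def toy_sentiment(texts):
    """Hypothetical black box: P(positive) rises with 'great', falls with 'boring'."""
    scores = []
    for t in texts:
        words = t.split()
        scores.append(0.5 + 0.4 * ("great" in words) - 0.4 * ("boring" in words))
    return np.array(scores)

def explain_text(predict, text, num_samples=500, top_k=2, seed=0):
    words = text.split()
    rng = np.random.default_rng(seed)
    # Interpretable representation: one 0/1 entry per word (1 = word kept).
    masks = rng.integers(0, 2, size=(num_samples, len(words)))
    masks[0] = 1  # the first sample is the unperturbed instance
    perturbed = [" ".join(w for w, keep in zip(words, m) if keep) for m in masks]
    y = predict(perturbed)
    # Samples with fewer removed words are closer to the original (assumed kernel).
    removed_fraction = 1.0 - masks.mean(axis=1)
    weights = np.exp(-(removed_fraction ** 2) / 0.25)
    # Weighted least squares on the 0/1 masks, with an intercept column.
    A = np.column_stack([np.ones(num_samples), masks])
    sw = np.sqrt(weights)
    coef, *_ = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)
    word_coefs = coef[1:]
    # Keep only the top_k words by absolute coefficient for a short explanation.
    top = np.argsort(-np.abs(word_coefs))[:top_k]
    return [(words[i], float(word_coefs[i])) for i in top]

explanation = explain_text(toy_sentiment, "a great film with one boring scene")
print(explanation)
```

The explanation is expressed in words, which a reader can judge directly, even though the black box could be using any internal representation it likes.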
The key intuition behind LIME is that it is much easier to approximate a black-box model with a simple model locally (in the neighbourhood of the prediction that is to be explained) than to approximate it globally. This is done by weighting the perturbed samples by their similarity to the instance we want to explain.
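A one-dimensional toy makes the local versus global point concrete. Here the "black box" is assumed to be `sin(x)`, the instance to explain is `x0 = 1.0`, and the kernel width of 0.5 is an arbitrary choice for illustration; the same straight-line fit is run once with uniform weights and once with similarity weights.

```python
import numpy as np

def f(x):
    """Stand-in for a black-box model (assumed for this example)."""
    return np.sin(x)

def weighted_line_slope(x, y, w):
    # Weighted least squares for y ~ a + b * x; returns the slope b.
    A = np.column_stack([np.ones_like(x), x])
    sw = np.sqrt(w)
    coef, *_ = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)
    return coef[1]

x0 = 1.0
xs = np.linspace(-5.0, 5.0, 1001)
ys = f(xs)

# Global fit: every sample counts equally.
global_slope = weighted_line_slope(xs, ys, np.ones_like(xs))

# Local fit: samples are weighted by their similarity to x0.
local_weights = np.exp(-((xs - x0) ** 2) / 0.5 ** 2)
local_slope = weighted_line_slope(xs, ys, local_weights)

true_slope = np.cos(x0)  # actual rate of change of f at x0
print(global_slope, local_slope, true_slope)
```

The locally weighted line tracks how `f` actually behaves near `x0`, while the global line averages over regions that have nothing to do with the prediction being explained.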
Plans ahead with LIME:
The current plan is to add more packages that help users understand as well as interact meaningfully with machine learning.
LIME can currently explain any black-box classifier with two or more classes. All that is required is that the classifier implement a function that takes in raw text or a NumPy array and outputs a probability for each class. Support for scikit-learn classifiers is built in.
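The required function has a simple shape: a batch of inputs goes in, one row of class probabilities per input comes out. The sketch below shows that contract for raw text; the class names and the word-counting "model" are hypothetical placeholders for a real classifier.

```python
import numpy as np

CLASS_NAMES = ["negative", "neutral", "positive"]  # hypothetical labels

def predict_proba(texts):
    """Take a list of raw strings; return an (n_texts, n_classes) array."""
    rows = []
    for text in texts:
        words = text.lower().split()
        # Toy per-class scores; a real model would produce these instead.
        scores = np.array(
            [words.count("bad"), 1.0, words.count("good")], dtype=float
        )
        exp = np.exp(scores - scores.max())
        rows.append(exp / exp.sum())  # softmax, so each row sums to 1
    return np.array(rows)

probs = predict_proba(["good good film", "bad film"])
print(probs.shape)  # one row per text, one column per class
```

With the lime package installed, a function of this form is what gets passed as the classifier function to `LimeTextExplainer(class_names=CLASS_NAMES).explain_instance(...)`. For a scikit-learn classifier or pipeline, the existing `predict_proba` method already satisfies the contract.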
Trust is crucial for effective human interaction with machine learning systems, and explaining individual predictions is thought to be an effective way of assessing trust. LIME is an effective tool for facilitating such trust and a good addition to a machine learning practitioner's tool belt, but there is still plenty of work to be done to better explain machine learning models.
It remains to be seen where this research direction will lead us.
For More Information: GitHub
Link To The Paper: Click Here
KDD2016 paper 573: KDD2016 video