Explainable AI

My research into Explainable AI (XAI) taught me how little I know about AI systems. This mini-essay focuses on Explainable AI: what it is, why it's important, and how we can create it. Generative AI is a whole other topic, and won't be discussed here.

AI systems are trained and operate in a number of ways. Simpler systems use models like decision trees, which essentially consist of a root node that splits the data on some feature, with each sub-node splitting again until a leaf node assigns the input to a bucket. This continues all the way down the line (insert image). As a user, if we wanted to understand how our system reached its final output, it would be incredibly easy to follow the path of classifications and see the result, as in the sketch below. However, decision trees are incredibly simple, and often can't perform the tasks we need. Instead, many of the AI systems we use on a daily basis run on black box models.
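
To make that concrete, here's a minimal sketch of reading a decision tree's reasoning straight off the model. It assumes scikit-learn and its bundled iris dataset purely for illustration; the essay doesn't name a library or dataset.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

# Train a small decision tree on a toy dataset.
iris = load_iris()
tree = DecisionTreeClassifier(max_depth=2, random_state=0)
tree.fit(iris.data, iris.target)

# Print the learned rules: every branch is a human-readable if/else test,
# so the root-to-leaf path explains any individual prediction.
print(export_text(tree, feature_names=list(iris.feature_names)))
```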

So what's a black box model?

Essentially, a black box (or opaque) model is one where the user has no visibility into the inner workings or operations of the system. Simply put, if you asked the AI a question or to perform an analysis, you would receive an output, but you wouldn't be able to figure out how the model arrived at that outcome. Here, it's important to define a few terms. Transparency refers to the ability to 'see' into the model and have visibility over what's going on. Interpretability is the ability to understand the inner workings of the model, and explainability is the ability of the model to explain why it made the decisions it did. These are all distinct terms, albeit with some overlap.
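
As a hedged illustration of what "black box" feels like in practice (again assuming scikit-learn; a small neural network stands in for any opaque model), you get an answer, but the model's internals are just weight matrices rather than readable rules:

```python
from sklearn.datasets import load_iris
from sklearn.neural_network import MLPClassifier

# Train an opaque model: a small neural network.
iris = load_iris()
model = MLPClassifier(hidden_layer_sizes=(50, 50), max_iter=2000, random_state=0)
model.fit(iris.data, iris.target)

# We get a prediction for a single flower...
print(model.predict(iris.data[:1]))
# ...but "how" it was reached is buried in opaque weight matrices.
print([w.shape for w in model.coefs_])
```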

So what's Explainable AI?

This is where a system is able to explain how and **why** it reached its output. The 'how' part of the equation leans more towards interpretability, but the 'why' is where we nail down explainability. The two main methods of achieving this currently are the use of white box models, or post-hoc explainers. "A post-hoc XAI method receives a trained and/or tested AI model as input, then generates useful approximations of the model’s inner working and decision logic by producing understandable representations in the form of feature importance scores, rule sets, heatmaps, or natural language." ([Link](https://arxiv.org/pdf/2005.01992.pdf#:~:text=Post%2Dhoc%20explainability%20can%20be,a%20record%20and%20an%20outcome.))
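
Permutation importance is one example of such a post-hoc, model-agnostic technique. The sketch below assumes scikit-learn and a random forest as the black box (neither is prescribed by the paper quoted above): the explainer takes the already-trained model and scores each feature by how much shuffling it hurts performance, producing the kind of feature importance scores the quote mentions.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

# Train an opaque model first; the explainer only sees it after the fact.
iris = load_iris()
model = RandomForestClassifier(random_state=0).fit(iris.data, iris.target)

# Shuffle one feature at a time and measure how much accuracy drops:
# the bigger the drop, the more the model relied on that feature.
result = permutation_importance(model, iris.data, iris.target,
                                n_repeats=10, random_state=0)
for name, score in zip(iris.feature_names, result.importances_mean):
    print(f"{name}: importance {score:.3f}")
```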

In the interest of keeping these mini essays under 500 words, that's the gist of it.


Key takeaways:

  • Interpretability, transparency and explainability are different

  • Complex models will likely be black box, but post-hoc XAI methods can make them more explainable

  • Simpler systems can use models like decision trees, linear regression or fuzzy inference systems for greater transparency, but have the drawback of not being able to perform tasks that are as complex.