PyData Yerevan 2022

Explainable AI as a Conventional Data Analysis Tool
08-12, 14:30–15:10 (Asia/Yerevan), 213W PAB

The recent surge of interest in Machine Learning (ML) and Artificial Intelligence (AI) has spurred a wide array of models designed to make decisions in a variety of domains, including healthcare [1, 2, 3], financial systems [4, 5, 6, 7], and criminal justice [8, 9, 10], to name just a few. When evaluating alternative models, it may seem natural to prefer those that are more accurate. However, the obsession with accuracy has led to unintended consequences, as developers have often pursued greater accuracy at the expense of interpretability, making their models increasingly complicated and harder to understand [11]. This lack of interpretability becomes a serious concern when the model is entrusted with the power to make critical decisions that affect people’s well-being. These concerns are reflected in the European Union’s General Data Protection Regulation, which guarantees a right to explanation, i.e., a right to understand the rationale behind an algorithmic decision that affects individuals negatively [12]. To address these issues, a number of techniques have been proposed to make the decision-making process of AI more understandable to humans. These “Explainable AI” techniques (commonly abbreviated as XAI) are the primary focus of this talk. The talk will be divided into three sections, during which the audience will learn (i) the differences between existing XAI techniques, (ii) the practical implementation of some well-known XAI techniques, and (iii) possible uses of XAI as a conventional data analysis tool.

  1. The general idea and the difference between existing XAI techniques (Duration ≈ 10 minutes)

     Although black-box models are hard for humans to interpret, they tend to achieve higher prediction accuracy than their transparent counterparts. This trade-off between accuracy and transparency gives rise to the black-box explanation problem, which involves explaining the rationale behind the decisions made by black-box models. By providing such explanations, one can continue to use highly accurate black-box models without sacrificing transparency.

     Black-box explanation problems can be categorized as follows:
     • Model explanation problems, which require explaining the underlying logic behind a black-box model. This is typically done by approximating the black-box behaviour using an alternative model that is more transparent and interpretable.
     • Model inspection problems, which require providing visual or textual explanations of certain properties of the underlying model or its outcome, with the goal of understanding how the black box behaves internally when the input is changed.
     • Outcome explanation problems, which require explaining the model’s outcome for a given instance of interest, either by explaining how the outcome was generated or by explaining how the outcome can be changed using counterfactual analysis.

     Each of the above problems can be addressed using different types of techniques, which can be classified as follows:
     • Model-specific techniques, which exploit the parameters and features of the model they are designed to explain. The power of such techniques stems from their ability to access the model internals, such as weights or structure, but this power comes at a price, since they cannot readily be generalised to other models.
     • Model-agnostic techniques, which, in principle, can be used on any machine learning model to provide post-hoc explanations, i.e., explanations that are generated after the model has been trained [13]. The disadvantage of these techniques is that they cannot take advantage of model internals, since they are only capable of analysing input-output pairs.

     1.1 Takeaway
     The key takeaway from this section is an understanding of the different types of existing XAI techniques. The audience will learn which technique can be used in which situation and what to expect as an outcome.
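To make the idea of a model-agnostic outcome explanation concrete, here is a minimal Python sketch that computes exact Shapley attributions for a single instance using nothing but input-output queries. It is an illustration under stated assumptions, not the method presented in the talk: the scoring function `black_box`, its three features, and the all-zeros baseline are hypothetical, and the exhaustive enumeration is only feasible for a handful of features (libraries such as SHAP rely on approximations instead).

```python
from itertools import combinations
from math import factorial


def black_box(x):
    # Hypothetical opaque scoring function (an assumption for illustration).
    # The explainer below only calls it and never inspects its internals.
    income, debt, age = x
    return 2.0 * income - 3.0 * debt + 0.5 * income * age


def shapley_values(model, instance, baseline):
    """Exact Shapley attributions of model(instance) relative to model(baseline)."""
    n = len(instance)

    def value(coalition):
        # Features in the coalition keep the instance's values;
        # all other features are replaced by the baseline values.
        mixed = [instance[i] if i in coalition else baseline[i] for i in range(n)]
        return model(mixed)

    attributions = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        phi = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude feature i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                # Marginal contribution of feature i when added to coalition s.
                phi += weight * (value(s | {i}) - value(s))
        attributions.append(phi)
    return attributions


instance = [4.0, 1.0, 2.0]   # hypothetical instance of interest
baseline = [0.0, 0.0, 0.0]   # hypothetical reference point
phi = shapley_values(black_box, instance, baseline)
print(phi)
```

The attributions sum to the difference between the model's output on the instance and on the baseline, and the interaction term between the first and third features is split evenly between them. The cost grows as 2^n model calls per feature, which is why this exact form is useful mainly as a reference for small problems.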

  3. Explainable AI as a conventional data analysis tool (Duration ≈ 10 minutes)

     Motivated by the wealth of innovative XAI techniques proposed in recent years, some researchers have explored the possibility of using these techniques outside their originally intended scope. XAI techniques can be used to provide insights into the data set itself, rather than to explain ML models and their outcomes. Seen this way, the growing literature on XAI can support existing data analysis methods. Arguably, one of the most widely applied such methods is regression analysis, which is typically used to estimate the relationship between a dependent variable and one or more independent features. Importantly, the way in which regression analysis works is fundamentally different from the way in which LIME (Local Interpretable Model-agnostic Explanations) and SHAP (SHapley Additive exPlanations) work. The former provides a global view of the data, whereas the latter two provide a local view. More specifically, LIME and SHAP provide insights on a specific input instance, whereas regression estimates a high-level relationship, taking into consideration the entire data set. Thus, the key observation is that the high-level view (which is often the primary concern of data analysts) may benefit from additional details obtained by zooming into the different instances, i.e., by applying LIME and SHAP not on a single instance of interest, as originally intended by the developers of these techniques, but rather on all instances in the data set.

     3.1 Takeaway
     The key takeaway from this section is a demonstration of other possible uses of XAI techniques. Some examples will show how data analysts could benefit from the additional insights into the data set.
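The contrast between the global view of regression and the local view obtained by explaining every instance can be sketched in a few lines of Python with NumPy. Everything here is an assumption made for illustration: the synthetic data, the function `black_box` (a stand-in for a trained model), and the choice of the column means as baseline. The attributions are exact two-feature Shapley values computed directly, not output from the LIME or SHAP libraries.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000
X = rng.normal(size=(n, 2))  # synthetic data set with two features


def black_box(X):
    # Hypothetical stand-in for a trained model: feature 0 only acts
    # through an interaction with feature 1; feature 1 also has a
    # direct linear effect.
    return X[:, 0] * X[:, 1] + 0.5 * X[:, 1]


y = black_box(X)

# Global view: ordinary least squares over the entire data set.
A = np.column_stack([np.ones(n), X])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)

# Local view: exact two-feature Shapley values for every instance,
# with absent features replaced by the baseline (column means).
base = np.tile(X.mean(axis=0), (n, 1))
only_0 = np.column_stack([X[:, 0], base[:, 1]])  # feature 0 present only
only_1 = np.column_stack([base[:, 0], X[:, 1]])  # feature 1 present only
f_none, f_0, f_1, f_both = black_box(base), black_box(only_0), black_box(only_1), y

phi0 = 0.5 * (f_0 - f_none) + 0.5 * (f_both - f_1)
phi1 = 0.5 * (f_1 - f_none) + 0.5 * (f_both - f_0)

# Aggregate the per-instance attributions into a data-set-level summary.
mean_abs = np.abs(np.column_stack([phi0, phi1])).mean(axis=0)

print("global OLS slopes:", coef[1:])
print("mean |attribution| per feature:", mean_abs)
```

In this synthetic setting the regression slope for feature 0 is close to zero, because its positive and negative effects cancel across the data set, while its per-instance attributions are clearly non-zero. That gap is the kind of additional detail the local view can contribute to a conventional, global analysis.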

Prior Knowledge Expected

Previous knowledge expected

Maria Sahakyan is a postdoctoral associate at New York University Abu Dhabi specializing in Explainable Artificial Intelligence (XAI). She earned her Ph.D. in Interdisciplinary Engineering from Khalifa University, Abu Dhabi, where she also specialized in XAI.