Methods of explainable-AI (ex-AI)

Explainable-AI Pythia - the high priestess of the temple of Apollo at Delphi
Summer term 2018 (date and venue tba.) – based on contents from the iML courses of previous years
3 ECTS course (this page is valid as of 25.05.2018 – 11:00 CEST)

INTRODUCTION: 6 min YouTube video on explainable AI

MOTIVATION for this course:

Methods of explainable AI is a natural offspring of the interactive Machine Learning (iML) courses held in previous years. A strong motivation for us to continue studying our iML concept [1] – with a human in the loop [2] (see our project page) – is that modern AI/machine learning models (see the difference AI-ML) are often considered to be “black-boxes” [3]. This is of course not quite true; however, even if we understand the underlying mathematical principles, it is difficult to re-enact why a certain machine decision has been reached. A serious general drawback is that such models have no explicit declarative knowledge representation and hence have difficulty in generating the required explanatory structures – the context – which considerably limits the achievement of their full potential [4]. Most of all, an end-user such as an expert decision maker (e.g. a medical doctor) is not interested in the internals of any model; she/he wants to retrace a result on demand in a way that is both understandable and usable for humans. Consequently, AI usability is experiencing a renaissance within the engineering field.

GOAL of this course

This graduate course follows a research-based teaching (RBT) approach and provides an overview of the current state-of-the-art methods for making AI transparent, re-traceable, re-enactable, understandable and, consequently, explainable.


Explainability is motivated by the lack of transparency of so-called black-box approaches, which do not foster trust in and acceptance of AI in general and ML in particular. Rising legal and privacy concerns, e.g. the new European General Data Protection Regulation (which comes into effect in May 2018), will make black-box approaches difficult to use in business, because they are often not able to explain why a machine decision has been made (see explainable AI).
Consequently, the field of Explainable AI is emerging, because rising legal, ethical, and social concerns make it mandatory to enable a human, on request, to understand and to explain why a machine decision has been made [see Wikipedia on Explainable Artificial Intelligence]. Note: this does not mean that it is always necessary to explain everything, but that it must be possible to explain a decision if necessary, e.g. for general understanding, for teaching, for learning, for research, in court, or even on demand by a citizen (right to explanation).


This course is aimed at research students of Computer Science who are interested in knowledge discovery/data mining following the idea of iML approaches, i.e. human-in-the-loop learning systems. This is a cross-disciplinary computer science topic and highly relevant for application in complex domains such as health, biomedicine, paleontology and biology, and in safety-critical domains, e.g. cyber defense.


The content includes but is not limited to:

  • Performance vs Interpretability
  • Motivation for transparent AI, re-traceability, understandability, interpretability, explainability
  • Motivation use cases from the medical domain
  • Early AI = interpretable: examples of early expert systems from the medical domain (MYCIN, GAMUTS in Radiology, …)
  • Medical Expert Systems > Intelligent Tutoring Systems
  • Physics > Physiology > Psychology (signal processing > perception > cognition)
  • What is interpretable by a human? What is not interpretable by a human?
  • Global vs. local explainability
  • Explainability – Interpretability – Understandability
  • Ante-hoc vs. Post-hoc interpretability
  • Model interpretability vs. Data interpretability (Focusing on data vs. focusing on model)
  • Compliance with new legislation (European General Data Protection Regulation – right to explanation;
    note: this is particularly important for young start-up companies in AI and machine learning)
  • Techniques of Explainability (related work)
  • Explainable Medicine: Ensemble view vs. patients view (stakeholder view)
  • Transparency and trust – (how) can we measure trust?
  • Making Deep Neural Networks transparent
  • Activation Maximization Method
  • POST-HOC (… selecting an appropriate model for the given problem and developing a special technique to interpret it)
    • LIME
    • BETA
    • LRP
  • ANTE-HOC (… selecting a model that is already explainable and training it on the problem)
    • iML
    • GAM
    • Stochastic-AOG
    • Hybrid models
    • Deep Symbolic Networks (Deep learning transparency)
  • Example: interactive Machine Learning with the Human-in-the-Loop
  • Example: Interpretable Deep Learning Model (making neural networks transparent)


[1] Andreas Holzinger, Chris Biemann, Constantinos S. Pattichis & Douglas B. Kell (2017). What do we need to build explainable AI systems for the medical domain? arXiv:1712.09923.

[2] Andreas Holzinger, Bernd Malle, Peter Kieseberg, Peter M. Roth, Heimo Müller, Robert Reihs & Kurt Zatloukal (2017). Towards the Augmented Pathologist: Challenges of Explainable-AI in Digital Pathology. arXiv:1712.06657.

[3] Andreas Holzinger, Markus Plass, Katharina Holzinger, Gloria Cerasela Crisan, Camelia-M. Pintea & Vasile Palade (2017). A glass-box interactive machine learning approach for solving NP-hard problems with the human-in-the-loop. arXiv:1708.01104.

[4] Andreas Holzinger (2016). Interactive Machine Learning for Health Informatics: When do we need the human-in-the-loop? Brain Informatics, 3, (2), 119-131, doi:10.1007/s40708-016-0042-6.

[5] Andreas Holzinger (2018). Explainable AI (ex-AI). Informatik-Spektrum, 41, (2), 138-143, doi:10.1007/s00287-018-1102-5.

Some Quick Explanations:

Ante-hoc Explainability (AHE) := such models are interpretable by design, e.g. glass-box approaches; typical examples include linear regression, decision trees and fuzzy inference systems. They have a long tradition, can be designed from expert knowledge or from data, and are useful as a framework for the interaction between human knowledge and hidden knowledge in the data.
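To make "interpretable by design" concrete, here is a minimal sketch of the smallest possible decision tree, a one-level decision stump on a single feature. The function names (fit_stump, predict_with_reason) and the toy temperature readings are hypothetical and chosen only for illustration; they are not taken from the course material.

```python
from typing import List, Tuple

Stump = Tuple[float, int, int]  # (threshold, label if x <= threshold, label if x > threshold)


def fit_stump(xs: List[float], ys: List[int]) -> Stump:
    """Pick the threshold and label assignment with the fewest training errors."""
    values = sorted(set(xs))
    if len(values) < 2:
        raise ValueError("need at least two distinct feature values")
    # Candidate thresholds are the midpoints between neighbouring distinct values.
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]
    best = None
    for t in candidates:
        for left, right in ((0, 1), (1, 0)):
            errors = sum((left if x <= t else right) != y for x, y in zip(xs, ys))
            if best is None or errors < best[0]:
                best = (errors, t, left, right)
    _, t, left, right = best
    return t, left, right


def predict_with_reason(stump: Stump, x: float) -> Tuple[int, str]:
    """Return the prediction together with the rule that produced it."""
    t, left, right = stump
    if x <= t:
        return left, f"value {x} <= threshold {t:.2f}"
    return right, f"value {x} > threshold {t:.2f}"


# Toy data (made up for illustration): body temperature in degrees Celsius, label 1 = fever.
temperatures = [36.5, 36.8, 37.0, 38.2, 38.9, 39.4]
labels = [0, 0, 0, 1, 1, 1]

stump = fit_stump(temperatures, labels)
label, reason = predict_with_reason(stump, 39.0)
print(f"predicted {label} because {reason}")
```

The whole model is a single readable rule, so the explanation is the model itself; no additional technique is needed to retrace a decision.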

Explainability := motivated by the opaqueness of so-called “black-box” approaches, it is the ability to provide an explanation of why a machine decision has been reached (e.g. why the deep network recognized a cat). Finding an appropriate explanation is difficult, because it requires understanding the context and providing a description of the causality and consequences of a given fact. (German: Erklärbarkeit; see also: Verstehbarkeit, Nachvollziehbarkeit, Zurückverfolgbarkeit, Transparenz)

Explanation := a set of statements that describe a given set of facts in order to clarify their causality, context and consequences; it is a core topic of knowledge discovery involving “why” questions (“Why is this a cat?”). (German: Erklärung, Begründung)

Explanatory power := the ability of a set of hypotheses to effectively explain the subject matter it pertains to (opposite: explanatory impotence).

European General Data Protection Regulation (EU GDPR) := Regulation (EU) 2016/679 (see EUR-Lex 32016R0679); it will make black-box approaches difficult to use, because they are often not able to explain why a decision has been made (see explainable AI).

Interactive Machine Learning (iML) := machine learning algorithms which can interact with agents, some of which may be human, and can optimize their learning behaviour through this interaction. Holzinger, A. 2016. Interactive Machine Learning for Health Informatics: When do we need the human-in-the-loop? Brain Informatics (BRIN), 3, (2), 119-131.

Post-hoc Explainability (PHE) := such approaches provide local explanations for a specific decision and make it possible to re-enact that decision on request; typical examples include LIME and BETA.
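The idea of a local explanation can be sketched as follows: perturb the input around the instance of interest, query the black box, and fit a proximity-weighted linear surrogate whose coefficients say how strongly each feature influences the output near that instance. This is a simplified illustration in the spirit of LIME, not the LIME library or its exact algorithm; the function names, the toy black box and all parameter values (n_samples, sigma, kernel_width) are assumptions made for this sketch.

```python
import math
import random
from typing import Callable, List, Sequence


def solve_linear_system(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def local_surrogate(
    black_box: Callable[[Sequence[float]], float],
    x: Sequence[float],
    n_samples: int = 200,
    sigma: float = 0.1,
    kernel_width: float = 0.2,
    seed: int = 0,
) -> List[float]:
    """Return one local linear coefficient per feature around the instance x."""
    rng = random.Random(seed)
    dim = len(x)
    a = [[0.0] * (dim + 1) for _ in range(dim + 1)]
    b = [0.0] * (dim + 1)
    for _ in range(n_samples):
        delta = [rng.gauss(0.0, sigma) for _ in range(dim)]
        y = black_box([xi + di for xi, di in zip(x, delta)])
        # Samples closer to x get a larger weight (exponential kernel).
        weight = math.exp(-sum(d * d for d in delta) / kernel_width ** 2)
        row = [1.0] + delta  # intercept plus offsets from x
        for i in range(dim + 1):
            b[i] += weight * row[i] * y
            for j in range(dim + 1):
                a[i][j] += weight * row[i] * row[j]
    coefficients = solve_linear_system(a, b)  # weighted least-squares normal equations
    return coefficients[1:]  # drop the intercept


def toy_black_box(z: Sequence[float]) -> float:
    """Stand-in for an opaque model; its true gradient at (1, 0) is (2, 3)."""
    return z[0] ** 2 + 3.0 * z[1]


weights = local_surrogate(toy_black_box, [1.0, 0.0])
print("local feature weights:", [round(w, 2) for w in weights])
```

The surrogate is only valid near the chosen instance: repeating the call at a different point, e.g. [3.0, 0.0], gives a different weight for the first feature, which is exactly what "local" means here.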

Preference learning (PL) := concerns problems in learning to rank, i.e. learning a predictive preference model from observed preference information, e.g. with label ranking, instance ranking, or object ranking.  Fürnkranz, J., Hüllermeier, E., Cheng, W. & Park, S.-H. 2012. Preference-based reinforcement learning: a formal framework and a policy iteration algorithm. Machine Learning, 89, (1-2), 123-156.

Multi-Agent Systems (MAS) := collections of several independent agents, which can also be a mixture of computer agents and human agents. An excellent pointer to the latter is: Jennings, N. R., Moreau, L., Nicholson, D., Ramchurn, S. D., Roberts, S., Rodden, T. & Rogers, A. 2014. On human-agent collectives. Communications of the ACM, 80-88.

Transfer Learning (TL) := the ability of an algorithm to recognize and apply knowledge and skills learned in previous tasks to novel tasks or new domains which share some commonality. Central question: given a target task, how do we identify the commonality between the task and previous tasks, and transfer the knowledge from the previous tasks to the target one? Pan, S. J. & Yang, Q. 2010. A Survey on Transfer Learning. IEEE Transactions on Knowledge and Data Engineering, 22, (10), 1345-1359, doi:10.1109/tkde.2009.191.


Time Line of relevant events for interactive Machine Learning (iML):

According to John Launchbury from DARPA (watch his excellent video on YouTube), three waves of AI can be distinguished:

Wave 1: Handcrafted Knowledge – enables reasoning, explainability and re-traceability over narrowly defined problems; this is what we call classic AI, and it consists mostly of rule-based systems.
Wave 2: Statistical Learning – black-box models with no contextual capability and minimal reasoning ability, which need big data; their strength is that probabilistic learning models can cope with stochastic and non-deterministic problems.
Wave 3: Contextual Adaptation – systems may construct contextual explanatory models for classes of real-world phenomena, and glass-box models allow decisions to be re-enacted in order to answer the question of “why” a machine decision has been reached. This is what humans can do very well, and thus, if we want to excel in this area, we have to understand the underlying principles of intelligence.

Glossary (incomplete)

Dimension = n attributes which jointly describe a property.

Features = any measurements, attributes or traits representing the data. Features are key for learning and understanding. Andrew Ng emphasizes that machine learning is mostly feature engineering.

Reals = numbers expressible as finite/infinite decimals.

Regression = predicting the value of a random variable y from a measurement x.
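As a small worked illustration of this glossary entry, the following sketch fits a straight line y ≈ slope · x + intercept by ordinary least squares and uses it to predict y for a new measurement x. The function name fit_line and the toy numbers are assumptions made for this example.

```python
from typing import List, Tuple


def fit_line(xs: List[float], ys: List[float]) -> Tuple[float, float]:
    """Ordinary least squares for one input variable: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # slope = covariance(x, y) / variance(x)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Toy measurements (made up): they lie exactly on the line y = 2x + 1.
slope, intercept = fit_line([1.0, 2.0, 3.0, 4.0], [3.0, 5.0, 7.0, 9.0])
prediction = slope * 5.0 + intercept  # predicted y for a new measurement x = 5
print(slope, intercept, prediction)
```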

Reinforcement learning = adaptive control, i.e. to learn how to (re-)act in a given environment, given delayed/non-deterministic rewards. Human learning is mostly reinforcement learning.

Historic People (incomplete)

Bayes, Thomas (1702-1761) gave a straightforward definition of probability [Wikipedia]

Laplace, Pierre-Simon, Marquis de (1749-1827) developed the Bayesian interpretation of probability [Wikipedia]

Price, Richard (1723-1791) edited and commented on the work of Thomas Bayes in 1763 [Wikipedia]

Tukey, John Wilder (1915-2000), together with Frederick Mosteller, suggested in 1962 the name “data analysis” for computational statistical sciences, which much later became known as data science [Wikipedia]

Antonyms (incomplete)

big data sets < > small data sets

correlation < > causality

discriminative < > generative

Frequentist < > Bayesian

independent and identically distributed data (IID) < > non-independent and identically distributed data (non-IID)

low dimensional < > high dimensional

underfitting < > overfitting

parametric < > non-parametric

supervised < > unsupervised