The Problem with explainable-AI

A very nice and interesting article by Rudina SESERI in the recent TechCrunch blog (read the orginal blog entry below): at first Rudina points out that the main problem is in data; and yes, indeed, data should always be the first consideration. We consider it a big problem that successful ML approaches (e.g. the mentioned deep learning, our PhD students can tell you a thing or two about it 😉 greatly benefit from big data (the bigger the better) with many training sets; However, it certain domain, e.g. in the health domain we sometimes are confronted with a small number of data sets or rare events, where we suffer of insufficient training samples [1]. This calls for more research towards how we can learn from little data (zero-shot learning), similar as we humans do: Rudina does not need to show her children 10 million samples of a dog and a cat, so that her children can safely discriminate a dog from a cat. However, what I miss in this article is something different, the word trust. Can we trust our machine learning results? [2] Whilst, for sure we do not need to explain everything all the time, we need possibilities to make machine decisions transparent on demand and to check if something could be plausible. Consequently, Explainable AI can be very important to foster trust in machine learning specifically and artificial intelligence generally.





MIT emphasizes the importance of HCI for explainable AI

In a joint project “The car can explain” with the TOYOTA Research Institute  the MIT Computer Science & Artificial Intelligence Lab  are working on explainable AI and emphasize the increasing importance of the field of HCI (Human-Computer Interaction) in this regard. Particularly, the group led by Lalana KAGAL is working on monitors for reasoning and explaining: a methodological tool to interpret and detect inconsistent machine behavior by imposing constraints of reasonableness. “Reasonable monitors” are implemented as two types of interfaces around their complex AI/ML frameworks. Local monitors check the behavior of a specific subsystem, and non-local reasonableness monitors watch the behavior of multiple subsystems working together: neighborhoods of interconnected subsystems that share a common task. This enormously interesting monitoring consistently checks that the neighborhood of subsystems are cooperating as expected. Insights of this projects could also be valuable for the health informatics domain:

Google Brain says Explainability is the “new deep learning”

There is a very interesting interview in the Talking Machines*) series from May, 31, 2018. Katherine GORMAN interviews Maithra RAGHU **) from the Google Brain Team, where she mentioned that “explainability is the new deep learning”, and it is particularly important for health informatics, where it is important to re-trace, re-enact and to understand and explain why a machine decision has been reached. This is super for us, because when I tell my students that this is important, nobody believes me; but now I can emphasize that not I am saying that, but Google Brain is saying it. Excellent.

However, the whole field needs a lot of work, before we can provide useable solutions for the end-user in daily routine (e.g. a medical doctor); urgently needed are approaches to explainable User Interfaces and most of all a research framework for testing explainability.

*) Talking Machine is an excellent, highly recommendable Podcast series, founded by Katharine GORMAN and Ryan ADAMS in 2015 and now run by Katharine together with Neil LAWRENCE (who leads the Amazon Research in Cambridge, UK).

**) Maithra RAGHU is currently a PhD working with Jon KLEINBERG at Cornell (see ), where she is doing extended research with the Google Brain Team, see:
Maithra has published some very interesting papers, e.g.: Maithra Raghu, Justin Gilmer, Jason Yosinski & Jascha Sohl-Dickstein. SVCCA: Singular Vector Canonical Correlation Analysis for Deep Learning Dynamics and Interpretability. Advances in Neural Information Processing Systems, 2017. 6078-6087.
or this is also very interesting:
Ben Poole, Subhaneil Lahiri, Maithra Raghu, Jascha Sohl-Dickstein & Surya Ganguli. Exponential expressivity in deep neural networks through transient chaos. Advances in neural information processing systems, 2016. 3360-3368.

Google Brain says we urgently need a Research Framework around the field of interpretability

In a recent interview Been KIM from the Google Brain team emphasizes the significance of research in explainable AI. Particularly, she emphasized the importance of Human-Computer Interaction (HCI) for Artificial Intelligence generally and Machine Learning specifically (see the differences between AI and ML here), and the urgent need of an research framework around the field of interpretability. Listen to the episode six of season four of Talking Machines by Katherine GORMAN and Neil LAWRENCE here (Start at approx. 26:00):

Been KIM is a research scientist at the Google Brain team and is interested in designing machine learning methods that  make sense to humans. Her current focus is building interpretability methods for already-trained models (e.g., high performance neural networks). In particular, she believes that the language of explanations should include higher-level, human-friendly concepts.  Been gave a tutorial on explainable AI at ICML 2017 and recently the group published the paper: Menaka Narayanan, Emily Chen, Jeffrey He, Been Kim, Sam Gershman & Finale Doshi-Velez 2018. How do Humans Understand Explanations from Machine Learning Systems? An Evaluation of the Human-Interpretability of Explanation. arXiv:1802.00682.

“Machine Learning for Health Informatics” Lecture Notes in Artificial Intelligence 9605 > 40,626 downloads 2017

Since its online publication on December 10, 2016 the Volume edited by Andreas Holzinger “Machine Learning for Health Informatics” Springer Lecture Notes in Artificial Intelligence LNAI Volume 9605, has been downloaded 54,960 times as of today (May, 11, 2018, 20:00 CEST) and 44,988 with status as of April 2018 according to the official Springer Bookmetrix book performance report – a record; and alone in the year 2017 40,626 downloads, which is 10 times higher than a typical volume of the series of Lecture Notes in Artificial Intelligence by Springer/Nature. A cordial thank you for my international colleagues for this huge acceptance!

A popular passage from the book:



AI will change Radiology – NOT replace Radiologists

After the rather shocking statement of Geoffrey HINTON during the Machine Learning and Market for Intelligence Conference in Toronto, where he recommended that hospitals should stop training radiologists, because deep learning will replace them (watch video below), on March, 27, 2018 Thomas H. DAVENPORT and Keith J. DREYER published a really nice article on “AI will change radiology, but it won’t replace radiologists” (see [1]) – which supports our human-in-the-loop approach: for sure, AI/machine learning (difference here) will change workflows, but we envision that the expert will be augmented by new technologies, i.e. routine (boring) tasks will be replaced by automatic algorithms, but this will free up expert time to spent on challenging (cool) tasks and more research – and there are plenty of problems where we need human intelligence!




A good proof of the importance of the HCI-KDD approach, worth: 2,1 Billion USD !

Our strategic aim is to find solutions for data intensive problems by the combination of two areas, which bring ideal pre-conditions towards understanding intelligence and to bring business value in AI: Human-Computer Interaction (HCI) and Knowledge Discovery (KDD). HCI deals with questions of human intelligence, whereas KDD deals with questions of artificial intelligence, in particular with the development of scalable algorithms for finding previously unknown relationships in data, thus centers on automatic computational methods. A proverb attributed perhaps incorrectly to Albert Einstein illustrates this perfectly: “Computers are incredibly fast, accurate, but stupid. Humans are incredibly slow, inaccurate, but brilliant. Together they may be powerful beyond imagination” [1].

An article published on February, 18, 2018 by David Shaywitz [2] from Forbes reports on the recent purchase of  the oncolology data company Flatiron Health for the enormous sum of 2,1 Billion USD (remember: Deep Mind was purchased by Google for a mere 400 million GBP 😉

This supports a few hypotheses which I try to convince my students all the time (but they won’t believe me unless Google is doing it 😉

a) those who can turn raw health data into insights and understandable knowledge can produce value
b) data – and particularly big data – is useless for the decision maker, what they need is reliable, valuable and trustworthy information
c) for the complexity of sensemaking from health data we (still) need a human-in-the-loop:  Humans (still) exceed machine performance in understanding the context and explaining the underlying explanatory factors of the data
d) consequently this is a good example for the business value of our HCI-KDD approach: Let the computer find in arbitrarily high-dimensional spaces what no human is able to do – but let the human do what no computer is able to do: BOTH working together are powerful beyond imagination!

Flatiron Health [3] is a company which is specialized on health data curation, supported by technology of course, but mostly done manually by human experts in the Mechanical Turk style. Remark: The name mechanical turk has historic origins as it was inspired by an automatic 18th-century chess-playing machine by Wolfgang von Kempelen,  that beats e.g. Benjamin Franklin in chess playing – and was acclaimed as “AI”. However, ti was later revealed that it was neither a machine nor an automatic device – in fact it was a human chess master hidden in a secret space under the chessboard and controlling the movements of an humanoid dummy. Similarly,  services which help to solve problems via human intelligence are called “Mechanical Turk online services”.

[1] Holzinger, A. 2013. Human–Computer Interaction and Knowledge Discovery (HCI-KDD): What is the benefit of bringing those two fields to work together? In: Cuzzocrea, Alfredo, Kittl, Christian, Simos, Dimitris E., Weippl, Edgar & Xu, Lida (eds.) Multidisciplinary Research and Practice for Information Systems, Springer Lecture Notes in Computer Science LNCS 8127. Heidelberg, Berlin, New York: Springer, pp. 319-328, doi:10.1007/978-3-642-40511-2_22



On-Device Machine Intelligence

One very interesting approach of federated machine learning is presented by Sujith Ravi from Google: Machine learning models (e.g. CNN) are sucessfully used for the design of intelligent systems capable of visual recognition, speech and language understanding. Most of these are running on a cloud – which is often inpredictable where it is physically running. A huge problem so far is that typical machine learning models are awkward to use on mobile devices due to both computational and memory constraints. While these devices could make use of models running on high-performance data centers with CPUs or GPUs, this is not feasible for many applications and scenarios where inference needs to be performed directly “on” device. This requires re-thinking existing machine learning algorithms and coming up with new models that are directly optimized for on-device machine intelligence rather than doing post-hoc model compression. Sujith Ravi is introducing a novel “projection-based” machine learning system for training compact neural networks. The approach uses a joint optimization framework to simultaneously train a “full” deep network and a lightweight “projection” network. Unlike the full deep network, the projection network uses random projection operations that are efficient to compute and operates in bit space yielding a low memory footprint. The system is trained end-to-end using backpropagation. Ravi shows that the approach is flexible and easily extensible to other machine learning paradigms, for example, they can learn graph-based projection models using label propagation. The trained “projection” models are then directly used for inference, please watch the origial video on:


Prefetching – Predicting what will be most likely needed next

A very interesting paper has just been published  about prefetching, which is a nice machine learning solution: predicting which information will be most likely useful next and consequently can be prepared in advance:

Milad Hashemi, Kevin Swersky, Jamie A Smith, Grant Ayers, Heiner Litz, Jichuan Chang, Christos Kozyrakis & Parthasarathy Ranganathan 2018. Learning Memory Access Patterns. arXiv preprint arXiv:1803.02329.

Prefetching is the process of predicting future memory accesses that will miss in the on-chip cache and access memory based on past history. Each of these memory addresses are generated by a memory instruction (a load/store). Memory instructions are a subset of all instructions that interact with
the addressable memory of the computer system.


There is a nice article in the MIT Technology Review by Will Knight on March, 8, 2018 on the similarities on how human improve their behaviour with age – a very nice read:

iML with the human-in-the-loop mentioned among 10 coolest applications of machine learning

Within the “Two Minute Papers” series, Karol Károly Zsolnai-Fehér from the Institute of Computer Graphics and Algorithms at the Vienna University of Technology mentions among “10 even cooler Deep Learning Applications” our human-in-the-loop paper:

Seid Muhie Yimam, Chris Biemann, Ljiljana Majnaric, Šefket Šabanović & Andreas Holzinger 2016. An adaptive annotation approach for biomedical entity and relation recognition. Springer/Nature: Brain Informatics, 3, (3), 157-168, doi:10.1007/s40708-016-0036-4

Watch the video here (iML is mentinoned from approx. 1:20):

Here the list of all 10 papers discussed within this 2-minutes-video

1. Geolocation –
2. Super-resolution –
3. Neural Network visualizer –…
4. Recurrent neural network for sentence completion:
5. Human-in-the-loop and Doctor-in-the-loop:
6. Emoji suggestions for images –
7. MNIST handwritten numbers in HD –
8. Deep Learning solution to the Netflix prize –
9. Curating works of art –
10. More robust neural networks against adversarial examples –…
The Keras library:

A) The basic principle of the iML human-in-the-loop approach:

Andreas Holzinger 2016. Interactive Machine Learning for Health Informatics: When do we need the human-in-the-loop? Brain Informatics, 3, (2), 119-131, doi:10.1007/s40708-016-0042-6

B) The entry in the GI Lexikon:

C) The experimental proof-of-concept:

Andreas Holzinger, Markus Plass, Katharina Holzinger, Gloria Cerasela Crisan, Camelia-M. Pintea & Vasile Palade 2017. A glass-box interactive machine learning approach for solving NP-hard problems with the human-in-the-loop. arXiv:1708.01104.

D) Outline and Survey of application possibilities:

Andreas Holzinger, Chris Biemann, Constantinos S. Pattichis & Douglas B. Kell 2017. What do we need to build explainable AI systems for the medical domain? arXiv:1712.09923.

Andreas Holzinger, Bernd Malle, Peter Kieseberg, Peter M. Roth, Heimo Müller, Robert Reihs & Kurt Zatloukal 2017. Towards the Augmented Pathologist: Challenges of Explainable-AI in Digital Pathology. arXiv:1712.06657.