A good proof of the importance of the HCI-KDD approach, worth 2.1 billion USD!

Our strategic aim is to find solutions for data-intensive problems by combining two areas that offer ideal preconditions for understanding intelligence and for bringing business value to AI: Human-Computer Interaction (HCI) and Knowledge Discovery (KDD). HCI deals with questions of human intelligence, whereas KDD deals with questions of artificial intelligence, in particular with the development of scalable algorithms for finding previously unknown relationships in data, and thus centers on automatic computational methods. A proverb attributed, perhaps incorrectly, to Albert Einstein illustrates this perfectly: “Computers are incredibly fast, accurate, but stupid. Humans are incredibly slow, inaccurate, but brilliant. Together they may be powerful beyond imagination” [1].

An article published on February 18, 2018 by David Shaywitz [2] in Forbes reports on the recent purchase of the oncology data company Flatiron Health for the enormous sum of 2.1 billion USD (remember: DeepMind was purchased by Google for a mere 400 million GBP 😉).

This supports a few hypotheses of which I try to convince my students all the time (but they won’t believe me unless Google is doing it 😉):

a) those who can turn raw health data into insights and understandable knowledge can produce value
b) data – and particularly big data – is useless for the decision maker; what they need is reliable, valuable and trustworthy information
c) for the complexity of sensemaking from health data we (still) need a human-in-the-loop: humans (still) exceed machine performance in understanding the context and explaining the underlying factors of the data
d) consequently this is a good example for the business value of our HCI-KDD approach: Let the computer find in arbitrarily high-dimensional spaces what no human is able to do – but let the human do what no computer is able to do: BOTH working together are powerful beyond imagination!

Flatiron Health [3] is a company which specializes in health data curation, supported by technology of course, but mostly done manually by human experts in the Mechanical Turk style. Remark: The name Mechanical Turk has historic origins: it was inspired by a supposedly automatic 18th-century chess-playing machine by Wolfgang von Kempelen, which beat, e.g., Benjamin Franklin at chess and was acclaimed as “AI”. However, it was later revealed that it was neither a machine nor an automatic device – in fact, a human chess master was hidden in a secret space under the chessboard, controlling the movements of a humanoid dummy. Similarly, services which help to solve problems via human intelligence are called “Mechanical Turk online services”.

[1] Holzinger, A. 2013. Human–Computer Interaction and Knowledge Discovery (HCI-KDD): What is the benefit of bringing those two fields to work together? In: Cuzzocrea, Alfredo, Kittl, Christian, Simos, Dimitris E., Weippl, Edgar & Xu, Lida (eds.) Multidisciplinary Research and Practice for Information Systems, Springer Lecture Notes in Computer Science LNCS 8127. Heidelberg, Berlin, New York: Springer, pp. 319-328, doi:10.1007/978-3-642-40511-2_22

[2] https://www.forbes.com/sites/davidshaywitz/2018/02/18/the-deeply-human-core-of-roches-2-1b-tech-acquisition-and-why-they-did-it/#6242fdbc29c2

[3] https://flatiron.com

On-Device Machine Intelligence

One very interesting approach to federated machine learning is presented by Sujith Ravi from Google: Machine learning models (e.g. CNN) are successfully used for the design of intelligent systems capable of visual recognition, speech and language understanding. Most of these run in a cloud, where it is often unpredictable where the computation physically takes place. A huge problem so far is that typical machine learning models are awkward to use on mobile devices due to both computational and memory constraints. While these devices could make use of models running in high-performance data centers with CPUs or GPUs, this is not feasible for many applications and scenarios where inference needs to be performed directly “on” the device. This requires re-thinking existing machine learning algorithms and coming up with new models that are directly optimized for on-device machine intelligence rather than doing post-hoc model compression. Sujith Ravi introduces a novel “projection-based” machine learning system for training compact neural networks. The approach uses a joint optimization framework to simultaneously train a “full” deep network and a lightweight “projection” network. Unlike the full deep network, the projection network uses random projection operations that are efficient to compute, and it operates in bit space, yielding a low memory footprint. The system is trained end-to-end using backpropagation. Ravi shows that the approach is flexible and easily extensible to other machine learning paradigms; for example, graph-based projection models can be learned using label propagation. The trained “projection” models are then directly used for inference. Please watch the original video:
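To make the idea of a random projection into bit space more tangible, here is a minimal sketch in plain Python. It is not Ravi's system: it only illustrates the generic building block of sign-based random projections, where each bit records on which side of a random hyperplane an input lies. The function names, the number of bits and the example vectors are assumptions chosen for illustration.

```python
import random

def make_projection(num_bits, dim, seed=0):
    """Draw `num_bits` random hyperplanes in `dim` dimensions (fixed, not trained)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]

def project_to_bits(x, planes):
    """Map a real-valued vector to bits: bit i is 1 if <plane_i, x> >= 0, else 0."""
    return [1 if sum(w * v for w, v in zip(plane, x)) >= 0.0 else 0
            for plane in planes]

def hamming(a, b):
    """Number of positions in which two bit vectors differ."""
    return sum(p != q for p, q in zip(a, b))

# Hypothetical example inputs: a vector, a slightly perturbed copy, and its negation.
planes = make_projection(num_bits=64, dim=8, seed=42)
x = [0.9, -0.2, 0.4, 0.1, -0.7, 0.3, 0.0, 0.5]
x_near = [v + 0.01 for v in x]
x_far = [-v for v in x]

bits_x = project_to_bits(x, planes)
bits_near = project_to_bits(x_near, planes)
bits_far = project_to_bits(x_far, planes)

print(hamming(bits_x, bits_near), hamming(bits_x, bits_far))
```

Similar inputs end up with similar bit patterns, while the 64-bit code needs only 8 bytes of storage. This is the kind of property that makes bit-space representations attractive under tight memory constraints.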

 

Prefetching – Predicting what will be most likely needed next

A very interesting paper has just been published about prefetching, which is a nice application of machine learning: predicting which information will most likely be needed next, so that it can be prepared in advance:

Milad Hashemi, Kevin Swersky, Jamie A Smith, Grant Ayers, Heiner Litz, Jichuan Chang, Christos Kozyrakis & Parthasarathy Ranganathan 2018. Learning Memory Access Patterns. arXiv preprint arXiv:1803.02329.

Prefetching is the process of using past history to predict future memory accesses that will miss in the on-chip cache and access memory. Each of these memory addresses is generated by a memory instruction (a load/store). Memory instructions are the subset of all instructions that interact with the addressable memory of the computer system.
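The following sketch shows the prediction problem in its simplest form. It is explicitly not the learned model from the paper, but a much simpler table-based stand-in: it records which address delta tends to follow the previous delta and predicts the next address accordingly. The class name `DeltaPrefetcher` and the example traces are assumptions made for illustration.

```python
from collections import Counter, defaultdict

class DeltaPrefetcher:
    """Predict the next address from the delta that most often followed the last delta."""

    def __init__(self):
        self.table = defaultdict(Counter)  # previous delta -> counts of the next delta
        self.last_addr = None
        self.last_delta = None

    def observe(self, addr):
        """Record one memory access and update the delta history table."""
        if self.last_addr is not None:
            delta = addr - self.last_addr
            if self.last_delta is not None:
                self.table[self.last_delta][delta] += 1
            self.last_delta = delta
        self.last_addr = addr

    def predict(self):
        """Return the address to prefetch, or None if there is not enough history."""
        counts = self.table.get(self.last_delta)
        if not counts:
            return None
        next_delta, _ = counts.most_common(1)[0]
        return self.last_addr + next_delta

# Hypothetical access trace with a constant stride of 8.
strided = DeltaPrefetcher()
for address in [100, 108, 116, 124, 132]:
    strided.observe(address)
print(strided.predict())

# Hypothetical trace whose deltas alternate between 4 and 8.
alternating = DeltaPrefetcher()
for address in [0, 4, 12, 16, 24, 28]:
    alternating.observe(address)
print(alternating.predict())
```

Working on deltas rather than raw addresses keeps the table small, because regular access patterns repeat the same few deltas even when the absolute addresses never repeat.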

 

There is a nice article in the MIT Technology Review by Will Knight from March 8, 2018 on how computers, similar to humans, could improve their behaviour with age – a very nice read:

https://www.technologyreview.com/s/610453/your-next-computer-could-improve-with-age/?set=

Python in Machine Learning still No. 1 and increasing

There is of course no such thing as a ‘best language for machine learning’ – but as a matter of fact Python is still No. 1 and increasing:
Image Source: https://stackoverflow.blog/2017/09/06/incredible-growth-python/

We use Python in all our courses because it is an “industrial standard” and widely available. I would love to use, e.g., Julia, which is much faster, but it remains rather academic and needs a lot of additional effort. It is not astonishing that Python is the most popular tool worldwide for machine learning and artificial intelligence, as many libraries and frameworks are available, including TensorFlow, Pandas, NumPy, PyBrain, scikit-learn, SimpleAI, EasyAI, etc.

Consequently, we teach Python in our courses; have a look at:

Marcus D. Bloice & Andreas Holzinger 2016. A Tutorial on Machine Learning and Data Science Tools with Python. In: Holzinger, Andreas (ed.) Machine Learning for Health Informatics, Lecture Notes in Artificial Intelligence LNAI 9605. Heidelberg: Springer, pp. 437-483, doi:10.1007/978-3-319-50478-0_22.

iML with the human-in-the-loop mentioned among 10 coolest applications of machine learning

Within the “Two Minute Papers” series, Károly Zsolnai-Fehér from the Institute of Computer Graphics and Algorithms at the Vienna University of Technology mentions our human-in-the-loop paper among “10 even cooler Deep Learning Applications”:

Seid Muhie Yimam, Chris Biemann, Ljiljana Majnaric, Šefket Šabanović & Andreas Holzinger 2016. An adaptive annotation approach for biomedical entity and relation recognition. Springer/Nature: Brain Informatics, 3, (3), 157-168, doi:10.1007/s40708-016-0036-4

Watch the video here (iML is mentioned from approx. 1:20):

Here is the list of all 10 papers discussed in this two-minute video:

1. Geolocation – http://arxiv.org/abs/1602.05314
2. Super-resolution – http://arxiv.org/pdf/1511.04491v1.pdf
3. Neural Network visualizer – http://experiments.mostafa.io/public/…
4. Recurrent neural network for sentence completion:
5. Human-in-the-loop and Doctor-in-the-loop: https://link.springer.com/article/10.1007/s40708-016-0036-4
6. Emoji suggestions for images – https://emojini.curalate.com/
7. MNIST handwritten numbers in HD – http://blog.otoro.net/2016/04/01/generating-large-images-from-latent-vectors
8. Deep Learning solution to the Netflix prize – https://karthkk.wordpress.com/2016/03/22/deep-learning-solution-for-netflix-prize/
9. Curating works of art –
10. More robust neural networks against adversarial examples – http://cs231n.stanford.edu/reports201…
The Keras library: http://keras.io/

A) The basic principle of the iML human-in-the-loop approach:

Andreas Holzinger 2016. Interactive Machine Learning for Health Informatics: When do we need the human-in-the-loop? Brain Informatics, 3, (2), 119-131, doi:10.1007/s40708-016-0042-6

B) The entry in the GI Lexikon:
https://gi.de/informatiklexikon/interactive-machine-learning-iml

C) The experimental proof-of-concept:

Andreas Holzinger, Markus Plass, Katharina Holzinger, Gloria Cerasela Crisan, Camelia-M. Pintea & Vasile Palade 2017. A glass-box interactive machine learning approach for solving NP-hard problems with the human-in-the-loop. arXiv:1708.01104.

D) Outline and Survey of application possibilities:

Andreas Holzinger, Chris Biemann, Constantinos S. Pattichis & Douglas B. Kell 2017. What do we need to build explainable AI systems for the medical domain? arXiv:1712.09923.

Andreas Holzinger, Bernd Malle, Peter Kieseberg, Peter M. Roth, Heimo Müller, Robert Reihs & Kurt Zatloukal 2017. Towards the Augmented Pathologist: Challenges of Explainable-AI in Digital Pathology. arXiv:1712.06657.

 

NIPS-2017 Best paper “Explainability was one of the major reasons the paper was given the award”

Congratulations to Arthur GRETTON from the Gatsby Computational Neuroscience Unit at University College London and his team. Their paper titled “A Linear-Time Kernel Goodness-of-Fit Test”, authored by Wittawat JITKRITTUM, Wenkai XU, Zoltan SZABO, Kenji FUKUMIZU and Arthur GRETTON, won the prestigious NIPS 2017 best paper award. In the interview with Sam Charrington from TWiML&AI, the authors said at 14:10 in the following video that “… explainability was one of the reasons that the paper was given the award …”. Listen here:

Here is the original talk:

Algorithms

Live from NIPS 2017, presentations from the Algorithms session:
• A Linear-Time Kernel Goodness-of-Fit Test
• Generalization Properties of Learning with Random Features
• Communication-Efficient Distributed Learning of Discrete Distributions
• Optimistic posterior sampling for reinforcement learning: worst-case regret bounds
• Regret Analysis for Continuous Dueling Bandit
• Minimal Exploration in Structured Stochastic Bandits
• Fast Rates for Bandit Optimization with Upper-Confidence Frank-Wolfe
• Diving into the shallows: a computational perspective on large-scale shallow learning
• Monte-Carlo Tree Search by Best Arm Identification
• A framework for Multi-A(rmed)/B(andit) Testing with Online FDR Control
• Parameter-Free Online Learning via Model Selection
• Bregman Divergence for Stochastic Variance Reduction: Saddle-Point and Adversarial Prediction
• Gaussian Quadrature for Kernel Features
• Learning Linear Dynamical Systems via Spectral Filtering

Posted by Neural Information Processing Systems on Tuesday, 5 December 2017

 

http://papers.nips.cc/paper/6630-a-linear-time-kernel-goodness-of-fit-test

In their paper the authors propose a novel adaptive test of goodness-of-fit, with computational cost linear in the number of samples. They learn the test features that best indicate the differences between the observed samples and a reference model by minimizing the false negative rate. These features are constructed via Stein’s method, which means that it is not necessary to compute the normalising constant of the model. They further analyse the asymptotic Bahadur efficiency of the new test, and prove that under a mean-shift alternative, the test always has greater relative efficiency than a previous linear-time kernel test, regardless of the choice of parameters for that particular test. In experiments, the performance of their method exceeds that of the earlier linear-time test, and matches or exceeds the power of a quadratic-time kernel test. In high dimensions and where model structure may be exploited, this new goodness-of-fit test performs far better than a quadratic-time two-sample test based on the Maximum Mean Discrepancy, with samples drawn from the model.
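Why Stein's method avoids the normalising constant can be seen in a small one-dimensional sketch. Note that this is not the authors' test with learned features: it is a simplified linear-time statistic that averages a Stein-modified Gaussian kernel over disjoint pairs of samples. The reference model N(0, 1), the kernel bandwidth, the sample sizes and all function names are assumptions for illustration. The model enters only through its score function, the derivative of the log density, in which any constant factor of the density cancels.

```python
import math
import random

def score_std_normal(x):
    """d/dx log p(x) for p = N(0, 1); the normalising constant drops out."""
    return -x

def stein_kernel(x, y, score, sigma=1.0):
    """Stein-modified Gaussian kernel h_p(x, y) in one dimension."""
    d = x - y
    k = math.exp(-d * d / (2.0 * sigma ** 2))
    dk_dx = -d / sigma ** 2 * k
    dk_dy = d / sigma ** 2 * k
    d2k_dxdy = (1.0 / sigma ** 2 - d * d / sigma ** 4) * k
    return (score(x) * score(y) * k + score(x) * dk_dy
            + score(y) * dk_dx + d2k_dxdy)

def linear_time_stein_stat(samples, score, sigma=1.0):
    """Average h_p over disjoint consecutive pairs: a single pass, O(n) cost."""
    pairs = len(samples) // 2
    total = sum(stein_kernel(samples[2 * i], samples[2 * i + 1], score, sigma)
                for i in range(pairs))
    return total / pairs

rng = random.Random(0)
from_model = [rng.gauss(0.0, 1.0) for _ in range(4000)]  # samples that fit the model
shifted = [rng.gauss(2.0, 1.0) for _ in range(4000)]     # mean-shift alternative

stat_null = linear_time_stein_stat(from_model, score_std_normal)
stat_alt = linear_time_stein_stat(shifted, score_std_normal)
print(stat_null, stat_alt)
```

For samples that really come from the model, the statistic fluctuates around zero; under the mean shift it becomes clearly positive. A full test would additionally need a calibrated rejection threshold, which is omitted here.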

The original paper can be downloaded via the NIPS pages:
https://nips.cc/Conferences/2017/Schedule?showEvent=8823

The paper is also available at arXiv:

Jitkrittum, W., Xu, W., Szabo, Z., Fukumizu, K. & Gretton, A. 2017. A Linear-Time Kernel Goodness-of-Fit Test. arXiv preprint arXiv:1705.07673.

 

People and Artificial Intelligence Research (PAIR) Initiative

We experience enormous advances in AI and ML (see here for the difference), with impressive, daily visible improvements in technical performance, particularly in speech recognition, deep learning from images, autonomous driving, etc.

It is really great that the Google Brain team led by Jeff Dean and the Google initiative People and Artificial Intelligence Research (PAIR) support people-centric AI systems. They are interested in augmenting human interaction with machine intelligence and foster a humanistic approach to artificial intelligence, towards making partnerships between people and AI productive, enjoyable and fair.

See: https://ai.google/pair

This perfectly supports our HCI-KDD approach [1] generally, and specifically our interactive Machine Learning (iML) approach with a human in the loop [2]. The basic idea of augmenting human intelligence with artificial intelligence can foster trust [6], causal reasoning, explainability and re-traceability [5] – which is of utmost importance in the medical domain [4], [3].

[1]          Andreas Holzinger 2013. Human–Computer Interaction and Knowledge Discovery (HCI-KDD): What is the benefit of bringing those two fields to work together? In: Cuzzocrea, Alfredo, Kittl, Christian, Simos, Dimitris E., Weippl, Edgar & Xu, Lida (eds.) Multidisciplinary Research and Practice for Information Systems, Springer Lecture Notes in Computer Science LNCS 8127. Heidelberg, Berlin, New York: Springer, pp. 319-328, doi:10.1007/978-3-642-40511-2_22.

[2]          Andreas Holzinger 2016. Interactive Machine Learning for Health Informatics: When do we need the human-in-the-loop? Brain Informatics, 3, (2), 119-131, doi:10.1007/s40708-016-0042-6.

[3]          Andreas Holzinger, Chris Biemann, Constantinos S. Pattichis & Douglas B. Kell 2017. What do we need to build explainable AI systems for the medical domain? arXiv:1712.09923.

[4]          Andreas Holzinger, Bernd Malle, Peter Kieseberg, Peter M. Roth, Heimo Müller, Robert Reihs & Kurt Zatloukal 2017. Towards the Augmented Pathologist: Challenges of Explainable-AI in Digital Pathology. arXiv:1712.06657.

[5]          Andreas Holzinger, Markus Plass, Katharina Holzinger, Gloria Cerasela Crisan, Camelia-M. Pintea & Vasile Palade 2017. A glass-box interactive machine learning approach for solving NP-hard problems with the human-in-the-loop. arXiv:1708.01104.

[6]          Katharina Holzinger, Klaus Mak, Peter Kieseberg & Andreas Holzinger 2018. Can we trust Machine Learning Results? Artificial Intelligence in Safety-Critical decision Support. ERCIM News, 112, (1), 42-43.

 

What is the difference between AI and ML?

What is the difference between Artificial Intelligence and Machine Learning?

My students repeatedly ask the question: “What is the difference between Artificial Intelligence (AI) and Machine Learning (ML) – and does deep learning (DL) belong to AI or to ML?”. In the following I provide I) a brief answer, II) a formal short answer and III) a more elaborate answer:

I) Brief answer: It is the same and it is different. Deep Learning can also belong to both, and both are necessary. This explains well why HCI-KDD is so enormously important: Human–Computer Interaction (HCI) deals mainly with aspects of human perception, human cognition, human intelligence, sense-making and the interaction between human and machine. Knowledge Discovery from Data (KDD) deals mainly with machine intelligence, and with the development of algorithms for automatic and interactive data mining [1].

II) A formal short answer:

Deep Learning is part of Machine Learning, which is part of Artificial Intelligence:

$DL \subset ML \subset AI$

This follows the popular Deep Learning book by Ian Goodfellow, Yoshua Bengio and Aaron Courville published by MIT Press 2016 [2]:
http://www.deeplearningbook.org/contents/intro.html

and here is the explanation:

III) A more elaborated answer:

Artificial Intelligence (AI) is the field working on understanding intelligence. The motto of Google DeepMind is “understand intelligence – then understand everything else” (Demis HASSABIS). Consequently, the study of human intelligence is of utmost importance for understanding machine intelligence. The long-term goal of AI is general intelligence (“strong AI”). AI has a strong connection to cognitive science and is a very old scientific field. After a first hype between 1950 and 1980 and a subsequent AI winter, it has regained hype status because of the practical success of machine learning and, very recently, particularly of deep learning (although its roots go back to the early days of AI, e.g. [3]). Recently, DARPA described this well (DARPA Perspective on Artificial Intelligence by John LAUNCHBURY – an excellent video, which I highly recommend my students watch):

According to DARPA there are three waves of AI:

The first wave is a kind of programmed ability to process information, i.e. engineers handcrafted a set of rules to represent knowledge in (narrowly) well-defined domains. The structure of this knowledge is defined by human experts and specifics in the domain are explored by computers.

The second wave of AI is the success of statistical/probabilistic learning, i.e. engineers create statistical models for specific problem domains and train them, preferably on very big data sets. (BTW: John LAUNCHBURY emphasizes the importance of geometric models for machine learning, e.g. manifolds in topological data analysis – exactly what we foster, see CD-MAKE Topology and this beautiful recent article by Massimo FERRI.) Currently neural networks (deep learning) show tremendously interesting successes (see e.g. a recent work from our own group [4]).

The future third wave will have to focus on explainable AI, i.e. contextual adaptation, and make models able to explain how an algorithm came to a decision (see my post on transparency and trust in machine learning, our recent paper [7], and our iML project page). In essence, ALL three waves are necessary in the future, and the combination of various methods promises success!

Machine Learning (ML) is a very practical field and deals with applying artificial intelligence for the design and development of algorithms that can learn from data, to gain knowledge from experience and improve their learning behaviour over time – for more details please refer to [5].

Whilst AI is the broader foundation and encompasses all underlying scientific theories of human learning vs. machine learning, ML itself is a very practical field with countless practical applications – the introduction by Sebastian Thrun (Stanford) and Katie Malone (Moderator at Linear Digressions) brings this beautifully to the point and makes clear how important machine learning is for business:

 

Deep Learning (DL) is one methodological family of ML based on, e.g., artificial neural networks (ANN), deep belief networks, or recurrent neural networks. A precise example of a feed-forward ANN is the multilayer perceptron (MLP), which is a very simple mathematical function mapping a set of input data to output data. The underlying concept is representation learning: complex representations are expressed in terms of other, simpler representations. Maybe this is how our brain works [6], but we do not know yet.
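That an MLP is “just” a function from inputs to outputs can be shown in a few lines of plain Python. The sketch below is a minimal forward pass with hand-chosen (not learned) weights, which are assumptions picked so that the network computes XOR; the hidden layer forms a simpler intermediate representation from which the output layer builds the final answer.

```python
def relu(z):
    """Rectified linear unit: pass positive values, clip negatives to zero."""
    return max(0.0, z)

def identity(z):
    return z

def dense(inputs, weights, biases, activation):
    """One fully connected layer: activation(W x + b), one weight row per neuron."""
    return [activation(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

def mlp_forward(x, layers):
    """Feed x through each (weights, biases, activation) layer in turn."""
    for weights, biases, activation in layers:
        x = dense(x, weights, biases, activation)
    return x

# Hand-chosen weights (an assumption for illustration, not learned from data):
# hidden h1 = relu(x1 + x2), h2 = relu(x1 + x2 - 1); output = h1 - 2 * h2.
xor_layers = [
    ([[1.0, 1.0], [1.0, 1.0]], [0.0, -1.0], relu),
    ([[1.0, -2.0]], [0.0], identity),
]

for a in (0.0, 1.0):
    for b in (0.0, 1.0):
        print(a, b, mlp_forward([a, b], xor_layers))
```

In practice the weights are of course not set by hand but learned from data, e.g. with backpropagation; the function being evaluated stays the same.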

A nice example is the recognition of a cat (at 2m11s):

However, this immediately lets us understand the huge shortcomings of these approaches: While these algorithms nicely recognize a cat, they cannot explain why it is a cat. The algorithm is unable to explain why it came to this conclusion. Consequently, the next level of machine learning and artificial intelligence is explainable AI, see transparency.

A final note to my students: computational intelligence (you may call it either Artificial Intelligence (AI) or Machine Learning (ML)) may help to solve problems, particularly in areas where humans have limited capacities (e.g. in high dimensional spaces, large numbers, big data, etc.); however, we must acknowledge that the problem-solving capacity of the human mind is still unbeaten in certain aspects (e.g. in the lower dimensions, little data, complex problems, etc.). An effective strategy for finding solutions for data-intensive problems is therefore the combination of our two areas: Human–Computer Interaction (HCI) and Knowledge Discovery (KDD).

A proverb attributed perhaps incorrectly to Albert Einstein (many proverbs are attributed to famous persons to make them appealing) illustrates this perfectly: “Computers are incredibly fast, accurate, but stupid. Humans are incredibly slow, inaccurate, but brilliant. Together they may be powerful beyond imagination”. Consequently, the novel approach to combine HCI & KDD in order to enhance human intelligence by computational intelligence fits perfectly to AI and ML together [1].

References:

[1]          Holzinger, A. 2013. Human–Computer Interaction and Knowledge Discovery (HCI-KDD): What is the benefit of bringing those two fields to work together? In: Cuzzocrea, Alfredo, Kittl, Christian, Simos, Dimitris E., Weippl, Edgar & Xu, Lida (eds.) Multidisciplinary Research and Practice for Information Systems, Springer Lecture Notes in Computer Science LNCS 8127. Heidelberg, Berlin, New York: Springer, pp. 319-328, doi:10.1007/978-3-642-40511-2_22.

[2]          Goodfellow, I., Bengio, Y. & Courville, A. 2016. Deep Learning, Cambridge (MA), MIT Press.

[3]          McCulloch, W. S. & Pitts, W. 1943. A logical calculus of the ideas immanent in nervous activity. Bulletin of Mathematical Biophysics, 5, (4), 115-133, doi:10.1007/BF02459570.

[4]          Singh, D., Merdivan, E., Psychoula, I., Kropf, J., Hanke, S., Geist, M. & Holzinger, A. 2017. Human Activity Recognition Using Recurrent Neural Networks. In: Holzinger, Andreas, Kieseberg, Peter, Tjoa, A. Min & Weippl, Edgar (eds.) Machine Learning and Knowledge Extraction: Lecture Notes in Computer Science LNCS 10410. Cham: Springer International Publishing, pp. 267-274, doi:10.1007/978-3-319-66808-6_18.

[5]          Holzinger, A. 2017. Introduction to Machine Learning and Knowledge Extraction (MAKE). Machine Learning and Knowledge Extraction, 1, (1), 1-20, doi:10.3390/make1010001.

[6]          Hinton, G. E. & Shallice, T. 1991. Lesioning an attractor network: Investigations of acquired dyslexia. Psychological review, 98, (1), 74.

[7]   Holzinger, A., Plass, M., Holzinger, K., Crisan, G.C., Pintea, C.-M. & Palade, V. 2017. A glass-box interactive machine learning approach for solving NP-hard problems with the human-in-the-loop. arXiv:1708.01104.

Digital Pathology: The world’s fastest whole slide scanner is now working in Graz

26.10.2017. Today, Prof. Kurt Zatloukal and his group, together with the digital pathology team of 3DHISTECH, our industrial partner, completed the installation of the new generation panoramic P1000 scanner. The world’s fastest whole slide image (WSI) scanner is now located in Graz. The new scanner outperforms current state-of-the-art systems by a factor of 6, which provides enormous opportunities for our MAKEpatho project.

Digital Pathology and Artificial Intelligence/Machine Learning

Digital pathology [1] is not just the transformation of the classical microscopic analysis of histopathological slides by pathologists to a digital visualization. Digital Pathology is an innovation that will dramatically change medical workflows in the coming years. At the center is Whole Slide Imaging (WSI), but the true added value will result from a combination of heterogeneous data sources. This will generate a new kind of information not yet available today. Much information is hidden in arbitrarily high dimensional spaces and not accessible to a human. Consequently, we need novel approaches from artificial intelligence (AI) and machine learning (ML) (see definition) in Digital Pathology [2]. The goal is to gain knowledge from this information, which is not yet available and not exploited to date [3].

Digital Pathology opportunities

Major changes enabled by digital pathology include the improvement of medical decision making, new opportunities for education and research, and the globalization of diagnostic services. The latter allows bringing top-level expertise to essentially any patient in the world via the Internet/Web. This will also generate totally new business models for worldwide diagnostic services. Furthermore, by using AI/ML we can make new information from images accessible and quantifiable (e.g. through geometrical approaches and machine learning), which is not yet available in current diagnostics. Another effect will be that digital pathology and machine learning will change the education and training systems, which is an urgently needed solution to address the global shortage of medical specialists. While the digitalization is called Pathology 2.0 [4], we envision a Pathology 4.0 – and here explainable AI will become important.

3DHISTECH

3DHISTECH Ltd. (the name is derived from “Three-dimensional Histological Technologies”) is a leading company that has been developing high-performance hardware and software products for digital pathology since 1996. As the first European manufacturer, 3DHISTECH is one of the market leaders in the world with more than 1500 sold systems. Being one of the pioneers in this field, 3DHISTECH develops and manufactures high-speed digital slide scanners that create high-quality bright field and fluorescent digital slides, digital histology software and tissue microarray machinery. 3DHISTECH’s aim is to fully digitalize the traditional pathology workflow so that it can adapt to the ever-growing demands of healthcare today. Furthermore, educational programs are also organized to help pathologists learn and master these new technologies more easily.

[1]  Shaimaa Al‐Janabi, Andre Huisman & Paul J. Van Diest (2012). Digital pathology: current status and future perspectives. Histopathology, 61, (1), 1-9, doi:10.1111/j.1365-2559.2011.03814.x.

[2] Anant Madabhushi & George Lee (2016). Image analysis and machine learning in digital pathology: Challenges and opportunities. Medical Image Analysis, 33, 170-175, doi:10.1016/j.media.2016.06.037.

[3]  Andreas Holzinger, Bernd Malle, Peter Kieseberg, Peter M. Roth, Heimo Müller, Robert Reihs & Kurt Zatloukal (2017). Machine Learning and Knowledge Extraction in Digital Pathology needs an integrative approach. In: Springer Lecture Notes in Artificial Intelligence Volume LNAI 10344. Cham: Springer International, pp. 13-50. 10.1007/978-3-319-69775-8_2  [pdf-preprint available here]

[4]  Nikolas Stathonikos, Mitko Veta, André Huisman & Paul J Van Diest (2013). Going fully digital: Perspective of a Dutch academic pathology lab. Journal of pathology informatics, 4. doi:  10.4103/2153-3539.114206

[5] Francesca Demichelis, Mattia Barbareschi, P Dalla Palma & S Forti 2002. The virtual case: a new method to completely digitize cytological and histological slides. Virchows Archiv, 441, (2), 159-164. https://doi.org/10.1007/s00428-001-0561-1

[6] Marcus Bloice, Klaus-Martin Simonic & Andreas Holzinger 2013. On the usage of health records for the design of virtual patients: a systematic review. BMC Medical Informatics and Decision Making, 13, (1), 103, doi:10.1186/1472-6947-13-103.

[7] http://www.3dhistech.com

[8]  http://pathologie.medunigraz.at/forschung/forschungslabor-fuer-experimentelle-zellforschung-und-onkologie

Mini Glossary:

Digital Pathology = is not only the conversion of histopathological slides into a digital image (WSI) that can be uploaded to a computer for storage and viewing, but a complete new medical work procedure (from Pathology 2.0 to Pathology 4.0) – the basis is Virtual Microscopy.

Explainability = motivated by the lacking transparency of black-box approaches, which do not foster trust and acceptance of AI generally and ML specifically among end-users. Rising legal and privacy aspects, e.g. the new European General Data Protection Regulation (which comes into effect in May 2018), will make black-box approaches difficult to use, because they are often not able to explain why a decision has been made (see explainable AI).

Explainable AI = rising legal and ethical aspects make it mandatory to enable a human to understand why a machine decision has been made, i.e. to make machine decisions re-traceable and to explain why a decision has been made [see Wikipedia on Explainable Artificial Intelligence]. (Note: this does not mean that it is always necessary to explain everything – but to be able to explain it if necessary, e.g. for general understanding, for teaching, for learning, for research – or in court!)

Machine Aided Pathology = is the management, discovery and extraction of knowledge from a virtual case, driven by advances of digital pathology supported by feature detection and classification algorithms.

Virtual Case = the set of all histopathological slides of a case together with meta data from the macro pathological diagnosis [5]

Virtual microscopy = not only the viewing of slides on a computer screen over a network; it can be enhanced by supporting the pathologist with the equivalent optical resolution and magnification of a microscope whilst changing the magnification; machine learning and AI methods can help to extract new knowledge out of the image data

Virtual Patient = has very different definitions (see [6]), we define it as a model of electronic records (images, reports, *omics) for studying e.g. diseases.

WSI = Whole Slide Image, a.k.a. virtual slide, is a digitized histopathology glass slide that has been created on a slide scanner and represents a high-resolution volume data cube which can be handled via a virtual microscope. Most of all, methods from artificial intelligence generally, and interactive machine learning specifically, together with methods from topological data analysis, can make information accessible to a human pathologist which would otherwise be hidden.

WSS = Whole Slide Scanner is the machinery for taking WSI including the hardware and the software for creating a WSI.

Transparency & Trust in Machine Learning: Making AI interpretable and explainable

A huge motivation for us in continuing to study interactive Machine Learning (iML) [1] – with a human in the loop [2] (see our project page) is that modern deep learning models are often considered to be “black-boxes” [3]. A further drawback is that such models have no explicit declarative knowledge representation, hence have difficulty in generating the required explanatory structures – which considerably limits the achievement of their full potential [4].

Even if we understand the mathematical theories behind the machine model, it is still complicated to get insight into the internal workings of that model. Hence black-box models lack transparency, and consequently we raise the question: “Can we trust our results?”

In fact: “Can we explain how and why a result was achieved?” A classic example is the question “Which objects are similar?”, but an even more interesting question would be to answer “Why are those objects similar?”
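The difference between the two questions can be illustrated with a minimal sketch. It assumes that objects are described by named numeric features and that similarity is measured by cosine similarity; the feature names and values are made up for illustration and are not taken from any of our projects. The score alone answers “Which objects are similar?”, while the per-feature breakdown gives a first, very simple answer to “Why?”.

```python
import math


def cosine_similarity_with_explanation(a, b, feature_names):
    """Return (similarity, contributions).

    contributions maps each feature name to its additive share of the
    cosine similarity, so the shares sum up to the similarity itself.
    """
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    contributions = {
        name: (x * y) / (norm_a * norm_b)
        for name, x, y in zip(feature_names, a, b)
    }
    return sum(contributions.values()), contributions


# Hypothetical feature vectors for two objects (made-up numbers).
features = ["cell_density", "nuclear_size", "stain_intensity"]
sample_a = [0.9, 0.1, 0.4]
sample_b = [0.8, 0.2, 0.1]

similarity, contributions = cosine_similarity_with_explanation(
    sample_a, sample_b, features
)

# "Which objects are similar?" -> the score.
print(f"similarity = {similarity:.3f}")

# "Why are they similar?" -> the features ranked by their contribution.
ranked = sorted(contributions.items(), key=lambda item: item[1], reverse=True)
for name, share in ranked:
    print(f"  {name}: {share:+.3f}")
```

Such an additive breakdown only works because cosine similarity is a sum over features; for learned, non-linear representations no comparably direct decomposition is available, which is exactly where the black-box problem starts.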

We believe that there is a growing demand for machine learning approaches that are not only well performing, but also transparent, interpretable and trustworthy. We are currently working on methods and models to reenact the machine decision-making process, and to reproduce and comprehend the learning and knowledge extraction process. This is important because, for decision support, it is necessary to understand the causality of learned representations [5], [6]. If human intelligence is complemented by machine learning, and at least in some cases even overruled, humans must still be able to understand the machine decision process and, most of all, to influence it interactively. This needs context awareness and sensemaking to close the gap between human thinking and machine “thinking”.

A further strong motivation for this approach comes from rising legal and privacy requirements: the new European General Data Protection Regulation (GDPR and ISO/IEC 27001), which becomes enforceable on May 25, 2018, will make black-box approaches difficult to use in business, because they are not able to explain why a decision has been made.

This will stimulate research in this area with the goal of making decisions interpretable, comprehensible and reproducible. In health informatics, for example, this is useful not only for machine learning research and clinical decision making, but is at the same time a big asset for the training of medical students.

The General Data Protection Regulation (GDPR) (Regulation (EU) 2016/679) is a regulation by which the European Parliament, the Council of the European Union and the European Commission intend to strengthen and unify data protection for all individuals within the European Union (EU). It also addresses the export of personal data outside of the EU (this will affect data-centric projects between the EU and e.g. the US). The GDPR aims primarily to give citizens and residents back control over their personal data and to simplify the regulatory environment for international business by unifying the regulation within the EU. The GDPR replaces the Data Protection Directive (95/46/EC) of 1995. The regulation was adopted on 27 April 2016 and becomes enforceable from 25 May 2018, after a two-year transition period. Unlike a directive, it does not require national governments to pass any enabling legislation and is thus directly binding, which affects practically all data-driven businesses and particularly machine learning and AI technology. Note that the “right to be forgotten” [7] established by the European Court of Justice has been extended to become a “right to erasure”: it will no longer be sufficient to remove a person’s data from search results when requested to do so; data controllers must now erase that data. However, if the data is encrypted, it may be sufficient to destroy the encryption keys rather than go through the prolonged process of ensuring that the data has been fully erased [8].
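The idea of erasing data by destroying its key can be sketched in a few lines. This is a toy illustration only: the `KeyStore` class and the subject identifier are hypothetical, and the one-time-pad style XOR stands in for the vetted, authenticated encryption and proper key management a real system would use. Whether key destruction actually satisfies an erasure request is a legal question that this sketch does not answer.

```python
import secrets


class KeyStore:
    """Hypothetical in-memory key store: one random key per data subject."""

    def __init__(self):
        self._keys = {}

    def create_key(self, subject_id, length):
        self._keys[subject_id] = secrets.token_bytes(length)
        return self._keys[subject_id]

    def get_key(self, subject_id):
        # Returns None once the key has been destroyed.
        return self._keys.get(subject_id)

    def destroy_key(self, subject_id):
        # Note: dropping the reference does not securely wipe memory.
        self._keys.pop(subject_id, None)


def xor_bytes(data, key):
    """One-time-pad style XOR; the key must be as long as the data."""
    if len(key) != len(data):
        raise ValueError("key and data must have the same length")
    return bytes(d ^ k for d, k in zip(data, key))


keys = KeyStore()
record = b"hypothetical patient record"

# Only the ciphertext is written to storage, backups and replicas.
ciphertext = xor_bytes(record, keys.create_key("subject-1", len(record)))

# While the key exists, the record can be recovered.
recovered = xor_bytes(ciphertext, keys.get_key("subject-1"))

# Erasure request: destroy the key instead of hunting down every copy.
keys.destroy_key("subject-1")
key_after_erasure = keys.get_key("subject-1")
```

After `destroy_key`, every stored copy of the ciphertext is unreadable at once, which is the appeal of the approach when data has been replicated into many backups.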

A recent, and very interesting discussion with Daniel S. WELD (Artificial Intelligence, Crowdsourcing, Information Extraction) on Explainable AI can be found here:

In essence, the interview brings out that most machine learning models are very complicated: deep neural networks operate incredibly quickly, considering thousands of possibilities in seconds before making decisions. Dan Weld points out that “The human brain simply can’t keep up” and refers to the example of AlphaGo making an unexpected decision: it is not possible to understand why the algorithm made exactly that choice. Of course this may not be critical in a game, where no one gets hurt; however, deploying intelligent machines that we cannot understand could set a dangerous precedent, e.g. in our domain, health informatics. According to Dan Weld, understanding and trusting machines is “the key problem to solve” in AI safety, security, data protection and privacy, and it is urgently necessary. He further explains, “Since machine learning is nowadays at the core of pretty much every AI success story, it’s really important for us to be able to understand what is it that the machine learned.” When a machine learning system is confronted with a “known unknown,” it may recognize its uncertainty about the situation in the given context. However, when it encounters an unknown unknown, it won’t even recognize that the situation is uncertain: the system will have extremely high confidence that its result is correct, yet it will still be wrong. Dan pointed to the example of classifiers “trained on data that had some regularity in it that’s not reflected in the real world”, which is a problem of having little or even no available training data (see [1]). The problem of “unknown unknowns” is definitely underestimated in the traditional AI community. Governments and businesses can’t afford to deploy highly intelligent AI systems that make unexpected, harmful decisions, especially if these systems are in safety-critical environments.
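The contrast between a “known unknown” and an “unknown unknown” can be made concrete with a deliberately tiny model. The sketch below is our own illustration, not something from the interview: it assumes a two-class nearest-centroid classifier whose centroids, class labels and test points are all made up, and it turns distances into probabilities with a softmax.

```python
import math

# Hypothetical class centroids "learned" from training data in 2-D.
CENTROIDS = {"benign": (0.0, 0.0), "malignant": (2.0, 0.0)}


def predict_proba(point):
    """Softmax over negative squared distances to each class centroid."""
    scores = {
        label: -((point[0] - cx) ** 2 + (point[1] - cy) ** 2)
        for label, (cx, cy) in CENTROIDS.items()
    }
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {label: math.exp(score - top) for label, score in scores.items()}
    total = sum(exps.values())
    return {label: value / total for label, value in exps.items()}


# Known unknown: a point between the classes; the model reports uncertainty.
near_boundary = predict_proba((1.0, 0.0))

# Unknown unknown: a point unlike anything near the centroids; the model
# has no notion of "I have never seen this" and is almost fully confident.
far_away = predict_proba((50.0, 30.0))

print("near boundary:", near_boundary)
print("far away:     ", far_away)
```

The point halfway between the centroids gets an even split, so the uncertainty is visible. The distant point is far from both classes, yet because it is slightly closer to one centroid the model assigns that class a probability of practically one. Nothing in the output signals that the input lies outside the data the model was built on, which is why a human in the loop remains valuable.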

 

References:

[1]          Holzinger, A. 2016. Interactive Machine Learning for Health Informatics: When do we need the human-in-the-loop? Brain Informatics, 3, (2), 119-131, doi:10.1007/s40708-016-0042-6.

[2]          Holzinger, A., Plass, M., Holzinger, K., Crisan, G. C., Pintea, C.-M. & Palade, V. 2017. A glass-box interactive machine learning approach for solving NP-hard problems with the human-in-the-loop. arXiv:1708.01104.

[3]          Lipton, Z. C. 2016. The mythos of model interpretability. arXiv preprint arXiv:1606.03490.

[4]          Bologna, G. & Hayashi, Y. 2017. Characterization of Symbolic Rules Embedded in Deep DIMLP Networks: A Challenge to Transparency of Deep Learning. Journal of Artificial Intelligence and Soft Computing Research, 7, (4), 265-286, doi:10.1515/jaiscr-2017-0019.

[5]          Pearl, J. 2009. Causality: Models, Reasoning, and Inference (2nd Edition), Cambridge, Cambridge University Press.

[6]          Gershman, S. J., Horvitz, E. J. & Tenenbaum, J. B. 2015. Computational rationality: A converging paradigm for intelligence in brains, minds, and machines. Science, 349, (6245), 273-278, doi:10.1126/science.aac6076.

[7]          Malle, B., Kieseberg, P., Schrittwieser, S. & Holzinger, A. 2016. Privacy Aware Machine Learning and the “Right to be Forgotten”. ERCIM News (special theme: machine learning), 107, (3), 22-23.

[8]          Kingston, J. 2017. Using artificial intelligence to support compliance with the general data protection regulation. Artificial Intelligence and Law, doi:10.1007/s10506-017-9206-9.

Links:

https://de.wikipedia.org/wiki/Datenschutz-Grundverordnung

https://en.wikipedia.org/wiki/General_Data_Protection_Regulation

http://eur-lex.europa.eu/legal-content/EN/TXT/HTML/?uri=CELEX:31995L0046

2016 ICML Workshop on Human Interpretability in Machine Learning (WHI 2016), New York, NY

http://googleblog.blogspot.com/2015/07/neon-prescription-or-rather-new.html

https://sites.google.com/site/nips2016interpretml

 

Interpretable Machine Learning Workshop

Andrew G Wilson, Jason Yosinski, Patrice Simard, Rich Caruana, William Herlands

https://nips.cc/Conferences/2017/Schedule?showEvent=8744

 

Journal “Artificial Intelligence and Law”

https://link.springer.com/journal/volumesAndIssues/10506

ISSN: 0924-8463 (Print) 1572-8382 (Online)

Mini Glossary:

AI = Artificial Intelligence, today often used interchangeably with machine learning (ML); the two are highly interrelated but not the same

Causality = a concept that extends from Greek philosophy to today’s neuropsychology; assumptions about the nature of causality may be shown to be functions of a previous event preceding a later event. Relevant reading on this is by Judea Pearl (2000 and 2009)

Explainability = an emerging fundamental topic in current AI research; answering, e.g., why a decision has been made

Etiology = in medicine (many) factors coming together to cause an illness (see causality)

Interpretability = there is no formal technical definition yet, but it is considered a prerequisite for trust

Transparency = the opposite of the opacity of black-box approaches; it connotes the ability to understand how a model works (that does not mean that it should always be understood, but that, in the case of necessity, it can be re-enacted)