Posts

August 25-28, 2020, Machine Learning & Knowledge Extraction, LNCS 12279 published!

The proceedings of our CD-MAKE Machine Learning & Knowledge Extraction conference have been published as Lecture Notes in Computer Science (LNCS) 12279 and are available online:

https://link.springer.com/book/10.1007/978-3-030-57321-8

Content at a glance:

Explainable Artificial Intelligence: Concepts, Applications, Research Challenges and Visions
Luca Longo, Randy Goebel, Freddy Lecue, Peter Kieseberg, and Andreas Holzinger

The Explanation Game: Explaining Machine Learning Models Using Shapley Values
Luke Merrick and Ankur Taly

Back to the Feature: A Neural-Symbolic Perspective on Explainable AI
Andrea Campagner and Federico Cabitza

Explain Graph Neural Networks to Understand Weighted Graph Features in Node Classification
Xiaoxiao Li and João Saúde

Explainable Reinforcement Learning: A Survey
Erika Puiutta and Eric M. S. P. Veith

A Projected Stochastic Gradient Algorithm for Estimating Shapley Value Applied in Attribute Importance
Grah Simon and Thouvenot Vincent

Explaining Predictive Models with Mixed Features Using Shapley Values and Conditional Inference Trees
Annabelle Redelmeier, Martin Jullum, and Kjersti Aas

Explainable Deep Learning for Fault Prognostics in Complex Systems: A Particle Accelerator Use-Case
Lukas Felsberger, Andrea Apollonio, Thomas Cartier-Michaud, Andreas Müller, Benjamin Todd, and Dieter Kranzlmüller

eXDiL: A Tool for Classifying and eXplaining Hospital Discharge Letters
Fabio Mercorio, Mario Mezzanzanica, and Andrea Seveso

Cooperation Between Data Analysts and Medical Experts: A Case Study
Judita Rokošná, František Babič, Ljiljana Trtica Majnarić, and Ľudmila Pusztová

A Study on the Fusion of Pixels and Patient Metadata in CNN-Based Classification of Skin Lesion Images
Fabrizio Nunnari, Chirag Bhuvaneshwara, Abraham Obinwanne Ezema, and Daniel Sonntag

The European Legal Framework for Medical AI
David Schneeberger, Karl Stöger, and Andreas Holzinger

An Efficient Method for Mining Informative Association Rules in Knowledge Extraction
Parfait Bemarisika and André Totohasina

Interpretation of SVM Using Data Mining Technique to Extract Syllogistic Rules: Exploring the Notion of Explainable AI in Diagnosing CAD
Sanjay Sekar Samuel, Nik Nailah Binti Abdullah, and Anil Raj

Non-local Second-Order Attention Network for Single Image Super Resolution
Jiawen Lyn and Sen Yan

ML-ModelExplorer: An Explorative Model-Agnostic Approach to Evaluate and Compare Multi-class Classifiers
Andreas Theissler, Simon Vollert, Patrick Benz, Laurentius A. Meerhoff, and Marc Fernandes

Subverting Network Intrusion Detection: Crafting Adversarial Examples Accounting for Domain-Specific Constraints
Martin Teuffenbach, Ewa Piatkowska, and Paul Smith

Scenario-Based Requirements Elicitation for User-Centric Explainable AI: A Case in Fraud Detection
Douglas Cirqueira, Dietmar Nedbal, Markus Helfert, and Marija Bezbradica

On-the-fly Black-Box Probably Approximately Correct Checking of Recurrent Neural Networks
Franz Mayr, Ramiro Visca, and Sergio Yovine

Active Learning for Auditory Hierarchy
William Coleman, Charlie Cullen, Ming Yan, and Sarah Jane Delany

Improving Short Text Classification Through Global Augmentation Methods
Vukosi Marivate and Tshephisho Sefara

Interpretable Topic Extraction and Word Embedding Learning Using Row-Stochastic DEDICOM
Lars Hillebrand, David Biesner, Christian Bauckhage, and Rafet Sifa

A Clustering Backed Deep Learning Approach for Document Layout Analysis
Rhys Agombar, Max Luebbering, and Rafet Sifa

Calibrating Human-AI Collaboration: Impact of Risk, Ambiguity and Transparency on Algorithmic Bias
Philipp Schmidt and Felix Biessmann

Applying AI in Practice: Key Challenges and Lessons Learned
Lukas Fischer, Lisa Ehrlinger, Verena Geist, Rudolf Ramler, Florian Sobieczky, Werner Zellinger, and Bernhard Moser

Function Space Pooling for Graph Convolutional Networks
Padraig Corcoran

Analysis of Optical Brain Signals Using Connectivity Graph Networks
Marco Antonio Pinto-Orellana and Hugo L. Hammer

Property-Based Testing for Parameter Learning of Probabilistic Graphical Models
Anna Saranti, Behnam Taraghi, Martin Ebner, and Andreas Holzinger

An Ensemble Interpretable Machine Learning Scheme for Securing Data Quality at the Edge
Anna Karanika, Panagiotis Oikonomou, Kostas Kolomvatsos, and Christos Anagnostopoulos

Inter-space Machine Learning in Smart Environments
Amin Anjomshoaa and Edward Curry

The International Cross Domain Conference for MAchine Learning & Knowledge Extraction (CD-MAKE) is a joint effort of IFIP TC 5 (IT), TC 12 (Artificial Intelligence), IFIP WG 8.4 (E-Business), IFIP WG 8.9 (Information Systems), and IFIP WG 12.9 (Computational Intelligence) and is held in conjunction with the International Conference on Availability, Reliability and Security (ARES), see: 

https://www.ares-conference.eu/

The 4th conference is organized by University College Dublin, Ireland, and held as a virtual event due to the coronavirus pandemic. A few words about the International Federation for Information Processing (IFIP):

IFIP is the leading multi-national, non-governmental, apolitical organization in Information and Communications Technologies and Computer Sciences. It is recognized by the United Nations (UN) and was established in 1960 under the auspices of UNESCO as an outcome of the first World Computer Congress, held in Paris in 1959.

 

Good evidence for the importance of the HCI-KDD approach, worth 2.1 billion USD!

Our strategic aim is to find solutions for data-intensive problems by combining two areas that offer ideal preconditions for understanding intelligence and for bringing business value to AI: Human-Computer Interaction (HCI) and Knowledge Discovery (KDD). HCI deals with questions of human intelligence, whereas KDD deals with questions of artificial intelligence, in particular with the development of scalable algorithms for finding previously unknown relationships in data, and thus centers on automatic computational methods. A saying attributed, perhaps incorrectly, to Albert Einstein illustrates this perfectly: “Computers are incredibly fast, accurate, but stupid. Humans are incredibly slow, inaccurate, but brilliant. Together they may be powerful beyond imagination” [1].

An article published on February 18, 2018 by David Shaywitz [2] in Forbes reports on the recent purchase of the oncology data company Flatiron Health for the enormous sum of 2.1 billion USD (remember: DeepMind was purchased by Google for a mere 400 million GBP 😉).

This supports a few hypotheses of which I try to convince my students all the time (but they won’t believe me unless Google is doing it 😉):

a) those who can turn raw health data into insights and understandable knowledge can produce value
b) data, and particularly big data, is useless for decision makers; what they need is reliable, valuable and trustworthy information
c) for the complexity of sensemaking from health data we (still) need a human-in-the-loop: humans (still) exceed machine performance in understanding the context and explaining the underlying explanatory factors of the data
d) consequently this is a good example of the business value of our HCI-KDD approach: let the computer find in arbitrarily high-dimensional spaces what no human is able to find, but let the human do what no computer is able to do: BOTH working together are powerful beyond imagination!

Flatiron Health [3] is a company that specializes in health data curation, supported by technology of course, but mostly done manually by human experts in the Mechanical Turk style. Remark: The name Mechanical Turk has historic origins: it was inspired by a supposedly automatic 18th-century chess-playing machine built by Wolfgang von Kempelen, which beat, among others, Benjamin Franklin at chess and was acclaimed as “AI”. However, it was later revealed that it was neither a machine nor an automatic device; in fact, a human chess master was hidden in a secret space under the chessboard, controlling the movements of a humanoid dummy. Similarly, services which help to solve problems via human intelligence are called “Mechanical Turk online services”.

[1] Holzinger, A. 2013. Human–Computer Interaction and Knowledge Discovery (HCI-KDD): What is the benefit of bringing those two fields to work together? In: Cuzzocrea, Alfredo, Kittl, Christian, Simos, Dimitris E., Weippl, Edgar & Xu, Lida (eds.) Multidisciplinary Research and Practice for Information Systems, Springer Lecture Notes in Computer Science LNCS 8127. Heidelberg, Berlin, New York: Springer, pp. 319-328, doi:10.1007/978-3-642-40511-2_22

[2] https://www.forbes.com/sites/davidshaywitz/2018/02/18/the-deeply-human-core-of-roches-2-1b-tech-acquisition-and-why-they-did-it/#6242fdbc29c2

[3] https://flatiron.com

CD-MAKE machine learning and knowledge extraction

Marta Milo and Neil Lawrence in Reggio di Calabria at CD-MAKE 2017

CD-MAKE 2017, held in the context of the ARES conference series, was a great success in beautiful Reggio di Calabria.

In the middle: Marta Milo and Neil Lawrence, the keynote speakers of CD-MAKE 2017, flanked by Francesco Buccafurri (on the right) and Andreas Holzinger.

Machine Learning & Knowledge Extraction (MAKE) Journal launched

Inaugural Editorial Paper published:

Holzinger, A. 2017. Introduction to Machine Learning & Knowledge Extraction (MAKE). Machine Learning and Knowledge Extraction, 1, (1), 1-20, doi:10.3390/make1010001.

https://www.mdpi.com/2504-4990/1/1/1

Machine Learning and Knowledge Extraction (MAKE) is an inter-disciplinary, cross-domain, peer-reviewed, scholarly open access journal that provides a platform to support the international machine learning community. It publishes original research articles, reviews, tutorials, research ideas, short notes and Special Issues that focus on machine learning and applications. Papers which deal with fundamental research questions to help reach a level of usable computational intelligence are very welcome.

Machine learning deals with understanding intelligence in order to design algorithms that can learn from data, gain knowledge from experience and improve their learning behaviour over time. The challenge is to extract relevant structural and/or temporal patterns (“knowledge”) from data; these patterns are often hidden in high-dimensional spaces and thus not accessible to humans. Machine learning applications such as recommender systems, speech recognition and autonomous driving already affect our daily life in many domains, e.g., smart health and smart factory. The grand challenge is to understand the context in the real world under uncertainty. Probabilistic inference can be of great help here, as inverse probability allows us to learn from data, to infer unknowns, and to make predictions to support decision making.
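To make the idea of inverse probability concrete, here is a minimal sketch of Bayes' rule for a single binary hypothesis. It is only an illustration under stated assumptions: the function name `posterior`, its parameter names and all numbers are hypothetical and do not come from the journal or the editorial paper.

```python
def posterior(prior: float, p_obs_given_h: float, p_obs_given_not_h: float) -> float:
    """Return P(H | observation) for a binary hypothesis H via Bayes' rule."""
    # Total probability of the observation under both hypotheses (the "evidence").
    evidence = p_obs_given_h * prior + p_obs_given_not_h * (1.0 - prior)
    if evidence == 0.0:
        raise ValueError("observation has zero probability under the model")
    return p_obs_given_h * prior / evidence


# Hypothetical numbers, chosen only for illustration:
# 1% prior, 90% true-positive rate, 5% false-positive rate.
p = posterior(prior=0.01, p_obs_given_h=0.9, p_obs_given_not_h=0.05)
print(f"belief in H after one positive observation: {p:.3f}")

# Learning from data: the posterior becomes the prior for the next observation.
# This reuse assumes the two observations are conditionally independent given H.
p2 = posterior(prior=p, p_obs_given_h=0.9, p_obs_given_not_h=0.05)
print(f"belief in H after two positive observations: {p2:.3f}")
```

Each call turns "how likely is this observation if H holds" into "how likely is H given this observation", which is the sense in which inference runs in the inverse direction; chaining calls shows how belief is updated as more data arrive.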

NOTE: To support the training of a new kind of machine learning graduate, the journal accepts peer-reviewed high-end tutorial papers, as the IEEE Signal Processing Magazine (SCI IF = 9.654!) does:
https://ieeexplore.ieee.org/xpl/aboutJournal.jsp?punumber=79#AimsScope

Call for Papers: Open Data for Discovery Science (due July 31, 2017)

The journal BMC Medical Informatics and Decision Making (SCI IF 2015: 2.042) invites submissions to a new thematic series on open data for discovery science:

https://bmcmedinformdecismak.biomedcentral.com/articles/collections/odds

Note: Authors of excellent submissions to the IFIP Cross Domain Conference on Machine Learning and Knowledge Extraction (CD-MAKE) (submissions due May 15, 2017) that are relevant to the topics described below will be invited to expand their work into this thematic series.
The use of open data for discovery science has gained much attention recently as its full potential is unfolding and being explored in projects spanning all areas of healthcare research. A plethora of data sets are now available thanks to drives to make data universally accessible and usable for discovery science. However, with these advances come inherent challenges with the processing and management of ever expanding data sources. The computational and informatics tools and methods currently used in most investigational settings are often labor intensive and rely upon technologies that have not been designed to scale and support reasoning across multi-dimensional data resources. In addition, there are many challenges associated with the storage and responsible use of open data, particularly medical data, such as privacy, data protection, safety, information security and fair use of the data. There are therefore significant demands from the research community for the development of data management and analytic tools supporting heterogeneous analytic workflows and open data sources. Effective anonymisation tools are also of paramount importance to protect data security whilst preserving the usability of the data.

The purpose of this thematic series is to bring together articles reporting advances in the use of open data including the following:

  • The development of tools and methods targeting the reproducible and rigorous use of open data for discovery science, including but not limited to: syntactic and semantic standards, platforms for data sharing and discovery, and computational workflow orchestration technologies that enable the creation of data analytics, machine learning and knowledge extraction pipelines.
  • Practical approaches for the automated and/or semi-automated harmonization, integration, analysis, and presentation of data products to enable hypothesis discovery or testing.
  • Theoretical and practical approaches for solutions to make use of interactive machine learning to put a human-in-the-loop, answering questions including: could human intelligence lead to general heuristics that we can use to improve heuristics?
  • Frameworks for the application of open data in hypothesis generation and testing in projects spanning translational, clinical, and population health research.
  • Applied studies that demonstrate the value of using open data either as a primary or as an enriching source of information for the purposes of hypothesis generation/testing or for data-driven decision making in the research, clinical, and/or population health environments.
  • Privacy preserving machine learning and knowledge extraction algorithms that can enable the sharing of previously “privileged” data types as open data.
  • Evaluation and benchmarking methodologies, methods and tools that can be used to demonstrate the impact of results generated through the primary or secondary use of open data.
  • Socio-cultural, usability, acceptance, ethical and policy issues and frameworks relevant to the sharing, use, and dissemination of information and knowledge derived from the analysis of open data.

Submission is open to everyone, and all submitted manuscripts will be peer-reviewed through the standard BMC Medical Informatics and Decision Making review process. Manuscripts should be formatted according to the submission guidelines and submitted via the online submission system. Please indicate clearly in the covering letter that the manuscript is to be considered for the ‘Open data for discovery science’ collection. The deadline for submissions will be 31 July 2017.

For further information, please email the editors of the thematic series:
Andreas HOLZINGER, a.holzinger@human-centered.ai,
Philip PAYNE, prpayne@wustl.edu, or the BMC in-house editor
Emma COOKSON, emma.cookson@biomedcentral.com

Link to the IFIP Cross-Domain Conference on Machine Learning and Knowledge Extraction (CD-MAKE):
https://cd-make.net

Integrated interactomes and pathways in precision medicine by Igor Jurisica, Toronto

Machine learning is the fastest growing field in computer science, and Health Informatics is amongst the greatest application challenges, providing benefits in improved medical diagnoses, disease analyses, and pharmaceutical development – towards future precision medicine.

Talk announcement: Friday, 12th May 2017, 10:00, Seminarraum 137, Parterre, Inffeldgasse 16c

Integrated interactomes and pathways in precision medicine

by Igor Jurisica, University of Toronto and Princess Margaret Cancer Centre, Toronto

Abstract: Fathoming cancer and other complex disease development processes requires systematically integrating diverse types of information, including multiple high-throughput datasets and diverse annotations. This comprehensive and integrative analysis will lead to data-driven precision medicine, and in turn will help us to develop new hypotheses and answer complex questions such as: what factors cause disease; which patients are at high risk; will patients respond to a given treatment; how to rationally select a combination therapy for an individual patient, etc.
Thousands of potentially important proteins remain poorly characterized. Computational biology methods, including machine learning, knowledge extraction, data mining and visualization, can help to fill this gap with accurate predictions, making disease modeling more comprehensive. Intertwining computational prediction and modeling with biological experiments will lead to more useful findings faster and more economically.

Short Bio: Igor Jurisica is Tier I Canada Research Chair in Integrative Cancer Informatics, Senior Scientist at Princess Margaret Cancer Centre, Professor at University of Toronto and Visiting Scientist at IBM CAS. He is also an Adjunct Professor at the School of Computing, Pathology and Molecular Medicine at Queen’s University, Computer Science at York University, scientist at the Institute of Neuroimmunology, Slovak Academy of Sciences and an Honorary Professor at Shanghai Jiao Tong University in China. Since 2015, he has also served as Chief Scientist at the Creative Destruction Lab, Rotman School of Management. Igor has published extensively on data mining, visualization and cancer informatics, including multiple papers in Science, Nature, Nature Medicine, Nature Methods, Journal of Clinical Oncology, and received over 9,960 citations since 2012. He has been included in Thomson Reuters 2016, 2015 & 2014 list of Highly Cited Researchers, and The World’s Most Influential Scientific Minds: 2015 & 2014 Reports.

Jurisica Lab, IBM Life Sciences Discovery Center:

Canada Tier I Research Chair: https://www.chairs-chaires.gc.ca/chairholders-titulaires/profile-eng.aspx?profileId=2347

On Nutrigenomics [1]: https://www.uhn.ca/corporate/News/Pages/Igor_Jurisica_talks_nutrigenomics.aspx

[1] Nutrigenomics tries to define the causality or relationship between specific nutrients and specific nutrient regimes (diets) on the one hand and human health on the other. The underlying idea is personalized nutrition based on the *omics background, which may help to foster personal dietary recommendations. Ultimately, nutrigenomics will allow effective dietary-intervention strategies to recover normal homeostasis and to prevent diet-related diseases, see: Muller, M. & Kersten, S. 2003. Nutrigenomics: goals and strategies. Nature Reviews Genetics, 4, (4), 315-322.

Machine Learning Guide

The Machine Learning Guide by Tyler RENELLE (TensorFlow, OCDevel) is highly recommended to my students! This series aims to teach the high-level fundamentals of machine learning with a focus on algorithms and some underlying mathematics, which is really great.

https://ocdevel.com/podcasts/machine-learning

Cross Domain Conference for Machine Learning & Knowledge Extraction

cd-make.net

Call for Papers – due May 15, 2017

https://www.wikicfp.com/cfp/servlet/event.showcfp?eventid=61244&copyownerid=17803

International IFIP Cross Domain Conference for Machine Learning & Knowledge Extraction CD-MAKE
in Reggio di Calabria (Italy) August 29 – September 1, 2017

https://cd-make.net

CD stands for Cross-Domain and means the integration and appraisal of different fields and application domains (e.g. Health, Industry 4.0, etc.) to provide an atmosphere that fosters different perspectives and opinions. The conference is dedicated to offering an international platform for novel ideas and a fresh look at the methodologies for turning crazy ideas into business for the benefit of humans. Serendipity is a desired effect and should cross-fertilize methodologies and the transfer of algorithmic developments.

MAKE stands for MAchine Learning & Knowledge Extraction.

CD-MAKE is a joint effort of IFIP TC 5, IFIP WG 8.4, IFIP WG 8.9 and IFIP WG 12.9 and is held in conjunction with the International Conference on Availability, Reliability and Security (ARES).
Keynote Speakers are Neil D. LAWRENCE (Amazon) and Marta MILO (University of Sheffield).

IFIP is the International Federation for Information Processing, the leading multi-national, non-governmental, apolitical organization in Information & Communications Technologies and Computer Sciences. It is recognized by the United Nations and was established in 1960 under the auspices of UNESCO as an outcome of the first World Computer Congress, held in Paris in 1959.

Papers are sought from the following seven topical areas. Papers which deal with fundamental questions and theoretical aspects in machine learning are very welcome.

❶ Data science (data fusion, preprocessing, data mapping, knowledge representation),
❷ Machine learning (both automatic ML and interactive ML with the human-in-the-loop),
❸ Graphs/network science (i.e. graph-based data mining),
❹ Topological data analysis (i.e. topology data mining),
❺ Time/entropy (i.e. entropy-based data mining),
❻ Data visualization (i.e. visual analytics), and last but not least
❼ Privacy, data protection, safety and security (i.e. privacy aware machine learning).

Proposals for Workshops, Special Sessions, Tutorials: April 19, 2017
Submission Deadline: May 15, 2017
Author Notification: June 14, 2017
Camera Ready Deadline: July 7, 2017

 

 https://cd-make.net/call-for-papers