LV 706.315 Selected Topics on interactive Knowledge Discovery:
interactive Machine Learning (iML) – Winter term 2015 (again in WS 2016)


Motivation for this lecture:

Whilst automated Machine Learning (aML) approaches (“Google car”) work well in many domains, particularly with big data sets, in complex domains with small training data sets the application of aML entails the danger of modelling artefacts. An example of a complex domain is biomedicine, where experts are often confronted with high-dimensional, probabilistic, incomplete and often small data sets, which makes the application of aML approaches difficult. In such situations it can be advantageous to integrate human domain knowledge and expertise into the machine learning loop. The foundations of iML approaches can be found in reinforcement learning, preference learning and active learning.

Definition of interactive Machine Learning (iML):

We define iML-approaches as algorithms that can interact with both computational and human agents *) and can optimize their learning behaviour through this interaction.

*) In Active Learning such agents are called “oracles” (see e.g. Settles, B. 2011. From theories to queries: Active learning in practice. In: Guyon, I., Cawley, G., Dror, G., Lemaire, V. & Statnikov, A. (eds.) Active Learning and Experimental Design Workshop 2010. Sardinia: JMLR Proceedings. 1-18).

Goal of this lecture:

This graduate course follows a research-based teaching (RBT) approach. It provides a broad overview of models and discusses methods for combining human intelligence with machine intelligence to solve computational problems. The application focus is on the biomedical domain. The cross-sectional topic is evaluation, because an all-in-one method suitable for every purpose (in German: eierlegende Wollmilchsau) would be nice but is not achievable. Consequently, an evaluation framework is of great importance for driving progress in any machine learning approach. For practical applications we use the Julia language, alongside R and Python.

Background:

A classic challenge in machine learning is the development of a model which relates observed data X to categorical variables Y ∈ {x1, x2, x3, … } in order to infer higher-level information from the data. This problem is widely applicable, as X may represent any data and Y any information. In domains dealing with uncertainties (such as the biomedical domain), we seek applications where Y includes unknowns. Machine learning solutions to this problem must involve human expertise during the design phase to provide relevant training data sets. Unfortunately, to date such experts are not part of machine learning algorithms. On the contrary, the grand goal of artificial intelligence is to exclude the human from the loop and hence make it fully automatic (see “Google car”). The objective of interactive machine learning methods is to develop algorithms which can interact with both computational agents and human agents, towards hybrid multi-agent systems.

General information:

Machine learning is a large subfield of computer science that evolved from artificial intelligence (AI) and is tightly connected with data mining and knowledge discovery. The grand goal is to design and develop algorithms which can learn from data. Consequently, machine learning systems learn and improve with experience and time and can be used to predict outcomes of questions based on previous knowledge. Learning intelligent behaviour from noisy examples remains a grand and exciting challenge in AI. This is highly relevant for many applications in biomedical informatics generally, and for stratified and personalized medicine in particular.

Target Group:

Research students of Computer Science who are interested in knowledge discovery/data mining by following the idea of iML-approaches, i.e. human-in-the-loop learning systems. This is a cross-disciplinary computer science topic and highly relevant for the application in complex domains, such as biomedicine.

Keywords:

Interactive Machine Learning, Reinforcement Learning, Preference Learning, Active Learning, Computational Intelligence

Schedule of LV 706.315 Interactive Machine Learning Winter term 2015/16
Each entry gives the week and date, the time, the agenda, and the reading to complete before class.

Week 42
Friday, 16.10.2015
10:00 – 11:30 Lecture 00
First Meeting: Lecture Preliminaries and Organizational Issues

Week 43
Friday, 23.10.2015
10:00 – 11:30 Lecture 01
INTRODUCTION: automatic Machine Learning (aML) versus interactive Machine Learning (iML)
Reading before class: Holzinger, A. 2016. Interactive Machine Learning for Health Informatics: When do we need the human-in-the-loop? Springer Brain Informatics (BRIN), 3, (2), 119-131, doi:10.1007/s40708-016-0042-6.

Week 44
Friday, 30.10.2015
10:00 – 11:30 Lecture 02
HUMAN LEARNING versus MACHINE LEARNING
Reading before class: Chater, N., Tenenbaum, J. B. & Yuille, A. 2006. Probabilistic models of cognition: Conceptual foundations. Trends in cognitive sciences, 10, (7), 287-291, doi:10.1016/j.tics.2006.05.007.

Week 45
Friday, 06.11.2015
10:00 – 11:30 Lecture 03
REINFORCEMENT LEARNING (RL): Markov Decision Processes, Evolutionary Computation for RL, Bayesian RL, Game Theory and Multi-Agent RL
Reading before class: Mnih, V., Kavukcuoglu, K., Silver, D., Rusu, A. A., Veness, J., Bellemare, M. G., Graves, A., Riedmiller, M., Fidjeland, A. K., Ostrovski, G., Petersen, S., Beattie, C., Sadik, A., Antonoglou, I., King, H., Kumaran, D., Wierstra, D., Legg, S. & Hassabis, D. 2015. Human-level control through deep reinforcement learning. Nature, 518, (7540), 529-533, doi:10.1038/nature14236.

Week 46
Friday, 13.11.2015
10:00 – 11:30 Lecture 04
PREFERENCE LEARNING (PL): Label Ranking, Instance Ranking, Object Ranking; Evaluation of Search Engines, Learning SVM Ranking from User Feedback; Collaborative PL
Reading before class: Fürnkranz, J. & Hüllermeier, E. 2010. Preference learning: An introduction. Preference learning. Springer, pp. 1-17.
http://drops.dagstuhl.de/opus/volltexte/2014/4550/pdf/dagrep_v004_i003_p001_s14101.pdf

Week 47
Friday, 20.11.2015
10:00 – 11:30 Lecture 05
ACTIVE LEARNING and Active Preference Learning (APL): Measures of Uncertainty, Searching through Hypothesis Space, Exploiting Structure in Data

Week 48
Friday, 27.11.2015
10:00 – 11:30 Lecture 06
MULTI-TASK LEARNING (MTL): MTL with clustered structures, MTL with tree structures, MTL with graph structures, robust MT feature learning; examples from disease progression (Alzheimer's disease)
Reading before class: Argyriou, A., Evgeniou, T. & Pontil, M. 2007. Multi-task feature learning. Advances in Neural Information Processing Systems. Cited by 679 (21.7.2016).

Week 49
Friday, 04.12.2015
10:00 – 11:30 Lecture 07
TRANSFER LEARNING (TL): catastrophic forgetting, domain adaptation, adaptive learning, inductive transfer
Note: TL, i.e. learning to perform a task by exploiting knowledge acquired when solving previous tasks, is not easy. A solution to this problem would have a major impact on AI research generally and ML specifically, so this is a grand challenge for our research.

Week 50
Friday, 11.12.2015
10:00 – 11:30 Lecture 08
MULTI-AGENT HYBRID SYSTEMS (MAHS): multi-agent systems (MAS), passive agents, active agents (bird flocks, wolf-sheep, prey-predator models), cognitive agents, distributed constraint optimization (DCOPs)

Week 51
Friday, 18.12.2015
10:00 – 11:30 Exam

Some Quick Explanations:

Active Learning (AL) := to select training samples to enable a minimization of loss in future cases; a learner must take actions to gain information, and has to decide which actions will provide the information that will optimally minimize future loss. The basic idea goes back to Fedorov, V. (1972). Theory of optimal experiments. New York: Academic Press. According to Sanjoy Dasgupta, the frontier of active learning is mostly unexplored, and except for a few specific cases we do not have a clear sense of how much active learning can reduce label complexity: whether by just a constant factor, or polynomially, or exponentially. The fundamental statistical and algorithmic challenges involved, together with the huge potential for practical applications, make AL a very important area for future research.
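
As a minimal sketch of the label-complexity question, assume the simplest possible setting: points on a line, a noiseless threshold classifier, and a hypothetical `oracle` function standing in for the human expert (the names and the threshold value are illustrative assumptions, not part of the lecture material). By always querying the most uncertain point, the learner needs roughly log2(n) labels instead of n. This is the idealised case in which an exponential saving is possible; it does not automatically carry over to noisy or high-dimensional data.

```python
def oracle(x, true_threshold=0.6205):
    """Hypothetical labelling oracle (e.g. a human expert); the threshold is made up."""
    return 1 if x >= true_threshold else 0


def active_learn_threshold(pool, ask):
    """Locate the decision boundary in a pool of unlabelled points.

    Only the most uncertain point is queried in each round, namely the
    midpoint of the region whose labels are still ambiguous.
    Returns the index of the first point labelled 1 and the number of queries.
    """
    pool = sorted(pool)
    lo, hi = 0, len(pool)  # the boundary index lies somewhere in [lo, hi]
    queries = 0
    while lo < hi:
        mid = (lo + hi) // 2
        queries += 1
        if ask(pool[mid]) == 1:
            hi = mid        # boundary is at mid or to its left
        else:
            lo = mid + 1    # boundary is strictly to the right of mid
    return lo, queries


pool = [i / 1000 for i in range(1000)]           # 1000 unlabelled points
boundary_index, n_queries = active_learn_threshold(pool, oracle)
print(boundary_index, n_queries)                 # far fewer than 1000 labels requested
```

A passive learner would have to label the whole pool to find the same boundary with certainty, which is why the number of oracle queries is the quantity of interest here.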

Interactive Machine Learning (iML) := machine learning algorithms which can interact with (partly human) agents and can optimize their learning behaviour through this interaction. Holzinger, A. 2016. Interactive Machine Learning for Health Informatics: When do we need the human-in-the-loop? Brain Informatics (BRIN), 3, (2), 119-131.

Preference learning (PL) := concerns problems in learning to rank, i.e. learning a predictive preference model from observed preference information, e.g. with label ranking, instance ranking, or object ranking.  Fürnkranz, J., Hüllermeier, E., Cheng, W. & Park, S.-H. 2012. Preference-based reinforcement learning: a formal framework and a policy iteration algorithm. Machine Learning, 89, (1-2), 123-156.
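
To make the object-ranking case concrete, here is a small sketch that learns a linear utility function from pairwise preferences with a perceptron-style update. The feature vectors, the preference pairs and the function names are invented for illustration; this is one simple technique among many and not the method of the cited paper.

```python
def learn_utility(preferences, n_features, epochs=50, lr=0.1):
    """Learn weights w such that w.better > w.worse for each observed pair."""
    w = [0.0] * n_features
    for _ in range(epochs):
        for better, worse in preferences:
            diff = [b - c for b, c in zip(better, worse)]
            margin = sum(wi * di for wi, di in zip(w, diff))
            if margin <= 0:  # pair is ranked wrongly (or tied): nudge w
                w = [wi + lr * di for wi, di in zip(w, diff)]
    return w


def rank(objects, w):
    """Order objects by learned utility, best first."""
    return sorted(objects, key=lambda x: sum(wi * xi for wi, xi in zip(w, x)),
                  reverse=True)


# Four objects described by two made-up features.
a, b, c, d = (1.0, 0.0), (0.5, 0.5), (0.0, 1.0), (1.0, 1.0)

# Observed preference information: (preferred object, less preferred object).
preferences = [(a, d), (d, b), (b, c), (a, c)]

w = learn_utility(preferences, n_features=2)
ranking = rank([a, b, c, d], w)
print(ranking)
```

The learned model is a predictive preference model in the sense used above: once `w` is fitted, it can rank objects that never appeared in the observed pairs.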

Reinforcement Learning (RL) := the study of how an agent may learn from a series of reinforcements (success/rewards or failure/punishments). A must-read is Kaelbling, L. P., Littman, M. L., & Moore, A. W. (1996). Reinforcement learning: A survey. Journal of Artificial Intelligence Research, 4, 237-285.
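
The idea of learning from delayed rewards can be sketched with tabular Q-learning on a toy problem. The five-state chain, the reward of 1 at the right end and all parameter values below are assumptions chosen for illustration. The agent explores with uniformly random actions (Q-learning is off-policy, so it can still learn the greedy policy from such behaviour).

```python
import random

N_STATES = 5                    # states 0..4; state 4 is the rewarding terminal state
ACTIONS = ("left", "right")


def step(state, action):
    """Deterministic toy environment: move along the chain, reward 1 at the end."""
    nxt = max(0, state - 1) if action == "left" else state + 1
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done


def q_learning(episodes=300, alpha=0.5, gamma=0.9, seed=0):
    """Tabular Q-learning with a purely exploratory (uniform random) behaviour policy."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]   # q[state][action index]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            a = rng.randrange(len(ACTIONS))
            nxt, reward, done = step(state, ACTIONS[a])
            # Bootstrapped target: immediate reward plus discounted best future value.
            target = reward + (0.0 if done else gamma * max(q[nxt]))
            q[state][a] += alpha * (target - q[state][a])
            state = nxt
    return q


q = q_learning()
greedy_policy = [ACTIONS[max((0, 1), key=lambda a: q[s][a])]
                 for s in range(N_STATES - 1)]
print(greedy_policy)
```

Only the final transition is rewarded, yet the learned values propagate backwards along the chain, so the greedy policy in every non-terminal state points towards the goal.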

Multi-Agent Systems (MAS) := collections of several independent agents, which may also be a mixture of computer agents and human agents. An excellent pointer to the latter is: Jennings, N. R., Moreau, L., Nicholson, D., Ramchurn, S. D., Roberts, S., Rodden, T. & Rogers, A. 2014. On human-agent collectives. Communications of the ACM, 80-88.

Transfer Learning (TL) := The ability of an algorithm to recognize and apply knowledge and skills learned in previous tasks to
novel tasks or new domains, which share some commonality. Central question: Given a target task, how do we identify the
commonality between the task and previous tasks, and transfer the knowledge from the previous tasks to the target one?
Pan, S. J. & Yang, Q. 2010. A Survey on Transfer Learning. IEEE Transactions on Knowledge and Data Engineering, 22, (10), 1345-1359, doi:10.1109/tkde.2009.191.

Pointers:

A) My students with a GENERAL interest in machine learning should definitely browse these sources:

1) TALKING MACHINES – Human conversation about machine learning by Katherine GORMAN and Ryan P. ADAMS
excellent audio material – 24 episodes up to 22.11.2015

2) VIDEOLECTURES.NET Machine learning talks (3,106 items up to 4.7.2015)

3) TUTORIALS ON TOPICS IN MACHINE LEARNING by Bob Fisher from the University of Edinburgh, UK

B) My students with a PARTICULAR interest in interactive machine learning should browse these sources:

1) Theory, Methods and Applications of Active Learning, by Robert D. NOWAK, University of Wisconsin, MLSS 2009

2) Active Learning Tutorial by Sanjoy DASGUPTA & John LANGFORD, ICML 2009

3) Nonparametric Active Learning, Robert D. NOWAK, NIPS Workshops 2013

Reading List:

1) Articles

Cohn, D. A., Ghahramani, Z. & Jordan, M. I. (1996). Active learning with statistical models. Journal of Artificial Intelligence Research, 4, 129-145.

Dasgupta, S. (2011). Two faces of active learning. Theoretical computer science, 412, (19), 1767-1781.

Holzinger, A. (2016). Interactive Machine Learning for Health Informatics: When do we need the human-in-the-loop? Springer Brain Informatics (BRIN), 3, (2), 119-131, doi:10.1007/s40708-016-0042-6.

Holzinger, A. (2016). Interactive Machine Learning (iML). Informatik Spektrum, 39, (1), in print.

Holzinger, A. (2014). Trends in Interactive Knowledge Discovery for Personalized Medicine: Cognitive Science meets Machine Learning. Intelligent Informatics Bulletin, 15(1): 6-14.

Holzinger, A. (2014). Extravaganza Tutorial on Hot Ideas for Interactive Knowledge Discovery and Data Mining in Biomedical Informatics. In: Slezak, D., Tan, A.-H., Peters, J. F. & Schwabe, L. (eds.) Brain Informatics and Health, BIH 2014, Lecture Notes in Artificial Intelligence, LNAI 8609. Heidelberg, Berlin: Springer, pp. 502-515.

Porter, R., Hush, D., Harvey, N. & Theiler, J. (2010). Toward interactive search in remote sensing imagery. Proceedings of SPIE Cyber Security, Situation Management, and Impact Assessment II. International Society for Optics and Photonics. 77090V.

2) Books

Links:

Association for the Advancement of Artificial Intelligence AAAI > AI Magazine

Machine Learning News Group

  • [1] N. Spinrad, “Google car takes the test”, Nature, vol. 514, iss. 7523, pp. 528-528, 2014.
    @article{Spinrad2014GoogleCar,
       year = {2014},
       author = {Spinrad, Norman},
       title = {Google car takes the test},
       journal = {Nature},
       volume = {514},
       number = {7523},
       pages = {528-528},
       doi = {10.1038/514528a},
       url = {http://dx.doi.org/10.1038/514528a}
    }

  • [2] A. Holzinger, Biomedical Informatics: Discovering Knowledge in Big Data, New York: Springer, 2014.
    @book{Holzinger2014SpringerTextbook,
       year = {2014},
       author = {Holzinger, Andreas},
       title = {Biomedical Informatics: Discovering Knowledge in Big Data},
       publisher = {Springer},
       address = {New York},
       abstract = {This book provides a broad overview of the topic Bioinformatics (medical informatics + biological information) with a focus on data, information and knowledge. From data acquisition and storage to visualization, privacy, regulatory, and other practical and theoretical topics, the author touches on several fundamental aspects of the innovative interface between the medical and computational domains that form biomedical informatics. Each chapter starts by providing a useful inventory of definitions and commonly used acronyms for each topic, and throughout the text, the reader finds several real-world examples, methodologies, and ideas that complement the technical and theoretical background. Also at the beginning of each chapter a new section called key problems, has been added, where the author discusses possible traps and unsolvable or major problems. This new edition includes new sections at the end of each chapter, called future outlook and research avenues, providing pointers to future challenges.},
       doi = {10.1007/978-3-319-04528-3},
       url = {https://online.tugraz.at/tug_online/voe_main2.getVollText?pDocumentNr=974906&pCurrPk=78579}
    }

  • [3] A. Holzinger, “Trends in Interactive Knowledge Discovery for Personalized Medicine: Cognitive Science meets Machine Learning”, IEEE Intelligent Informatics Bulletin, vol. 15, iss. 1, pp. 6-14, 2014.
    @article{Holzinger2014trends,
       year = {2014},
       author = {Holzinger, Andreas},
       title = {Trends in Interactive Knowledge Discovery for Personalized Medicine: Cognitive Science meets Machine Learning},
       journal = {IEEE Intelligent Informatics Bulletin},
       volume = {15},
       number = {1},
       pages = {6-14},
       abstract = {A grand goal of future medicine is in modelling the complexity of patients to tailor medical decisions, health practices and therapies to the individual patient. This trend towards personalized medicine produces unprecedented amounts of data, and even though the fact that human experts are excellent at pattern recognition in dimensions of smaller than three, the problem is that most biomedical data is in dimensions much higher than three, making manual analysis difficult and often impossible. Experts in daily medical routine are decreasingly capable of dealing with the complexity of such data. Moreover, they are not interested the data, they need knowledge and insight in order to support their work. Consequently, a big trend in computer science is to provide efficient, useable and useful computational methods, algorithms and tools to discover knowledge and to interactively gain insight into high-dimensional data. A synergistic combination of methodologies of two areas may be of great help here: Human–Computer Interaction (HCI) and Knowledge Discovery/Data Mining (KDD), with the goal of supporting human intelligence with machine learning. A trend in both disciplines is the acquisition and adaptation of representations that support efficient learning. Mapping higher dimensional data into lower dimensions is a major task in HCI, and a concerted effort of computational methods including recent advances from graphtheory and algebraic topology may contribute to finding solutions. Moreover, much biomedical data is sparse, noisy and timedependent, hence entropy is also amongst promising topics. This paper provides a rough overview of the HCI-KDD approach and focuses on three future trends: graph-based mining, topological data mining and entropy-based data mining.[interactive machine learning]},
       url = {http://www.comp.hkbu.edu.hk/~cib/2014/Dec/article2/iib_vol15no1_article2.pdf}
    }

  • [4] A. Holzinger, “Human–Computer Interaction and Knowledge Discovery (HCI-KDD): What is the benefit of bringing those two fields to work together?”, in Multidisciplinary Research and Practice for Information Systems, Springer Lecture Notes in Computer Science LNCS 8127, A. Cuzzocrea, C. Kittl, D. E. Simos, E. Weippl, and L. Xu, Eds., Heidelberg, Berlin, New York: Springer, 2013, pp. 319-328.
    @incollection{Holzinger2013HCI-KDD,
       year = {2013},
       author = {Holzinger, Andreas},
       title = {Human–Computer Interaction and Knowledge Discovery (HCI-KDD): What is the benefit of bringing those two fields to work together?},
       booktitle = {Multidisciplinary Research and Practice for Information Systems, Springer Lecture Notes in Computer Science LNCS 8127},
       editor = {Cuzzocrea, Alfredo and Kittl, Christian and Simos, Dimitris E. and Weippl, Edgar and Xu, Lida},
       publisher = {Springer},
       address = {Heidelberg, Berlin, New York},
       pages = {319-328},
       abstract = {A major challenge in our networked world is the increasing amount of data, which require efficient and user-friendly solutions. A timely example is the biomedical domain: the trend towards personalized medicine has resulted in a sheer mass of the generated (-omics) data. In the life sciences domain, most data models are characterized by complexity, which makes manual analysis very time-consuming and frequently practically impossible. Computational methods may help; however, we must acknowledge that the problem-solving knowledge is located in the human mind and – not in machines. A strategic aim to find solutions for data intensive problems could lay in the combination of two areas, which bring ideal pre-conditions: Human–Computer Interaction (HCI) and Knowledge Discovery (KDD). HCI deals with questions of human perception, cognition, intelligence, decision-making and interactive techniques of visualization, so it centers mainly on supervised methods. KDD deals mainly with questions of machine intelligence and data mining, in particular with the development of scalable algorithms for finding previously unknown relationships in data, thus centers on automatic computational methods. A proverb attributed perhaps incorrectly to Albert Einstein illustrates this perfectly: “Computers are incredibly fast, accurate, but stupid. Humans are incredibly slow, inaccurate, but brilliant. Together they may be powerful beyond imagination”. Consequently, a novel approach is to combine HCI & KDD in order to enhance human intelligence by computational intelligence. },
       keywords = {Human-Computer Interaction (HCI), Knowledge Discovery in Data (KDD), HCI-KDD, E-Science, Interdisciplinary, Intersection science},
       doi = {10.1007/978-3-642-40511-2_22},
       url = {https://online.tugraz.at/tug_online/voe_main2.getVollText?pDocumentNr=382991&pCurrPk=72064}
    }

  • [5] A. Holzinger, M. Dehmer, and I. Jurisica, “Knowledge Discovery and interactive Data Mining in Bioinformatics – State-of-the-Art, future challenges and research directions”, BMC Bioinformatics, vol. 15, iss. S6, p. I1, 2014.
    @article{HolzingerDehmerJurisica2014KDDBMCBioinfo,
       year = {2014},
       author = {Holzinger, Andreas and Dehmer, Matthias and Jurisica, Igor},
       title = {Knowledge Discovery and interactive Data Mining in Bioinformatics - State-of-the-Art, future challenges and research directions},
       journal = {BMC Bioinformatics},
       volume = {15},
       number = {S6},
       pages = {I1},
       abstract = {The life sciences, biomedicine and health care are increasingly turning into a data intensive science. Particularly in bioinformatics and computational biology we face not only increased volume and a diversity of highly complex, multi-dimensional and often weakly-structured and noisy data, but also the growing need for integrative analysis and modeling. Due to the increasing trend towards personalized and precision medicine (P4 medicine: Predictive, Preventive, Participatory, Personalized), biomedical data today results from various sources in different structural dimensions, ranging from the microscopic world, and in particular from the omics world (e.g., from genomics, proteomics, metabolomics, lipidomics, transcriptomics, epigenetics, microbiomics, fluxomics, phenomics, etc.) to the macroscopic world (e.g., disease spreading data of populations in public health informatics). The challenge is not only to extract meaningful information from this data, but to gain knowledge, to discover previously unknown insight, look for patterns, and to make sense of the data. },
       keywords = {Knowledge Discovery, Interactive Data Mining, Bioinformatics, Biomedical Informatics, Data intensive Science},
       doi = {10.1186/1471-2105-15-S6-I1},
       url = {http://www.biomedcentral.com/1471-2105/15/S6/I1}
    }

  • [6] T. M. Mitchell, Machine learning, New York et al.: McGraw Hill, 1997.
    @book{Mitchell1997MachineLearning,
       year = {1997},
       author = {Mitchell, Tom M},
       title = {Machine learning},
       publisher = {McGraw Hill},
       address = {New York et al.},
       abstract = {The field of machine learning is concerned with the question of how to construct computer programs that automatically improve with experience. In recent  years many successful machine learning applications have been developed, ranging from data-mining programs that learn to detect fraudulent credit card transactions, to information-filtering systems that learn users'  reading preferences, to autonomous vehicles that learn to drive on public highways. At the same time, there have been important advances in the theory and algorithms that form the foundations of this field.}
    }

Time Line of relevant events for interactive Machine Learning (iML):

1950 Reinforcement Learning: Alan Turing (1912-1954) discusses RL in his paper “Computing Machinery and Intelligence”, Mind, Volume 59, Issue 236, October 1950, pp. 433-460, doi:10.1093/mind/LIX.236.433

2000 Utility Theory:

Glossary (incomplete)

Dimension = n attributes which jointly describe a property.

Features = any measurements, attributes or traits representing the data. Features are key for learning and understanding.

Reals = numbers expressible as finite/infinite decimals

Regression = predicting the value of a random variable y from a measurement x.

Reinforcement learning = adaptive control, i.e. to learn how to (re-)act in a given environment, given delayed/nondeterministic rewards. Human learning is mostly reinforcement learning.
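
For the Regression entry in the glossary above, a minimal sketch: ordinary least squares fits a straight line to a handful of (x, y) measurements and then predicts y for a new x. The data points are made up for illustration, and a linear model is only one possible choice of regression function.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y ≈ slope * x + intercept (closed form)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


xs = [0.0, 1.0, 2.0, 3.0, 4.0]     # measurements x (illustrative values)
ys = [1.1, 2.9, 5.2, 7.1, 8.8]     # observed values of y (illustrative values)

slope, intercept = fit_line(xs, ys)
prediction = slope * 5.0 + intercept   # predicted y for the unseen measurement x = 5
print(round(slope, 2), round(intercept, 2), round(prediction, 2))
```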

Historic People (incomplete)

Bayes, Thomas (1702-1761) gave a straightforward definition of probability

Laplace, Pierre-Simon, Marquis de (1749-1827) developed the Bayesian interpretation of probability

Price, Richard (1723-1791) edited and commented on the work of Thomas Bayes in 1763

Tukey, John Wilder (1915-2000) suggested in 1962, together with Frederick Mosteller, the name “data analysis” for computational statistical sciences, which much later became known as data science

Antonyms (incomplete)

big data sets < > small data sets

correlation < > causality

discriminative < > generative

Frequentist < > Bayesian

low dimensional < > high dimensional

underfitting < > overfitting

parametric < > non-parametric

supervised < > unsupervised