Machine learning has again made it to the title page of Science, with a paper that offers further evidence for the importance of the human-in-the-loop:
Lake, B. M., Salakhutdinov, R. & Tenenbaum, J. B. 2015. Human-level concept learning through probabilistic program induction. Science, 350, (6266), 1332-1338.
Whilst humans can often learn new concepts from very few examples, automated machine learning (aML) methods usually need many examples (often called big data) to perform with similar accuracy (and with the danger of modelling artefacts, e.g. through overfitting). The authors present a computational model which captures these human learning abilities for a large class of simple visual concepts: handwritten characters from the world’s alphabets. The model represents concepts as simple programs that best explain observed examples under a Bayesian criterion. Particularly interesting is that, on a challenging one-shot classification task, this model achieves human-level performance and outperforms recent deep learning approaches!
The authors also present several “visual Turing tests” probing the model’s creative generalization abilities, which in many cases are indistinguishable from human behavior – a must read at: http://www.sciencemag.org/content/350/6266/1332.full
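To make the idea of one-shot classification under a Bayesian criterion more concrete, here is a minimal toy sketch. It is not the authors' probabilistic program induction model: it assumes that each class is represented by a single example feature vector, that a test item is generated from that example with isotropic Gaussian noise, and that all classes are equally likely a priori. The function names, the feature vectors and the noise level `sigma` are hypothetical.

```python
import math

def log_gaussian(x, mean, sigma):
    """Log density of an isotropic Gaussian centred on `mean` (assumed noise model)."""
    dim = len(x)
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, mean))
    return -0.5 * squared_distance / sigma ** 2 - dim * math.log(sigma * math.sqrt(2 * math.pi))

def one_shot_posterior(test_item, examples, sigma=1.0):
    """Posterior over classes, given ONE example per class and a uniform prior."""
    log_scores = {label: log_gaussian(test_item, example, sigma)
                  for label, example in examples.items()}
    # Subtract the maximum before exponentiating for numerical stability.
    max_score = max(log_scores.values())
    weights = {label: math.exp(score - max_score) for label, score in log_scores.items()}
    total = sum(weights.values())
    return {label: weight / total for label, weight in weights.items()}

# Hypothetical two-dimensional features, one training example per class.
examples = {"A": [0.0, 0.0], "B": [3.0, 3.0]}
posterior = one_shot_posterior([0.5, 0.2], examples)
best = max(posterior, key=posterior.get)
print(best, posterior)
```

In the paper the likelihood comes from a generative program for each character rather than from a fixed Gaussian, but the decision rule has the same shape: pick the class whose single example best explains the new item.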
Machine Learning for Health Informatics
Machine learning is a large and rapidly developing subfield of computer science that evolved from artificial intelligence (AI) and is tightly connected with data mining and knowledge discovery. The ultimate goal of machine learning is to design and develop algorithms which can learn from data. Consequently, machine learning systems learn and improve with experience over time and their trained models can be used to predict outcomes of questions based on previously seen knowledge. In fact, the process of learning intelligent behaviour from noisy examples is one of the major questions in the field. The ability to learn from noisy, high dimensional data is highly relevant for many applications in the health informatics domain. This is due to the inherent nature of biomedical data, and health will increasingly be the focus of machine learning research in the near future.
Title: Coordination of post-translational modifications in human protein interaction network
Lecturer: Ulrich Stelzl, Network Pharmacology, Institute of Pharmaceutical Sciences, Karl-Franzens University Graz
Abstract: Comprehensive protein interaction networks are prerequisite for a better understanding of complex genotype to phenotype relationships. Post-translational modifications (PTMs) regulate protein activity, stability and protein interaction (PPI) profiles critical for cellular functioning. In combined experimental and computational approaches, we want to elucidate the role of post-translational protein modifications, such as phosphorylation, for these dynamic processes and investigate how the large number of changing PTMs is coordinated in cellular protein networks and likewise how PTMs may modulate protein-protein interaction networks. We identified hundreds of protein complexes that selectively accumulate different PTMs, i.e. phosphorylation, acetylation and ubiquitination. Also protein regions of very high PTM densities, termed PTMi spots, were characterized and show domain-like features. The analysis of phosphorylation-dependent interactions provides clues on how these PPIs are dynamically and spatially constrained to separate simultaneously triggered growth signals which are often altered in oncogenic conditions. Our data indicate coordinated targeting of specific molecular functions via PTMs at different levels emphasizing a protein network approach as requisite to better understand modification impact on cellular signaling and cancer phenotypes.
Short bio: Ulrich Stelzl studied Chemistry/Biochemistry at the TU Vienna and ETH Zürich. His PhD thesis (MPIMG, Berlin) and first PostDoc (MSKCC, New York) addressed detailed biochemical questions of RNA-protein recognition, such as the assembly and dynamics of ribonucleo-protein complexes in gene expression and regulation. Then at the MDC Berlin, Ulrich Stelzl contributed significantly to well-recognized protein-protein interaction (PPI) studies such as the generation and analysis of the first human proteome-scale PPI networks or the development of an empirical framework for human interactome mapping. The importance of the work and its interdisciplinary character was recognized by the Erwin Schrödinger Prize 2008 of the German Helmholtz Society. From 2007 on, Ulrich Stelzl headed the Max-Planck Research Group “Molecular Interaction Networks” at the MPIMG, Berlin, and recently joined the Department of Pharmaceutical Sciences of the University of Graz.
We welcome Irina KUZNETSOVA to our group, who will do her PhD with us on the topic of machine learning for mitochondria research.
Her inaugural talk is on
Mitochondrial diseases are progressive and debilitating multi-system disorders that occur at a frequency of up to 1 in 5,000 live births with no known cure. A variety of complex mechanisms disrupt normal mitochondrial functions and lead to the development of mitochondrial diseases. Identification of the molecular and pathophysiological mechanisms that cause mitochondrial disease remains challenging. However, establishing mouse models of mitochondrial disease would enable the study of the onset, progression and penetrance of mitochondrial disease as well as investigation of the tissues specifically affected in mitochondrial disease. Consequently, this will enable the development of pre-clinical models of mitochondrial disease that could be used for testing a range of treatments for these diseases.
Irina did her Bachelor in computing sciences in St. Petersburg, and her Masters in Bioinformatics at the Tampere University of Technology in Finland. Currently she is working at the Mitochondrial Medicine and Biology laboratory at the University of Western Australia in Perth, where she is co-supervised by Professor Aleksandra Filipovska.
The potential of metabolomics and its various data types
Lecturer: Natalie BORDAG, CBmed – Center for Biomarker Research in Medicine Graz
Abstract: Metabolomics is one of the youngest -omics technologies, primarily concerned with the identification and quantification of small molecules (<1500 Da). The specific advantage of metabolomics in biomarker research lies in the concept that metabolites fall downstream of genetic, transcriptomic, proteomic, microbiomic and environmental variation, thus providing the most integrated and dynamic measure of phenotype and medical condition. Metabolomics can therefore deliver biologically highly valuable results, for example early diagnostic biomarkers, optimization of biotechnological production, deep insights into pathological mechanisms, identification of new therapeutic targets and many more. Metabolomics, especially MS (mass spectrometry) based metabolomics, delivers highly diverse data types along the flow from measurement towards knowledge generation, with most of their potential yet to be exploited. The biological potential for knowledge generation by metabolomics will be shown with a real-life example. The different data types and common data aggregation steps (e.g. peak detection, identification), transformations, statistical analyses and visualizations will be shown, and open potentials will be jointly discussed.
Visual-Interactive Search and Exploration in Complex Data Repositories
– Feature-Based Search, Applications and Research Challenges
Lecturer: Tobias SCHRECK, University of Konstanz and Graz University of Technology
Abstract: Advances in data acquisition and storage technology are leading to the creation of large, complex data sets in many different domains including science, engineering or social media. Often, this data is of non-textual / non-spatial nature. Important user tasks for leveraging large complex data sets include retrieval of relevant information, exploration for patterns and insights, and re-using data for authoring purposes. User-oriented, effective and scalable approaches are needed to support these tasks. Visual-interactive techniques in combination with automatic data analysis approaches can provide effective user interfaces for handling large, complex data sets, and help users to factor in background knowledge for solving search and analysis tasks. We will discuss approaches for visual-interactive, content-based search and analysis tasks in time-oriented and multivariate data sets, with applications in Digital Data Libraries. We will discuss how sketch- and example-based search interfaces allow users to formulate queries effectively, and how appropriate similarity functions for these data types can be defined and evaluated. We will also discuss approaches for visual-interactive search in 3D model repositories. Furthermore, we will present approaches for the repair of 3D models of deteriorated Cultural Heritage objects, relying on appropriate feature-based 3D similarity functions. We conclude this talk with a discussion of interesting research challenges at the intersection of visual data analysis, novel non-textual data types, and applications in Digital Libraries.
LeCun, Y., Bengio, Y. & Hinton, G. 2015. Deep learning. Nature, 521, (7553), 436-444.
Deep learning allows computational models that are composed of multiple processing layers to learn representations of data with multiple levels of abstraction. These methods have dramatically improved the state-of-the-art in speech recognition, visual object recognition, object detection and many other domains such as drug discovery and genomics. Deep learning discovers intricate structure in large data sets by using the backpropagation algorithm to indicate how a machine should change its internal parameters that are used to compute the representation in each layer from the representation in the previous layer. Deep convolutional nets have brought about breakthroughs in processing images, video, speech and audio, whereas recurrent nets have shone light on sequential data such as text and speech.
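As a rough illustration of how backpropagation indicates the parameter changes described above, the following sketch trains a tiny network with one hidden tanh unit on three data points. It is a toy example, not code from the paper: the network size, the data, the learning rate and all names are assumptions chosen only to keep the chain rule visible.

```python
import math

def forward(p, x):
    """Compute the hidden representation h and the output y for input x."""
    h = math.tanh(p["w1"] * x + p["b1"])
    y = p["w2"] * h + p["b2"]
    return h, y

def loss(p, x, t):
    """Squared error between the prediction and the target t."""
    _, y = forward(p, x)
    return 0.5 * (y - t) ** 2

def gradients(p, x, t):
    """Backpropagation: push the output error back through each layer."""
    h, y = forward(p, x)
    dy = y - t                          # dLoss/dy
    dz = dy * p["w2"] * (1.0 - h * h)   # error at the hidden pre-activation
    return {"w1": dz * x, "b1": dz, "w2": dy * h, "b2": dy}

def train(p, data, lr=0.1, epochs=200):
    """Plain stochastic gradient descent; updates p in place."""
    for _ in range(epochs):
        for x, t in data:
            g = gradients(p, x, t)
            for name in p:
                p[name] -= lr * g[name]
    return p

# Hypothetical toy data and initial parameters.
data = [(-1.0, -0.5), (0.0, 0.0), (1.0, 0.5)]
params = {"w1": 0.5, "b1": 0.0, "w2": 0.5, "b2": 0.0}
initial_loss = sum(loss(params, x, t) for x, t in data)
train(params, data)
final_loss = sum(loss(params, x, t) for x, t in data)
print(initial_loss, final_loss)
```

Real deep networks apply the same rule across many layers and millions of parameters, usually through automatic differentiation rather than hand-written gradients.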
More information: http://www.nature.com/nature/journal/v521/n7553/full/nature14539.html
Nature Issue 7553 contains a special about computational intelligence!
Title: Towards Knowledge Discovery with the human in the machine learning loop: An Ontology-Guided Meta-Classifying Approach for the Biomedical Domain
Lecturer: Dominic GIRADI, RISC-Software Linz, Austria
Abstract: The process of knowledge discovery in clinical research is significantly different from other business domains, for example market research. While in the general definitions of knowledge discovery the domain expert is in a rather consulting, supervising or customer-like role, the complex process of (bio-)medical or clinical knowledge discovery requires the medical domain expert to be deeply involved in this process. At the same time, data integration and data pre-processing are known to be major pitfalls in such (bio-)medical data projects, because in the (bio-)medical domain we are confronted with extremely high complexity and heterogeneity, along with unprecedented amounts of data. This lecture discusses what consequences arise for the knowledge discovery process when the domain expert is moved to a central position in this process and, as a consequence, how advanced machine learning algorithms can be combined with traditional, ontology-centered approaches for the benefit of advancing (bio-)medical research. Examples are given from different medical research projects: clinical benchmarking, cerebral aneurysms and a biometric study of children and young adults.
The theoretical focus of this talk is on how the elaborate structural meta-information of the domain ontology can be used to parametrize and automate advanced machine learning algorithms and data visualization methods. Two examples will be presented: an ontology-guided dimensionality reduction with a focus on hierarchically structured, multi-select categorical variables, and an ontology-guided meta-classifier.
Title: Towards Personalization of Diabetes Therapy Using Computerized Decision Support and Machine Learning
Lecturer: Klaus DONSA and Stephan SPAT
Abstract: Diabetes mellitus (DM) is a growing global disease which highly affects the individual patient and represents a global health burden with financial impact on national health care systems. The therapeutic options include lifestyle changes, such as a change of diet and an increase in physical activity, but also the administration of oral or injectable antidiabetic drugs. Diabetes therapy, especially with insulin, is complex. Therapy decisions draw on various kinds of medical and lifestyle-related information. Computerized decision support systems (CDSS) aim to improve the treatment process in the patient's self-management but also in institutional care. The personalization of the patient's diabetes treatment is therefore possible at different levels and is also facilitated by new therapy aids like food and activity recognition systems, lifestyle support tools and pattern recognition for insulin therapy optimization. In this talk we discuss the role of machine learning in this context. Furthermore, we provide insights into different strategies to personalize diabetes therapy and how CDSS can support the therapy process. During our work we found open problems and challenges for the personalization of diabetes therapy. In a final discussion we will address these open problems with a focus on decision support systems and especially machine learning technology.