Stan: A probabilistic programming language

A paper submitted long ago by the Stan developers
has finally appeared in the Journal of Statistical Software:

Carpenter, B., Gelman, A., Hoffman, M., Lee, D., Goodrich, B., Betancourt, M., Brubaker, M. A., Guo, J., Li, P. & Riddell, A. 2017. Stan: A probabilistic programming language. Journal of Statistical Software, 76, (1), 1-32, doi:10.18637/jss.v076.i01

The Python package can also be downloaded from the site!

Stan is a probabilistic programming language for specifying statistical models. A Stan program imperatively defines a log probability function over parameters conditioned on specified data and constants. Stan provides full Bayesian inference for continuous-variable models through Markov chain Monte Carlo methods such as the No-U-Turn sampler, an adaptive form of Hamiltonian Monte Carlo sampling. Penalized maximum likelihood estimates are calculated using optimization methods such as the limited memory Broyden-Fletcher-Goldfarb-Shanno algorithm. Stan is also a platform for computing log densities and their gradients and Hessians, which can be used in alternative algorithms such as variational Bayes, expectation propagation, and marginal inference using approximate integration. To this end, Stan is set up so that the densities, gradients, and Hessians, along with intermediate quantities of the algorithm such as acceptance probabilities, are easily accessible. Stan can be called from the command line using the cmdstan package, through R using the rstan package, and through Python using the pystan package. All three interfaces support sampling and optimization-based inference with diagnostics and posterior analysis. rstan and pystan also provide access to log probabilities, gradients, Hessians, parameter transforms, and specialized plotting.
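To make the idea of "a log probability function over parameters conditioned on data" more concrete, here is a minimal sketch in plain Python. It does not use Stan or pystan; the model (a normal likelihood with known sigma and a wide normal prior on the mean), the function names, and the step size are all assumptions chosen for illustration. It shows a log density, its gradient, and a simple gradient-ascent search for the posterior mode, which plays the role of a penalized maximum likelihood estimate.

```python
def log_prob(mu, data, sigma=1.0, prior_scale=10.0):
    """Unnormalized log posterior of mu.

    Assumed toy model: mu ~ Normal(0, prior_scale), y ~ Normal(mu, sigma).
    """
    lp = -0.5 * (mu / prior_scale) ** 2          # log prior (up to a constant)
    for y in data:
        lp += -0.5 * ((y - mu) / sigma) ** 2     # log likelihood (up to a constant)
    return lp


def grad_log_prob(mu, data, sigma=1.0, prior_scale=10.0):
    """Derivative of log_prob with respect to mu."""
    g = -mu / prior_scale ** 2
    for y in data:
        g += (y - mu) / sigma ** 2
    return g


def posterior_mode(data, sigma=1.0, prior_scale=10.0, step=0.05, iters=500):
    """Plain gradient ascent on the log posterior (hypothetical helper)."""
    mu = 0.0
    for _ in range(iters):
        mu += step * grad_log_prob(mu, data, sigma, prior_scale)
    return mu


data = [1.0, 2.0, 3.0]
mode = posterior_mode(data)
```

Stan itself obtains gradients by automatic differentiation and uses far more capable samplers and optimizers; the hand-written gradient and fixed step size here only stand in for that machinery.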

Congratulations from the Holzinger Group to the authors!

machine learning for health informatics

LNAI 9605 Machine Learning for Health Informatics available

NEW – just appeared – NEW

Holzinger, A. (ed.) 2016. Machine Learning for Health Informatics: State-of-the-Art and Future Challenges. Cham: Springer International Publishing, doi:10.1007/978-3-319-50478-0

[book homepage]

Machine learning (ML) is the fastest growing field in computer science, and Health Informatics (HI) is amongst the greatest application challenges, providing future benefits in improved medical diagnoses, disease analyses, and pharmaceutical development. However, successful ML for HI needs a concerted effort, fostering integrative research between experts from diverse disciplines, ranging from data science to visualization.

Tackling complex challenges needs both disciplinary excellence and cross-disciplinary networking without any boundaries. Following the HCI-KDD approach of combining the best of two worlds, the aim is to support human intelligence with machine intelligence.

This state-of-the-art survey is an output of the international HCI-KDD expert network and features 22 carefully selected and peer-reviewed chapters on hot topics in machine learning for health informatics; they discuss open problems and future challenges in order to stimulate further research and international progress in this field.

Neural Information Processing Systems

Holzinger Group at NIPS

Our crazy iML-Concept has been accepted at the CiML 2016 workshop (organized by Isabelle Guyon, Evelyne Viegas, Sergio Escalera, Ben Hammer & Balazs Kegl) at NIPS 2016 (December 5-10, 2016) in Barcelona:


Interactive machine learning for health informatics: when do we need the human-in-the-loop?

Machine learning (ML) is the fastest growing field in computer science, and health informatics is among the greatest challenges. The goal of ML is to develop algorithms which can learn and improve over time and can be used for predictions. Most ML researchers concentrate on automatic machine learning (aML), where great advances have been made, for example, in speech recognition, recommender systems, or autonomous vehicles. Automatic approaches greatly benefit from big data with many training sets. However, in the health domain, sometimes we are confronted with a small number of data sets or rare events, where aML-approaches suffer from insufficient training samples. Here interactive machine learning (iML) may be of help, having its roots in reinforcement learning, preference learning, and active learning. The term iML is not yet well established, so we define it as “algorithms that can interact with agents and can optimize their learning behavior through these interactions, where the agents can also be human.” This “human-in-the-loop” can be beneficial in solving computationally hard problems, e.g., subspace clustering, protein folding, or k-anonymization of health data, where human expertise can help to reduce an exponential search space through heuristic selection of samples. Therefore, what would otherwise be an NP-hard problem reduces greatly in complexity through the input and the assistance of a human agent involved in the learning phase.

We define iML-approaches as algorithms that can interact with both computational agents and human agents *) and can optimize their learning behavior through these interactions.

*) In active learning such agents are referred to as the so-called “oracles”
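As a rough illustration of an algorithm that "interacts with an agent" during learning, the following toy sketch shows an active learning loop in Python. Everything in it is an assumption made for the example, not the method from the article: the learner looks for a decision threshold in a sorted 1-D pool, and the `oracle` callback (which could be a human expert or another program) labels whichever point the learner is currently most uncertain about. The task itself is trivially easy; the point is only to show how each answer from the oracle shrinks the remaining search space.

```python
def active_learn_threshold(pool, oracle):
    """Learn a 1-D decision threshold by querying an oracle for labels.

    pool:   sorted list of candidate points.
    oracle: callable returning True if a point is positive. Assumed to be
            monotone, i.e. every point at or above some unknown threshold
            is positive.
    Returns (index of the first positive point, number of queries asked).
    """
    lo, hi = 0, len(pool)          # first positive index lies in [lo, hi]
    queries = 0
    while lo < hi:
        mid = (lo + hi) // 2       # most uncertain point given labels so far
        queries += 1
        if oracle(pool[mid]):      # ask the agent for a label
            hi = mid               # first positive is at mid or earlier
        else:
            lo = mid + 1           # first positive is after mid
    return lo, queries


pool = list(range(100))
expert = lambda x: x >= 37         # stand-in for a human expert's judgment
index, asked = active_learn_threshold(pool, expert)
```

Labeling every point in the pool would take 100 questions; here the learner chooses its own questions and needs only a handful. Real iML settings are much harder, and the oracle's answers may be noisy or subjective, which this sketch ignores.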

From black-box to glass-box: where is the human-in-the-loop?

The first question we have to answer is: “What is the difference between the iML-approach and the aML-approach, i.e., unsupervised, supervised, or semi-supervised learning?”

Scenario D – see slide below – shows the iML-approach, where the human expert is seen as an agent directly involved in the actual learning phase, step-by-step influencing measures such as distance, cost functions, etc.

Obvious concerns emerge immediately, and one can ask: what about the robustness of this approach, its subjectivity, or the transfer of the (human) agents? Many questions remain open and are the subject of future research, particularly regarding evaluation, replicability, and robustness.

Human-in-the-loop - Interactive Machine Learning

The iML-approach
