TY - JOUR
T1 - Measure transformer semantics for Bayesian machine learning
AU - Borgström, Johannes
AU - Gordon, Andrew D.
AU - Greenberg, Michael
AU - Margetson, James
AU - Van Gael, Jurgen
PY - 2013/9/9
Y1 - 2013/9/9
N2 - The Bayesian approach to machine learning amounts to computing posterior distributions of random variables from a probabilistic model of how the variables are related (that is, a prior distribution) and a set of observations of variables. There is a trend in machine learning towards expressing Bayesian models as probabilistic programs. As a foundation for this kind of programming, we propose a core functional calculus with primitives for sampling prior distributions and observing variables. We define measure-transformer combinators inspired by theorems in measure theory, and use these to give a rigorous semantics to our core calculus. The original features of our semantics include its support for discrete, continuous, and hybrid measures, and, in particular, for observations of zero-probability events. We compile our core language to a small imperative language that is processed by an existing inference engine for factor graphs, which are data structures that enable many efficient inference algorithms. This allows efficient approximate inference of posterior marginal distributions, treating thousands of observations per second for large instances of realistic models.
KW - Denotational semantics
KW - Model-based machine learning
KW - Probabilistic programming
KW - Programming languages
UR - http://www.scopus.com/inward/record.url?scp=84884872353&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84884872353&partnerID=8YFLogxK
U2 - 10.2168/LMCS-9(3:11)2013
DO - 10.2168/LMCS-9(3:11)2013
M3 - Article
AN - SCOPUS:84884872353
VL - 9
JO - Logical Methods in Computer Science
JF - Logical Methods in Computer Science
IS - 3
M1 - 11
ER -