Microsoft Research, Cambridge, U.K.

Published as: “Bayesian inference: An introduction to principles and practice in machine learning.” In O. Bousquet, U. von Luxburg, and G. Rätsch (Eds.), Advanced Lectures on Machine Learning, pp. 41–62. Springer.
Year of publication: 2004
This version typeset: June 26, 2006
Available from: http://www.miketipping.com/papers.htm
Correspondence: mail@miketipping.com

Abstract

This article gives a basic introduction to the principles of Bayesian inference in a machine learning context, with an emphasis on the importance of marginalisation for dealing with uncertainty. We begin by illustrating concepts via a simple regression task before relating ideas to practical, contemporary techniques with a description of ‘sparse Bayesian’ models and the ‘relevance vector machine’.

1 Introduction

What is meant by “Bayesian inference” in the context of machine learning? To assist in answering that question, let’s start by proposing a conceptual task: we wish to learn, from some given number of example instances of them, a model of the relationship between pairs of variables A and B. Indeed, many machine learning problems are of the type “given A, what is B?”.¹ Verbalising what we typically treat as a mathematical task raises an interesting question in itself. How do we answer “what is B?”? Within the appealingly well-defined and axiomatic framework of propositional logic, we ‘answer’ the question with complete certainty, but this logic is clearly too rigid to cope with the realities of real-world modelling, where uncertainty over ‘truth’ is ubiquitous. Our measurements of both the dependent (B) and independent (A) variables are inherently noisy and inexact, and the relationships between the two are invariably non-deterministic. This is where...

(1)