NeuroCOLT

Neural Networks and Computational Learning Theory

 


NeuroCOLT Technical Report NC-TR-97-032

On Predictive Distributions and Bayesian Networks

Petri Kontkanen, Petri Myllymaki, Tom Silander and Henry Tirri
University of Helsinki
Finland

Peter Grunwald
CWI, The Netherlands

Abstract
In this paper we are interested in discrete prediction problems in a decision-theoretic setting, where the task is to compute the predictive distribution for a finite set of possible alternatives. This question is first addressed in a general framework, where we consider a set of probability distributions defined by some parametric model class. The standard Bayesian approach is to compute the posterior probability for the model parameters, given a prior distribution and sample data, and fix the parameters to the instantiation with the maximum a posteriori probability. A more accurate predictive distribution can be obtained by computing the evidence, i.e., the integral over all the individual parameter instantiations. As an alternative to these two approaches, we demonstrate how to use Rissanen's new definition of stochastic complexity for determining predictive distributions. We then describe how these predictive inference methods can be realized in the case of Bayesian networks. In particular, we demonstrate the use of Jeffreys' prior as the prior distribution for computing the evidence predictive distribution. It can be shown that the evidence predictive distribution with Jeffreys' prior approaches the new stochastic complexity predictive distribution in the limit as the amount of sample data increases. For computational reasons, in the experimental part of the paper the three predictive distributions are compared using the simple tree-structured Naive Bayes model. The experiments with several public domain classification datasets suggest that the evidence approach produces the most accurate predictions in the log-score sense, especially with small training sets.
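The difference between the maximum a posteriori and the evidence predictive distributions can be illustrated in a much simpler setting than the one studied in the report. The following is a minimal sketch, assuming a single discrete variable with a Dirichlet prior rather than a full Bayesian network. The function names, the toy counts and the choice of priors are illustrative assumptions, not taken from the report, and the stochastic complexity predictive distribution is not covered. For a single multinomial variable, Jeffreys' prior is the Dirichlet distribution with all hyperparameters equal to 1/2.

```python
import math
from fractions import Fraction


def map_predictive(counts, alpha):
    """Plug-in predictive: fix the parameters to the Dirichlet posterior mode.

    The mode (n_k + a_k - 1) / (N + sum(a) - K) is only valid when every
    numerator is non-negative and the denominator is positive.
    """
    k = len(counts)
    numer = [Fraction(n) + a - 1 for n, a in zip(counts, alpha)]
    total = sum(counts) + sum(alpha) - k
    if total <= 0 or any(x < 0 for x in numer):
        raise ValueError("posterior mode is not defined for these inputs")
    return [x / total for x in numer]


def evidence_predictive(counts, alpha):
    """Evidence predictive: integrate over all parameter values.

    For a Dirichlet prior the integral has the closed form
    (n_k + a_k) / (N + sum(a)).
    """
    total = sum(counts) + sum(alpha)
    return [(Fraction(n) + a) / total for n, a in zip(counts, alpha)]


def log_loss(prob):
    """Negative log-probability of the outcome that actually occurred."""
    return math.inf if prob == 0 else -math.log(float(prob))


# Hypothetical toy sample: three possible values, the second one never observed.
counts = [3, 0, 1]
uniform = [Fraction(1)] * 3       # uniform prior, so the MAP equals maximum likelihood
jeffreys = [Fraction(1, 2)] * 3   # Jeffreys' prior for one multinomial variable

p_map = map_predictive(counts, uniform)
p_evidence = evidence_predictive(counts, jeffreys)

print("MAP predictive:     ", [str(p) for p in p_map])
print("Evidence predictive:", [str(p) for p in p_evidence])
print("Log loss if the unseen value occurs next:",
      log_loss(p_map[1]), "vs", round(log_loss(p_evidence[1]), 3))
```

In this toy case the plug-in estimate assigns probability zero to the value that was never observed and so incurs an unbounded log loss if that value occurs next, while the evidence predictive distribution keeps some probability mass on it. This is only meant to make the two definitions concrete; it does not reproduce the report's experiments.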
