NeuroCOLT Technical Report NC-TR-97-032
On Predictive Distributions and Bayesian Networks
Petri Kontkanen, Petri Myllymaki, Tom Silander and Henry Tirri
University of Helsinki, Finland
Peter Grunwald
CWI, The Netherlands
Abstract
In this paper we are interested in discrete prediction problems in
a decision-theoretic setting, where the task is to compute the predictive
distribution for a finite set of possible alternatives. This question
is first addressed in a general framework, where we consider a set
of probability distributions defined by some parametric model class.
The standard Bayesian approach is to compute the posterior probability
for the model parameters, given a prior distribution and sample data,
and fix the parameters to the instantiation with the maximum
a posteriori probability. A more accurate predictive distribution
can be obtained by computing the evidence, i.e., the integral
over all the individual parameter instantiations. As an alternative
to these two approaches, we demonstrate how to use Rissanen's new
definition of stochastic complexity for determining predictive
distributions. We then describe how these predictive inference methods
can be realized in the case of Bayesian networks. In particular, we
demonstrate the use of Jeffreys' prior as the prior distribution for
computing the evidence predictive distribution. It can be shown that
the evidence predictive distribution with Jeffreys' prior approaches
the new stochastic complexity predictive distribution in the limit
as the amount of sample data increases. For computational reasons, the
experimental part of the paper compares the three predictive distributions
using the simple tree-structured Naive Bayes model.
Experiments with several public-domain classification datasets
suggest that the evidence approach produces the most accurate predictions
in the log-score sense, especially with small training sets.
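
The difference between the MAP and evidence predictive distributions can be illustrated with a minimal sketch for a single multinomial variable with a Dirichlet prior. This is not the report's Bayesian network implementation; the function names, the toy counts, and the choice of a uniform prior for the MAP case are assumptions made for illustration only. For a multinomial model, Jeffreys' prior is the Dirichlet distribution with all hyperparameters equal to 1/2.

```python
import math

def map_predictive(counts, alpha):
    """Plug-in predictive at the posterior mode of a Dirichlet posterior.

    Assumes n_k + alpha_k >= 1 for every outcome k, so the mode formula applies.
    """
    numer = [n + a - 1.0 for n, a in zip(counts, alpha)]
    total = sum(numer)
    if any(v < 0 for v in numer) or total <= 0:
        raise ValueError("mode formula needs n_k + alpha_k >= 1 and some data")
    return [v / total for v in numer]

def evidence_predictive(counts, alpha):
    """Predictive obtained by integrating over all parameter values.

    For a Dirichlet prior this is the posterior mean of each probability.
    """
    total = sum(counts) + sum(alpha)
    return [(n + a) / total for n, a in zip(counts, alpha)]

def log_score(predictive, outcome):
    """Negative log probability assigned to the observed outcome (lower is better)."""
    p = predictive[outcome]
    return math.inf if p == 0 else -math.log(p)

# Hypothetical toy sample: three outcomes, the third never observed.
counts = [3, 1, 0]
p_map = map_predictive(counts, [1.0, 1.0, 1.0])       # uniform prior (assumption)
p_ev = evidence_predictive(counts, [0.5, 0.5, 0.5])   # Jeffreys' prior

print(p_map)
print(p_ev)
print(log_score(p_map, 2), log_score(p_ev, 2))
```

In this toy setting the MAP plug-in assigns zero probability to the unseen outcome, so its log-score is unbounded if that outcome occurs, while the evidence predictive keeps every outcome possible. This is one way to see why the two approaches can differ most when training sets are small.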