|
About
NeuroCOLT
Papers
Archive
Books
info@neurocolt.org
|
NeuroCOLT
Technical Report NC-TR-95-023
PAC
Learning and Artificial Neural Networks
Martin
Anthony and Norman Biggs
London School of Economics and Political Science
University of London
Abstract
In this article, we discuss the `probably approximately correct'
(PAC) learning paradigm as it applies to artificial neural networks.
The PAC learning model is a probabilistic framework for the study
of learning and generalization. It is useful not only for neural classification
problems, but also for learning problems more often associated with
mainstream artificial intelligence, such as the inference of Boolean
functions. In PAC theory, the notion of succesful learning is formally
defined using probability theory. Very roughly speaking, if a large
enough sample of randomly drawn training examples is presented, then
it should be likely that, after learning, the neural network will
classify most other randomly drawn examples correctly. The PAC model
formalises the terms `likely' and `most'. Furthermore, the learning
algorithm must be expected to act quickly, since otherwise it may
be of little use in practice.
There are thus two main emphases in PAC learning theory. First,
there is the issue of how many training examples should be presented.
Secondly, there is the question of whether learning can be achieved
using a fast algorithm.
These are known, respectively, as the {\it sample complexity}
and {\it computational complexity} problems. This article provides
a brief introduction to these. We highlight the importance of the
Vapnik-Chervonenkis dimension, a combinatorial parameter which measures
the `expressive power' of a neural network, and describe how this
parameter quantifies fairly precisely the sample complexity of PAC
learning. In discussing the computational complexity of PAC learning,
we shall present a result which illustrates that in some cases the
problem of PAC learning is inherently intractable.
Download Compressed
Postscript
|