NeuroCOLT
Technical Report NC-TR-95-045
Learning Internal Representations
(Short Version)
Jonathan Baxter
Royal Holloway, University of London
Abstract
Probably the most important problem in machine learning is the preliminary
biasing of a learner's hypothesis space so that it is small enough
to ensure good generalisation from reasonable training sets, yet large
enough that it contains a good solution to the problem being learnt.
In this paper a mechanism for automatically learning or biasing
the learner's hypothesis space is introduced. It works by first learning
an appropriate internal representation for a learning environment
and then using that representation to bias the learner's hypothesis
space for the learning of future tasks drawn from the same environment.
An internal representation must be learnt by sampling from many
similar tasks, not just a single task as occurs in ordinary machine
learning. It is proved that the number of examples $m$ per task
required to ensure good generalisation from a representation
learner obeys $m = O(a+b/n)$ where $n$ is the number of tasks being
learnt and $a$ and $b$ are constants. If the tasks are learnt independently
(i.e. without a common representation) then $m=O(a+b)$.
It is argued that for learning environments such as speech and character
recognition $b\gg a$ and hence representation learning in these environments
can potentially yield a drastic reduction in the number of examples
required per task. It is also proved that if $n = O(b)$ (with $m=O(a+b/n)$)
then the representation learnt will be good for learning novel tasks
from the same environment, and that the number of examples required
to generalise well on a novel task will be reduced to $O(a)$ (as opposed
to $O(a+b)$ if no representation is used). It is shown that
gradient descent can be used to train neural network representations,
and the results of an experiment are reported in which a neural network
representation was learnt for an environment consisting of translationally
invariant Boolean functions.
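
To make the scaling in the bound $m = O(a+b/n)$ concrete, the following is a minimal Python sketch that evaluates the expression $a + b/n$ for a few values of $n$. The function name `examples_per_task` and the constants `a = 10`, `b = 1000` are assumptions chosen purely for illustration (with $b \gg a$); they do not come from the report, and since the bound is stated only up to constant factors the printed values indicate a trend rather than actual sample sizes.

```python
def examples_per_task(a, b, n):
    """Evaluate a + b/n, the expression inside the O(.) bound on examples per task.

    a, b: the constants from the bound (hypothetical values are used below).
    n:    the number of tasks learnt with a common representation.
    Constant factors hidden by the O-notation are ignored.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    return a + b / n


# Hypothetical constants with b >> a, for illustration only.
a, b = 10, 1000

for n in (1, 10, 100, 1000):
    # With n = 1 the expression equals a + b, the independent-learning case.
    print(f"n = {n:4d}: a + b/n = {examples_per_task(a, b, n):.1f}")
```

Under these assumed constants the expression falls from $a+b$ at $n=1$ towards $a$ as $n$ grows, which is the reduction in examples per task that the abstract describes.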