NeuroCOLT
Technical Report NC-TR-95-052
Computational Machine Learning in Theory and Praxis
Ming Li
University of Waterloo, Canada
Paul Vitanyi
CWI and Universiteit van Amsterdam, The Netherlands
Abstract
In the last few decades a computational approach to machine learning
has emerged based on paradigms from recursion theory and the theory
of computation. Such ideas include learning in the limit, learning
by enumeration, and probably approximately correct (pac) learning.
These models are usually not suitable in practical situations. In
contrast, statistics-based inference methods have enjoyed a long and
distinguished career. Currently, Bayesian reasoning in various forms,
minimum message length (MML), and minimum description length (MDL)
are widely applied approaches. They are the tools to use with particular
machine learning praxis such as simulated annealing, genetic algorithms,
genetic programming, artificial neural networks, and the like. These
statistical inference methods select the hypothesis which minimizes
the sum of the length of the description of the hypothesis (also called
`model') and the length of the description of the data relative to
the hypothesis. It appears to us that the future of computational
machine learning will include combinations of the approaches above
coupled with guarantees with respect to the time and memory resources used.
Computational learning theory will move closer to practice, and the
application of principles such as MDL requires further justification.
Here we survey some of the actors in this dichotomy between theory
and praxis, justify MDL via the Bayesian approach, and compare
pac learning with MDL learning of decision trees.
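
The selection rule described in the abstract can be illustrated with a minimal sketch. It is not taken from the report: the coin-bias hypotheses, the prior, the observed counts, and the function names are all assumptions made for illustration. The description length of a hypothesis is taken to be -log2 of its prior probability, and the description length of the data given the hypothesis is -log2 of its likelihood. Under those assumptions, minimizing the two-part code length picks the same hypothesis as maximizing the Bayesian posterior.

```python
import math

def bernoulli_likelihood(p, heads, tails):
    """Probability of one particular sequence with the given counts under bias p."""
    return p ** heads * (1 - p) ** tails

def two_part_code_length(prior, likelihood):
    """Bits for the hypothesis plus bits for the data given the hypothesis."""
    return -math.log2(prior) - math.log2(likelihood)

def select_mdl(hypotheses, heads, tails):
    """Return the bias whose two-part code length is smallest."""
    lengths = {
        p: two_part_code_length(prior, bernoulli_likelihood(p, heads, tails))
        for p, prior in hypotheses.items()
    }
    return min(lengths, key=lengths.get)

def select_map(hypotheses, heads, tails):
    """Return the bias with the largest unnormalized posterior."""
    posteriors = {
        p: prior * bernoulli_likelihood(p, heads, tails)
        for p, prior in hypotheses.items()
    }
    return max(posteriors, key=posteriors.get)

# Hypothetical toy setup: candidate coin biases mapped to assumed priors.
HYPOTHESES = {0.5: 0.8, 0.7: 0.1, 0.9: 0.1}
HEADS, TAILS = 9, 1  # assumed observations

if __name__ == "__main__":
    print("MDL choice:", select_mdl(HYPOTHESES, HEADS, TAILS))
    print("MAP choice:", select_map(HYPOTHESES, HEADS, TAILS))
```

The two selectors agree here because the negative logarithm is monotone decreasing, so the hypothesis with the largest product of prior and likelihood is also the one with the smallest sum of code lengths. The sketch ignores the cost of encoding real-valued parameters to finite precision, which a full MDL treatment has to account for.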