NeuroCOLT
Technical Report NC-TR-96-053
Structural Risk Minimization over Data-Dependent Hierarchies
John Shawe-Taylor, Royal Holloway, University of London, UK
Peter Bartlett, Australian National University, Australia
Robert Williamson, Australian National University, Australia
Martin Anthony, London School of Economics, UK
Abstract
The paper introduces some generalizations of Vapnik's method
of structural risk minimization (SRM). As well as making explicit
some of the details of SRM, it provides a result that allows one to
trade off errors on the training sample against improved generalization
performance. It then considers the more general case when the hierarchy
of classes is chosen in response to the data. A result is presented
on the generalization performance of classifiers with a "large margin".
This theoretically explains the impressive generalization performance
of the maximal margin hyperplane algorithm of Vapnik and co-workers
(which is the basis for their support vector machines). The paper
concludes with a more general result in terms of "luckiness" functions,
which provides a general way of exploiting serendipitous simplicity
in observed data to obtain better prediction accuracy from small training
sets. Four examples of such functions are given, including the VC
dimension measured on the sample.
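
As a rough illustration of the "large margin" quantity mentioned in the abstract, the sketch below computes the geometric margin of a fixed hyperplane on a labelled sample, that is, the smallest signed distance from any sample point to the hyperplane. The function name, the toy data, and the choice of hyperplane are assumptions made for this example only; the report's generalization bound is not reproduced here.

```python
import math

def geometric_margin(w, b, xs, ys):
    """Smallest signed distance of any labelled point to the hyperplane w.x + b = 0.

    A positive value means every point is classified correctly; a larger
    value means the sample is separated with more room to spare.
    (Hypothetical helper for illustration, not taken from the report.)
    """
    norm = math.sqrt(sum(wi * wi for wi in w))
    if norm == 0:
        raise ValueError("w must be nonzero")
    return min(
        y * (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm
        for x, y in zip(xs, ys)
    )

# Toy sample (assumed data): labels are +1 or -1.
xs = [(2.0, 0.0), (3.0, 1.0), (-2.0, 0.0), (-4.0, -1.0)]
ys = [1, 1, -1, -1]

# Margin of the hyperplane x1 = 0 on this sample.
margin = geometric_margin((1.0, 0.0), 0.0, xs, ys)
print(margin)
```

A maximal margin algorithm would search over `w` and `b` for the hyperplane that makes this value as large as possible; the sketch only evaluates one candidate.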