NeuroCOLT

Neural Networks and Computational Learning Theory

 

About NeuroCOLT

Papers Archive

1994 1995
1996 1997
1998 1999
2000 2001

Books

info@neurocolt.org

NeuroCOLT Technical Report NC-TR-96-053

Structural Risk Minimization over Data-Dependent Hierarchies

John Shawe-Taylor
Royal Holloway, University of London
UK

Peter Bartlett
Australian National University
Australia

Robert Williamson
Australian National University
Australia

Martin Anthony
London School of Economics
UK

Abstract
The paper introduces some generalizations of Vapnik's method of structural risk minimisation (SRM). As well as making explicit some of the details on SRM, it provides a result that allows one to trade off errors on the training sample against improved generalization performance. It then considers the more general case when the hierarchy of classes is chosen in response to the data. A result is presented on the generalization performance of classifiers with a ``large margin''. This theoretically explains the impressive generalization performance of the maximal margin hyperplane algorithm of Vapnik and co-workers (which is the basis for their support vector machines). The paper concludes with a more general result in terms of ``luckiness'' functions, which provides a quite general way for exploiting serendipitous simplicity in observed data to obtain better prediction accuracy from small training sets. Four examples are given of such functions, including the VC dimension measured on the sample.

 

Download Compressed Postscript