NeuroCOLT

Neural Networks and Computational Learning Theory

 

About NeuroCOLT

Papers Archive

1994 1995
1996 1997
1998 1999
2000 2001

Books

info@neurocolt.org

NeuroCOLT Technical Report NC-TR-96-034

Theory and Applications of Agnostic PAC-Learning with Small Decision Trees

Peter Auer
University of California Santa Cruz
USA

Robert C. Holte
University of Ottawa
Canada

Wolfgang Maass
Technische Universitaet Graz
Austria

Abstract
We exhibit a theoretically founded algorithm $\Tii$ for agnostic PAC-learning of decision trees of at most 2 levels, whose computation time is almost linear in the size of the training set. We evaluate the performance of this learning algorithm $\Tii$ on 15 common ``real-world'' datasets, and show that for most of these datasets $\Tii$ provides simple decision trees with little or no loss in predictive power (compared with C4.5). In fact, for datasets with continuous attributes its error rate tends to be lower than that of C4.5. To the best of our knowledge this is the first time that a PAC-learning algorithm is shown to be applicable to ``real-world'' classification problems.  Since one can prove   that $\Tii$ is an agnostic PAC-learning algorithm, $\Tii$ is guaranteed   to produce close to optimal 2-level decision trees from sufficiently large training sets forany (!) distribution of data. In this regard $\Tii$ differs strongly from all other learning algorithms that are considered in applied machine learning, for which no guarantee  can be given about their performance on new   datasets.  We also demonstrate that this algorithm $\Tii$ can be used as a diagnostic tool for the investigation of the expressive limits of 2-level decision trees. Finally, T2, in combination with new bounds on the VC-dimension of decision trees of bounded depth that we derive, provides us now for the first time with the tools necessary for comparing learning curves of decision trees for ``real-world'' datasets with the theoretical estimates of PAC-learning theory.

 

Download Compressed Postscript