NeuroCOLT
Technical Report NC-TR-00-085
Sparse Regression Ensembles in Infinite and Finite Hypothesis Spaces
G. Raetsch, A. Demiriz and K. Bennett
Abstract
We examine methods for constructing regression ensembles based
on a linear program (LP). The ensemble regression function consists
of linear combinations of base hypotheses generated by some boosting-type
base learning algorithm. Unlike in the classification case, as in AdaBoost,
for regression the set of possible hypotheses producible by the base
learning algorithm may be infinite. We explicitly tackle the issue
of how to define and solve ensemble regression when the hypothesis
space is infinite. Our approach is based on a semi-infinite linear
program that has an infinite number of constraints and a finite number
of variables. We show that the regression problem is well posed for
infinite hypothesis spaces in both the primal and dual spaces. Most
importantly, we prove there exists an optimal solution to the infinite
hypothesis space problem consisting of a finite number of hypotheses.
We propose two algorithms for solving the infinite and finite hypothesis
problems. One uses a column generation simplex-type algorithm and the
other adopts an exponential barrier approach. Furthermore, we give
sufficient conditions on the base learning algorithm and the hypothesis
set to be used for infinite regression ensembles. Computational results
show that these methods are extremely promising.