NeuroCOLT
Technical Report NC-TR-97-006
On the Well-Behavedness of Important Attribute Evaluation Functions
Tapio Elomaa
University of Helsinki
Finland
Juho Rousu
VTT Biotechnology and Food Research
Finland
Abstract
The class of well-behaved evaluation functions makes the handling
of numerical attributes simpler and more efficient; for them
it suffices to concentrate on the boundary points in searching for the
optimal partition. This always holds for binary partitions, and it also
holds for multisplits provided that the function is cumulative in addition
to being well-behaved. The class of well-behaved evaluation functions
is a proper superclass of convex evaluation functions. Thus, it is
clear that a large proportion of the most important attribute evaluation
functions are well-behaved. This paper explores the extent and boundaries
of well-behaved functions. In particular, we examine the convexity
and well-behavedness of C4.5's default attribute evaluation function
gain ratio, which is known to have problems with numerical
attributes. Our empirical experiments show that a very simple cumulative
rectification to the poor bias of information gain significantly
outperforms gain ratio.
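
The boundary-point idea in the abstract can be illustrated with a minimal sketch. It is not taken from the report: it assumes information gain as the evaluation function, assumes that a cut between two adjacent distinct attribute values is a boundary point unless both values carry examples of one and the same single class, and uses hypothetical helper names (entropy, candidate_cuts, info_gain, best_binary_split) and invented toy data.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Class entropy (in bits) of a non-empty list of labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def candidate_cuts(values, labels, boundary_only=True):
    """Midpoints between adjacent distinct values.

    Assumed definition: a cut is NOT a boundary point only when the
    examples on both neighbouring values all belong to one same class.
    """
    classes_at = {}
    for v, y in zip(values, labels):
        classes_at.setdefault(v, set()).add(y)
    keys = sorted(classes_at)
    cuts = []
    for lo, hi in zip(keys, keys[1:]):
        same_pure = len(classes_at[lo]) == 1 and classes_at[lo] == classes_at[hi]
        if not (boundary_only and same_pure):
            cuts.append((lo + hi) / 2)
    return cuts


def info_gain(values, labels, cut):
    """Information gain of the binary partition value <= cut / value > cut."""
    left = [y for v, y in zip(values, labels) if v <= cut]
    right = [y for v, y in zip(values, labels) if v > cut]
    n = len(labels)
    return (entropy(labels)
            - len(left) / n * entropy(left)
            - len(right) / n * entropy(right))


def best_binary_split(values, labels, boundary_only=True):
    """Cut point with the highest information gain among the candidates."""
    cuts = candidate_cuts(values, labels, boundary_only)
    return max(cuts, key=lambda c: info_gain(values, labels, c))


# Toy data (invented for illustration only).
values = [1, 2, 3, 4, 5, 6, 7, 8]
labels = ["a", "a", "a", "b", "b", "a", "b", "b"]

boundary = candidate_cuts(values, labels, boundary_only=True)
all_cuts = candidate_cuts(values, labels, boundary_only=False)
best_boundary = best_binary_split(values, labels, boundary_only=True)
best_exhaustive = best_binary_split(values, labels, boundary_only=False)
```

On this toy data only three of the seven possible cut points are boundary points, and the search restricted to them selects the same cut as the exhaustive search. That agreement on one small data set is only an illustration of the property the abstract describes, not evidence for it.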