NeuroCOLT
Technical Report NC-TR-00-086
2000-086
Combining Discriminant Models with
New Multiclass SVMs
Yann Guermeur
LORIA, Campus Scientifique, Yann.Guermeur@loria.fr
The idea of combining models instead of simply selecting
the "best" one, in order to improve performance, is well known in
statistics and has a long theoretical background. However, making
full use of theoretical results is ordinarily subject to the satisfaction
of strong hypotheses (weak correlation among the errors, availability
of large training sets, the ability to rerun the training procedure
an arbitrary number of times, etc.). In contrast, the practitioner
who has to make a decision is frequently faced with the difficult
problem of combining a given set of pretrained classifiers, with highly
correlated errors, using only a small training sample. Overfitting
is then the main risk, and it can be overcome only through strict
complexity control of the selected combiner. This suggests that SVMs,
which implement the structural risk minimization (SRM) inductive principle, should be well suited
for these difficult situations. Investigating this idea, we introduce
a new family of multi-class SVMs and assess them as ensemble methods
on a real-world problem. This task, protein secondary structure prediction,
is an open problem in biocomputing for which model combination appears
to be an issue of central importance. Experimental evidence highlights
the gain in quality resulting from combining some of the most widely
used prediction methods with our SVMs rather than with the ensemble
methods traditionally used in the field. The gain is increased when
the outputs of the combiners are post-processed with a simple DP algorithm.
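
The abstract does not describe the DP post-processing step. As a purely illustrative sketch, assume that "DP" stands for dynamic programming and that the step smooths the per-residue class scores produced by a combiner. The function `smooth_labels`, the parameter `switch_penalty`, and the three-class encoding (0 = helix, 1 = strand, 2 = coil) are hypothetical names chosen for this example; they are not taken from the report. The sketch picks the label sequence that maximizes the sum of per-position scores minus a fixed penalty for every change of label between neighbouring positions.

```python
def smooth_labels(scores, switch_penalty=1.0):
    """Viterbi-style smoothing (illustrative assumption, not the report's method).

    scores: list of per-position score lists, one score per class.
    Returns the label sequence maximizing the summed scores minus
    switch_penalty for each label change between adjacent positions.
    """
    if not scores:
        return []
    n_classes = len(scores[0])
    best = list(scores[0])  # best[k]: best total for a path ending in class k
    back = []               # back[t][k]: predecessor of class k at position t + 1
    for row in scores[1:]:
        new_best, pointers = [], []
        for k in range(n_classes):
            # Staying in class k is free; arriving from another class is penalized.
            def arrival(j, k=k):
                return best[j] - (switch_penalty if j != k else 0.0)
            prev = max(range(n_classes), key=arrival)
            new_best.append(arrival(prev) + row[k])
            pointers.append(prev)
        best = new_best
        back.append(pointers)
    # Trace the optimal path back from the best final class.
    k = max(range(n_classes), key=lambda j: best[j])
    path = [k]
    for pointers in reversed(back):
        k = pointers[k]
        path.append(k)
    path.reverse()
    return path


# Hypothetical combiner outputs for five residues (columns: helix, strand, coil).
example_scores = [
    [0.8, 0.1, 0.1],
    [0.7, 0.2, 0.1],
    [0.3, 0.4, 0.3],
    [0.6, 0.2, 0.2],
    [0.7, 0.1, 0.2],
]
raw = smooth_labels(example_scores, switch_penalty=0.0)
smoothed = smooth_labels(example_scores, switch_penalty=0.5)
```

With a zero penalty the result reduces to the position-wise argmax, so the isolated strand label at the third residue survives. With a penalty of 0.5 the single-residue strand is absorbed into the surrounding helix. Post-processing of this kind is often motivated by the observation that secondary structure elements span several consecutive residues.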