NeuroCOLT

Neural Networks and Computational Learning Theory

 

About NeuroCOLT

Papers Archive

1994 1995
1996 1997
1998 1999
2000 2001

Books

info@neurocolt.org

NeuroCOLT Technical Report NC-TR-96-036

Tight worst-case loss bounds for predicting with expert advice

David Haussler
University of California Santa Cruz
USA

Jyrki Kivinen
University of Helsinki
Finland

Manfred K. Warmuth
University of California Santa Cruz
USA

Abstract
We consider on-line algorithms for predicting binary or continuous-valued outcomes, when the algorithm has available the predictions made by $N$ experts. For a sequence of trials, we compute total losses for both the algorithm and the experts under a loss function. At the end of the trial sequence, we compare the total loss of the algorithm to the total loss of the best expert, \ie, the expert with the least loss on the particular trial sequence. For a large class of loss functions, with binary outcomes the total loss of the algorithm proposed by Vovk exceeds the total loss of the best expert at most by the amount $c\ln N$, where $c$ is a constant determined by the loss function. This upper bound does not depend on any assumptions on how the experts' predictions or the outcomes are generated, and the trial sequence can be arbitrarily long. We give a straightforward method for finding the correct value $c$ and show by a lower bound that for this value of $c$, the upper bound is asymptotically tight. The lower bound is based on a probabilistic adversary argument. The class of loss functions for which the $c\ln N$ upper bound holds includes the square loss, the logarithmic loss, and the Hellinger loss. We also consider another class of loss functions, including the absolute loss, for which we have an $\Omega\left(\sqrt{\ell\log N}\right)$ lower bound, where $\ell$ is the number of trials. We show that for the square and logarithmic loss functions, Vovk's algorithm achieves the same worst-case upper bounds with continuous-valued outcomes as with binary outcomes. For the absolute loss, we show how bounds earlier achieved for binary outcomes can be achieved with continuous-valued outcomes using a slightly more complicated algorithm.

Download Compressed Postscript