11/01/1994 09:00 AM
Computer Science
We consider on-line algorithms for predicting binary or continuous-valued outcomes, when the algorithm has available the predictions made by N experts. For a sequence of trials, we compute total losses for both the algorithm and the experts under a loss function. At the end of the trial sequence, we compare the total loss of the algorithm to the total loss of the best expert, i.e., the expert with the least loss on the particular trial sequence. We show that for a large class of loss functions, with binary outcomes the total loss of the algorithm proposed by Vovk exceeds the total loss of the best expert at most by the amount c ln N, where c is a constant determined by the loss function. This upper bound does not depend on any assumptions on how the experts\' predictions or the outcomes are generated, and the trial sequence can be arbitrarily long. We give a straightforward method for finding the correct value c and show by a lower bound that for this value of c, the upper bound is asymptotically tight. The lower bound is based on a probabilistic adversary argument. The class of loss functions for which the c ln N upper bound holds includes the square loss, the logarithmic loss, and the Hellinger loss. We also consider another class of loss functions, including the absolute loss, for which we have an Omega((l log N)^(1/2)) lower bound, where l is the number of trials. We show that for the square and logarithmic loss functions, Vovk\'s algorithm achieves the same worst-case upper bounds with continuous-valued outcomes as with binary outcomes. For the absolute loss, we show how bounds earlier achieved for binary outcomes can be achieved with continuous-valued outcomes using a slightly more complicated algorithm.