UCSC-CRL-91-26: TRACKING DRIFTING CONCEPTS BY MINIMIZING DISAGREEMENTS

08/01/1991 09:00 AM
Computer Science
In this paper we consider the problem of tracking a subset of a domain (called the *target*) which changes gradually over time. A single (unknown) probability distribution over the domain is used to generate random examples for the learning algorithm and to measure the speed at which the target changes. We show that if the problem of minimizing the number of disagreements with a sample from among concepts in a class $H$ can be approximated to within a factor $k$, then there is a simple tracking algorithm for $H$ which can achieve a probability $\epsilon$ of making a mistake if the target movement rate is at most a constant times $\epsilon^{2}/(k(d+k)\ln(1/\epsilon))$, where $d$ is the Vapnik-Chervonenkis dimension of $H$. Also, we show that if $H$ is properly PAC-learnable, then there is an efficient (randomized) algorithm that with high probability approximately minimizes disagreements to within a factor of $7d+1$, yielding an efficient tracking algorithm for $H$ which tolerates drift rates up to a constant times $\epsilon^{2}/(d^{2}\ln(1/\epsilon))$. In addition, we prove complementary results for the classes of halfspaces and axis-aligned hyperrectangles showing that the maximum rate of drift that any algorithm (even with unlimited computational power) can tolerate is a constant times $\epsilon^{2}/d$.
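
As a rough illustration of how the two drift-rate bounds relate, the sketch below evaluates the general expression and then substitutes the approximation factor 7d+1 from the PAC-learnable case. The function names and the parameter `c` are hypothetical: the abstract only says "a constant", so `c = 1.0` is a placeholder and the returned values indicate scale only, not actual thresholds.

```python
import math


def tolerable_drift_rate(epsilon, d, k, c=1.0):
    """Evaluate c * eps^2 / (k * (d + k) * ln(1/eps)).

    epsilon: target mistake probability, assumed to lie in (0, 1)
    d:       Vapnik-Chervonenkis dimension of the class H
    k:       approximation factor for minimizing disagreements
    c:       placeholder for the unspecified constant (assumption)
    """
    if not 0.0 < epsilon < 1.0:
        raise ValueError("epsilon must lie strictly between 0 and 1")
    if d < 1 or k < 1:
        raise ValueError("d and k must be at least 1")
    return c * epsilon ** 2 / (k * (d + k) * math.log(1.0 / epsilon))


def pac_approximation_factor(d):
    """Approximation factor 7d + 1 stated for properly PAC-learnable H."""
    return 7 * d + 1


def pac_drift_rate(epsilon, d, c=1.0):
    """General bound with k = 7d + 1 substituted.

    Then k * (d + k) = (7d + 1)(8d + 1), which lies between 56 d^2 and
    72 d^2 for d >= 1, so the result scales as eps^2 / (d^2 ln(1/eps)).
    """
    return tolerable_drift_rate(epsilon, d, pac_approximation_factor(d), c)


if __name__ == "__main__":
    for d in (1, 5, 20):
        print(d, pac_drift_rate(0.1, d))
```

Substituting k = 7d+1 gives k(d+k) = (7d+1)(8d+1), a quantity of order d^2, which is how the second bound follows from the first up to the constant.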
