UCSC-CRL-90-04: CHARACTERIZING PAC-LEARNABILITY OF SEMILINEAR SETS

12/01/1989 09:00 AM
Computer Science
We characterize learnability of the class of letter-counts of regular languages (semilinear sets), commutative regular languages and other related classes of subsets of N^m , with respect to the distribution-free learning model of Valiant (PAC-learning model). By utilizing the notion of reducibility among learning problems due to Pitt and Warmuth called `Prediction Preserving Reducibility\', and a special case thereof, we show a number of positive as well as partially negative results. On the positive side, we show that the class of semilinear sets of dimensions 1 and 2 is learnable when the integers are encoded in unary. We also show that the entire class of commutative deterministic finite-state automata on an arbitrary alphabet size is learnable. On the neutral to negative side, we show that, when the integers are encoded in binary, the learning problem for semilinear sets as well as a class of subsets of N^m much simpler than semilinear sets is as hard as learning DNF, a central open problem in the field. We also show that all of our positive results are for concept representation classes for which the problem of finding a minimum consistent representation is NP-hard.

This report is not available for download at this time.