UCSC-CRL-90-04: CHARACTERIZING PAC-LEARNABILITY OF SEMILINEAR SETS

Authors

Published On 12/01/1989 09:00 AM

Department Computer Science

Abstract We characterize learnability of the class of letter-counts of regular languages (semilinear sets), commutative regular languages and other related classes of subsets of N^m , with respect to the distribution-free learning model of Valiant (PAC-learning model). By utilizing the notion of reducibility among learning problems due to Pitt and Warmuth called `Prediction Preserving Reducibility\', and a special case thereof, we show a number of positive as well as partially negative results. On the positive side, we show that the class of semilinear sets of dimensions 1 and 2 is learnable when the integers are encoded in unary. We also show that the entire class of commutative deterministic finite-state automata on an arbitrary alphabet size is learnable. On the neutral to negative side, we show that, when the integers are encoded in binary, the learning problem for semilinear sets as well as a class of subsets of N^m much simpler than semilinear sets is as hard as learning DNF, a central open problem in the field. We also show that all of our positive results are for concept representation classes for which the problem of finding a minimum consistent representation is NP-hard.

Download

This report is not available for download at this time.