TR 97-07 Rajesh Parekh and Vasant Honavar March 18, 1997

ACM Computing Classification System Categories (1991):

I.2.6 [Artificial Intelligence] Learning: language acquisition, concept learning; F.1.1 [Theory of Computation] Models of Computation: Automata; F.1.3 [Theory of Computation] Complexity Classes: Machine-independent complexity.

Keywords:

grammar inference, regular grammars, finite state automata, PAC learning, Kolmogorov complexity, simple distributions, universal distribution, language learning, polynomial-time learning algorithms.

Artificial Intelligence Research Group, Department of Computer Science, 226 Atanasoff Hall, Iowa State University, Ames, Iowa. IA 50011-1040. USA

Learning DFA from Simple Examples

Rajesh Parekh and Vasant Honavar

Department of Computer Science, 226 Atanasoff Hall, Iowa State University, Ames IA 50011. U.S.A.

{parekh|honavar}@cs.iastate.edu

March 18, 1997

Abstract

We present a framework for learning DFA from simple examples. We show that efficient PAC learning of DFA is possible if the class of distributions is restricted to simple distributions where a teacher might choose examples based on the knowledge of the target concept. This answers an open research question posed in Pitt's seminal paper: "Are DFA's PAC-identifiable if examples are drawn from the uniform distribution, or some other known simple distribution?" Our approach uses the RPNI algorithm for learning DFA from labeled examples. In particular, we describe an efficient learning algorithm for exact learning of the target DFA with high probability when a bound on the number of states N of the target DFA is known in advance. When N is not known, we show how this algorithm can be used for efficient PAC learning of DFAs.

1 Introduction

The problem of learning a DFA with the smallest number of states that is consistent with a given sample (i.e., the DFA accepts each positive example and rejects each negative example) has been actively studied for over two decades. DFAs are recognizers for regular languages, which are considered to be the simplest class of languages in the formal language hierarchy [Chomsky, 1956; Hopcroft & Ullman, 1979]. An understanding of the issues and pitfalls encountered during the learning of regular languages (or equivalently, identification of the corresponding DFA) might provide insights into the problem of learning more general classes of languages.

Exact learning of the target DFA from an arbitrary presentation of labeled examples is a hard problem [Gold, 1978]. Gold has shown that the problem of identification of the minimum state DFA consistent with a presentation S comprising a finite non-empty set of positive examples $S^+$ and possibly a finite non-empty set of negative examples $S^-$ is NP-hard. Under the standard complexity-theoretic assumption $P \neq NP$, Pitt and Warmuth have shown that no polynomial time algorithm can be guaranteed to produce a DFA with at most $n^{(1-\epsilon)\log\log n}$ states from a set of labeled examples corresponding to a DFA with n states [Pitt & Warmuth, 1988].

Efficient learning algorithms for identification of DFA assume that additional information is provided to the learner. Trakhtenbrot and Barzdin have described a polynomial time algorithm for constructing the smallest DFA consistent with a complete labeled sample, i.e., a sample that includes all strings up to a particular length together with the corresponding label that states whether the string is accepted by the target or not [Trakhtenbrot & Barzdin, 1973]. Angluin has shown that given a live-complete set of examples (one that contains a representative string for each live state of the target DFA) and a knowledgeable teacher to answer membership queries, it is possible to exactly learn the target DFA [Angluin, 1981]. In a later paper, Angluin has relaxed the requirement of a live-complete set and has designed a polynomial time inference algorithm using both membership and equivalence queries [Angluin, 1987]. The RPNI...
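To make two notions from this introduction concrete, the following minimal Python sketch illustrates (a) what it means for a DFA to be consistent with a labeled sample and (b) a complete labeled sample containing all strings up to a given length. The class `DFA`, the helpers `is_consistent` and `complete_sample`, and the example target `even_a` are hypothetical names introduced only for illustration; they are not part of the paper or of any library, and the sketch is not a learning algorithm.

```python
from itertools import product


class DFA:
    """A DFA given by a start state, accepting states, and a transition table.

    `delta` maps (state, symbol) -> state. A missing entry is treated as a
    transition to an implicit dead (rejecting) state; this is an assumption
    of the sketch.
    """

    def __init__(self, start, accepting, delta):
        self.start = start
        self.accepting = set(accepting)
        self.delta = dict(delta)

    def accepts(self, string):
        state = self.start
        for symbol in string:
            state = self.delta.get((state, symbol))
            if state is None:
                return False
        return state in self.accepting


def is_consistent(dfa, positives, negatives):
    """True iff the DFA accepts every positive and rejects every negative example."""
    return (all(dfa.accepts(s) for s in positives)
            and not any(dfa.accepts(s) for s in negatives))


def complete_sample(dfa, alphabet, max_len):
    """Label every string over `alphabet` of length 0..max_len using `dfa`."""
    sample = {}
    for n in range(max_len + 1):
        for chars in product(alphabet, repeat=n):
            s = "".join(chars)
            sample[s] = dfa.accepts(s)
    return sample


# Hypothetical target: strings over {a, b} with an even number of 'a's.
even_a = DFA(
    start=0,
    accepting={0},
    delta={(0, "a"): 1, (0, "b"): 0, (1, "a"): 0, (1, "b"): 1},
)
```

For instance, `is_consistent(even_a, ["", "aa", "baab"], ["a", "ab"])` checks the target against a small sample $S^+ \cup S^-$, and `complete_sample(even_a, "ab", 2)` labels all seven strings of length at most 2. Note that the complete sample grows exponentially with the length bound, so it is a strong requirement on the information given to the learner.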