TR 97-07 Rajesh Parekh and Vasant Honavar March 18, 1997

ACM Computing Classification System Categories (1991):

I.2.6 Artificial Intelligence: Learning - language acquisition, concept learning; F.1.1 Theory of Computation: Models of Computation - Automata; F.1.3 Theory of Computation: Complexity Classes - Machine-independent complexity.

Keywords:

grammar inference, regular grammars, finite state automata, PAC learning, Kolmogorov complexity, simple distributions, universal distribution, language learning, polynomial-time learning algorithms.

Artificial Intelligence Research Group
Department of Computer Science
226 Atanasoff Hall
Iowa State University
Ames, Iowa. IA 50011-1040. USA

Learning DFA from Simple Examples

Rajesh Parekh and Vasant Honavar
Department of Computer Science
226 Atanasoff Hall
Iowa State University
Ames IA 50011. U.S.A.
{parekh|honavar}@cs.iastate.edu

March 18, 1997

Abstract

We present a framework for learning DFA from simple examples. We show that efficient PAC learning of DFA is possible if the class of distributions is restricted to simple distributions where a teacher might choose examples based on the knowledge of the target concept. This answers an open research question posed in Pitt's seminal paper: "Are DFA's PAC-identifiable if examples are drawn from the uniform distribution, or some other known simple distribution?" Our approach uses the RPNI algorithm for learning DFA from labeled examples. In particular, we describe an efficient learning algorithm for exact learning of the target DFA with high probability when a bound on the number of states N of the target DFA is known in advance. When N is not known, we show how this algorithm can be used for efficient PAC learning of DFAs.

1 Introduction

The problem of learning a DFA with the smallest number of states that is consistent with a given sample (i.e., the DFA accepts each positive example and rejects each negative example) has been actively...
