H

A

P

T

E

R

Finite-State Machines and Pushdown Automata

The ﬁnite-state machine (FSM) and the pushdown automaton (PDA) enjoy a special place in computer science. The FSM has proven to be a very useful model for many practical tasks and deserves to be among the tools of every practicing computer scientist. Many simple tasks, such as interpreting the commands typed into a keyboard or running a calculator, can be modeled by ﬁnite-state machines. The PDA is a model to which one appeals when writing compilers because it captures the essential architectural features needed to parse context-free languages, languages whose structure most closely resembles that of many programming languages. In this chapter we examine the language recognition capability of FSMs and PDAs. We show that FSMs recognize exactly the regular languages, languages deﬁned by regular expressions and generated by regular grammars. We also provide an algorithm to ﬁnd a FSM that is equivalent to a given FSM but has the fewest states. We examine language recognition by PDAs and show that PDAs recognize exactly the context-free languages, languages whose grammars satisfy less stringent requirements than regular grammars. Both regular and context-free grammar types are special cases of the phrasestructure grammars that are shown in Chapter 5 to be the languages accepted by Turing machines. It is desirable not only to classify languages by the architecture of machines that recognize them but also to have tests to show that a language is not of a particular type. For this reason we establish so-called pumping lemmas whose purpose is to show how strings in one language can be elongated or “pumped up.” Pumping up may reveal that a language does not fall into a presumed language category. We also develop other properties of languages that provide mechanisms for distinguishing among language types. Because of the importance of context-free languages, we examine how they are parsed, a key step in programming language translation.

153

154

Chapter 4 Finite-State Machines and Pushdown Automata Models of Computation

4.1 Finite-State Machine Models

The deterministic ﬁnite-state machine (DFSM), introduced in Section 3.1, has a set of states, including an initial state and one or more ﬁnal states. At each unit of time a DFSM is given a letter from its input alphabet. This causes the machine to move from its current state to a potentially new state. While in a state, the DFSM produces a letter from its output alphabet. Such a machine computes the function deﬁned by the mapping from strings of input letters to strings of output letters. DFSMs can also be used to accept strings. A string is accepted by a DFSM if the last state entered by the machine on that input string is a ﬁnal state. The language recognized by a DFSM is the set of strings that it accepts. Although there are languages that cannot be accepted by any machine with a ﬁnite number of states, it is important to note that all realistic computational problems are ﬁnite in nature and can be solved by FSMs. However, important opportunities to simplify computations may be missed if we do not view them as requiring potentially inﬁnite storage, such as that provided by pushdown automata, machines that store data on a pushdown stack. (Pushdown automata are formally introduced in Section 4.8.) The nondeterministic ﬁnite-state machine (NFSM) was also introduced in Section 3.1. The NFSM has the property that for a given state and input letter there may be several states to which it could move. Also for some state and input letter there may be no possible move. We say that an NFSM accepts a string if there is a sequence of next-state choices (see Section 3.1.5) that can be made, when necessary, so that the string causes the NFSM to enter a ﬁnal state. The language accepted by such a machine is the set of strings it accepts. Although nondeterminism is a useful tool in describing languages and...