Equivalence of Finite Automata and Regular Expressions

Finite Automata Recognize Regular Languages

Theorem 1. $L$ is a regular language iff there is a regular expression $R$ such that $L(R) = L$, iff there is a DFA $M$ such that $L(M) = L$, iff there is an NFA $N$ such that $L(N) = L$. That is, regular expressions, DFAs, and NFAs have the same computational power.

Proof.
• Given a regular expression $R$, we will construct an NFA $N$ such that $L(N) = L(R)$.
• Given a DFA $M$, we will construct a regular expression $R$ such that $L(M) = L(R)$.

Since every DFA is an NFA and every NFA can be converted to an equivalent DFA, these two constructions close the cycle of equivalences.


Regular Expressions to NFA

Regular Expressions to Non-deterministic Finite Automata

Lemma 2. For any regular expression $R$, there is an NFA $N_R$ such that $L(N_R) = L(R)$.

Proof Idea. We build the NFA $N_R$ by induction on the number of operators in $R$, written $\#(R)$.
• Base Case: $\#(R) = 0$ means that $R$ is $\emptyset$, $\epsilon$, or $a$ (for some $a \in \Sigma$). We build NFAs for these cases directly.
• Induction Hypothesis: Assume that for every regular expression $R$ with $\#(R) < n$, there is an NFA $N_R$ such that $L(N_R) = L(R)$.
• Induction Step: Consider $R$ with $\#(R) = n$. Based on the form of $R$, the NFA $N_R$ is built from the NFAs that the induction hypothesis provides for its subexpressions.

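Regular Expression to NFA: Base Cases

If $R$ is an elementary regular expression, the NFA $N_R$ is constructed as follows.
• $R = \emptyset$: a single state $q_0$, which is the initial state and is not accepting; there are no transitions.
• $R = \epsilon$: a single state $q_0$, which is both initial and accepting; there are no transitions.
• $R = a$: two states, the initial state $q_0$ and the accepting state $q_1$, with a single transition from $q_0$ to $q_1$ on $a$.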
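The three base-case machines are small enough to write down directly. The sketch below is one possible Python encoding, not a representation taken from these notes: the `NFA` tuple, the use of the empty string for an $\epsilon$ move, and the helper names are all assumptions. A simple simulator is included so the machines can be checked.

```python
from collections import namedtuple

# Assumed representation: delta maps (state, symbol) -> set of states,
# and the empty string EPS stands for an epsilon move.
NFA = namedtuple("NFA", "states delta start finals")
EPS = ""

def nfa_empty():
    """R = emptyset: one non-accepting state, no transitions."""
    return NFA({"q0"}, {}, "q0", set())

def nfa_epsilon():
    """R = epsilon: one state that is both initial and accepting."""
    return NFA({"q0"}, {}, "q0", {"q0"})

def nfa_symbol(a):
    """R = a: a single a-transition from q0 to the accepting state q1."""
    return NFA({"q0", "q1"}, {("q0", a): {"q1"}}, "q0", {"q1"})

def eps_closure(nfa, states):
    """All states reachable from `states` using only epsilon moves."""
    stack, seen = list(states), set(states)
    while stack:
        q = stack.pop()
        for r in nfa.delta.get((q, EPS), set()):
            if r not in seen:
                seen.add(r)
                stack.append(r)
    return seen

def accepts(nfa, w):
    """Track the set of states the NFA can be in after each symbol of w."""
    current = eps_closure(nfa, {nfa.start})
    for a in w:
        moved = set()
        for q in current:
            moved |= nfa.delta.get((q, a), set())
        current = eps_closure(nfa, moved)
    return bool(current & nfa.finals)

print(accepts(nfa_symbol("a"), "a"))   # True
print(accepts(nfa_epsilon(), "a"))     # False
```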

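Induction Step: Union

Case $R = R_1 \cup R_2$. By the induction hypothesis, there are NFAs $N_1, N_2$ such that $L(N_1) = L(R_1)$ and $L(N_2) = L(R_2)$. Build an NFA $N$ such that $L(N) = L(N_1) \cup L(N_2)$.

[Figure 1: NFA for $L(N_1) \cup L(N_2)$. A new initial state $q_0$ has $\epsilon$-transitions to $q_1$ and $q_2$, the initial states of $N_1$ and $N_2$.]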

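Induction Step: Union Formal Definition

Case $R = R_1 \cup R_2$. Let $N_1 = (Q_1, \Sigma, \delta_1, q_1, F_1)$ and $N_2 = (Q_2, \Sigma, \delta_2, q_2, F_2)$ (with $Q_1 \cap Q_2 = \emptyset$) be such that $L(N_1) = L(R_1)$ and $L(N_2) = L(R_2)$. The NFA $N = (Q, \Sigma, \delta, q_0, F)$ is given by
• $Q = Q_1 \cup Q_2 \cup \{q_0\}$, where $q_0 \notin Q_1 \cup Q_2$
• $F = F_1 \cup F_2$
• $\delta$ is defined as follows:

$$
\delta(q, a) =
\begin{cases}
\delta_1(q, a) & \text{if } q \in Q_1 \\
\delta_2(q, a) & \text{if } q \in Q_2 \\
\{q_1, q_2\} & \text{if } q = q_0 \text{ and } a = \epsilon \\
\emptyset & \text{otherwise}
\end{cases}
$$

Induction Step: Union Correctness Proof

We need to show that $w \in L(N)$ iff $w \in L(N_1) \cup L(N_2)$.

($\Rightarrow$) $w \in L(N)$ implies $q_0 \xrightarrow{w}_N q$ for some $q \in F$. Based on the transitions out of $q_0$, either $q_0 \xrightarrow{\epsilon}_N q_1 \xrightarrow{w}_N q$ or $q_0 \xrightarrow{\epsilon}_N q_2 \xrightarrow{w}_N q$. Consider $q_0 \xrightarrow{\epsilon}_N q_1 \xrightarrow{w}_N q$; the other case is similar. This means $q_1 \xrightarrow{w}_{N_1} q$ (as $N$ has the same transitions as $N_1$ on the states in $Q_1$) and $q \in F_1$. Hence $w \in L(N_1)$.

($\Leftarrow$) Suppose $w \in L(N_1) \cup L(N_2)$. Consider $w \in L(N_1)$; the case of $w \in L(N_2)$ is similar. Then $q_1 \xrightarrow{w}_{N_1} q$ for some $q \in F_1$. Thus $q_0 \xrightarrow{\epsilon}_N q_1 \xrightarrow{w}_N q$, and $q \in F$. This means that $w \in L(N)$.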
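The union construction can be sketched in Python as follows. The `NFA` tuple, the empty string as the $\epsilon$ marker, and the helpers `tag`, `symbol`, and `accepts` are assumptions for illustration. Tagging each state with 1 or 2 is one simple way to enforce $Q_1 \cap Q_2 = \emptyset$, and the string `"q0"` cannot collide with a tagged state, so $q_0 \notin Q_1 \cup Q_2$.

```python
from collections import namedtuple

NFA = namedtuple("NFA", "states delta start finals")
EPS = ""  # assumption: the empty string marks an epsilon move

def tag(nfa, i):
    """Rename each state q to (i, q) so two machines get disjoint state sets."""
    delta = {((i, q), a): {(i, r) for r in rs}
             for (q, a), rs in nfa.delta.items()}
    return NFA({(i, q) for q in nfa.states}, delta, (i, nfa.start),
               {(i, q) for q in nfa.finals})

def union(n1, n2):
    n1, n2 = tag(n1, 1), tag(n2, 2)
    q0 = "q0"                                  # fresh initial state
    delta = {**n1.delta, **n2.delta}           # delta_1 on Q1, delta_2 on Q2
    delta[(q0, EPS)] = {n1.start, n2.start}    # delta(q0, eps) = {q1, q2}
    return NFA(n1.states | n2.states | {q0}, delta, q0,
               n1.finals | n2.finals)          # F = F1 ∪ F2

def accepts(nfa, w):
    """Simulate the NFA on w, following epsilon moves after every step."""
    def closure(states):
        stack, seen = list(states), set(states)
        while stack:
            q = stack.pop()
            for r in nfa.delta.get((q, EPS), set()):
                if r not in seen:
                    seen.add(r)
                    stack.append(r)
        return seen
    current = closure({nfa.start})
    for a in w:
        moved = set()
        for q in current:
            moved |= nfa.delta.get((q, a), set())
        current = closure(moved)
    return bool(current & nfa.finals)

def symbol(a):
    """Base-case NFA for the single symbol a."""
    return NFA({"s", "t"}, {("s", a): {"t"}}, "s", {"t"})

n = union(symbol("a"), symbol("b"))
print(accepts(n, "a"), accepts(n, "b"), accepts(n, "ab"))  # True True False
```

Missing dictionary entries play the role of the "otherwise" case: any pair with no entry maps to the empty set.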

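Induction Step: Concatenation

Case $R = R_1 \circ R_2$.
• By the induction hypothesis, there are NFAs $N_1, N_2$ such that $L(N_1) = L(R_1)$ and $L(N_2) = L(R_2)$.
• Build an NFA $N$ such that $L(N) = L(N_1) \circ L(N_2)$.

[Figure 2: NFA for $L(N_1) \circ L(N_2)$. Each accepting state of $N_1$ has an $\epsilon$-transition to $q_2$, the initial state of $N_2$.]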

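Induction Step: Concatenation Formal Definition

Case $R = R_1 \circ R_2$. Let $N_1 = (Q_1, \Sigma, \delta_1, q_1, F_1)$ and $N_2 = (Q_2, \Sigma, \delta_2, q_2, F_2)$ (with $Q_1 \cap Q_2 = \emptyset$) be such that $L(N_1) = L(R_1)$ and $L(N_2) = L(R_2)$. The NFA $N = (Q, \Sigma, \delta, q_0, F)$ is given by
• $Q = Q_1 \cup Q_2$
• $q_0 = q_1$
• $F = F_2$
• $\delta$ is defined as follows:

$$
\delta(q, a) =
\begin{cases}
\delta_1(q, a) & \text{if } q \in Q_1 \text{ and } (q \notin F_1 \text{ or } a \neq \epsilon) \\
\delta_1(q, a) \cup \{q_2\} & \text{if } q \in F_1 \text{ and } a = \epsilon \\
\delta_2(q, a) & \text{if } q \in Q_2 \\
\emptyset & \text{otherwise}
\end{cases}
$$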
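The concatenation construction can be sketched in Python in the same spirit. As before, the `NFA` tuple, the empty string as the $\epsilon$ marker, and the helpers `tag`, `symbol`, and `accepts` are assumptions for illustration only. The one step specific to concatenation is adding $q_2$ to $\delta_1(q, \epsilon)$ for every $q \in F_1$.

```python
from collections import namedtuple

NFA = namedtuple("NFA", "states delta start finals")
EPS = ""  # assumption: the empty string marks an epsilon move

def tag(nfa, i):
    """Rename each state q to (i, q) so two machines get disjoint state sets."""
    delta = {((i, q), a): {(i, r) for r in rs}
             for (q, a), rs in nfa.delta.items()}
    return NFA({(i, q) for q in nfa.states}, delta, (i, nfa.start),
               {(i, q) for q in nfa.finals})

def concat(n1, n2):
    n1, n2 = tag(n1, 1), tag(n2, 2)
    # Copy the target sets so the inputs are not modified.
    delta = {k: set(v) for k, v in {**n1.delta, **n2.delta}.items()}
    for q in n1.finals:
        # delta(q, eps) = delta_1(q, eps) ∪ {q2} for q in F1
        delta.setdefault((q, EPS), set()).add(n2.start)
    # Q = Q1 ∪ Q2, q0 = q1, F = F2
    return NFA(n1.states | n2.states, delta, n1.start, n2.finals)

def accepts(nfa, w):
    """Simulate the NFA on w, following epsilon moves after every step."""
    def closure(states):
        stack, seen = list(states), set(states)
        while stack:
            q = stack.pop()
            for r in nfa.delta.get((q, EPS), set()):
                if r not in seen:
                    seen.add(r)
                    stack.append(r)
        return seen
    current = closure({nfa.start})
    for a in w:
        moved = set()
        for q in current:
            moved |= nfa.delta.get((q, a), set())
        current = closure(moved)
    return bool(current & nfa.finals)

def symbol(a):
    """Base-case NFA for the single symbol a."""
    return NFA({"s", "t"}, {("s", a): {"t"}}, "s", {"t"})

n = concat(symbol("a"), symbol("b"))
print(accepts(n, "ab"), accepts(n, "a"), accepts(n, "ba"))  # True False False
```

Note that the accepting states of $N_1$ stop being accepting in $N$: a run must cross into $N_2$ and finish in $F_2$.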

Induction Step: Concatenation Correctness Proof

We need to show that $w \in L(N)$ iff $w \in L(N_1) \circ L(N_2)$.

$w \in L(N)$ iff $q_0 \xrightarrow{w}_N q$ for some $q \in F = F_2$. The computation of $N$ on $w$ starts in a state of $N_1$ (namely, $q_0 = q_1$) and ends in a state of $N_2$ (namely, $q \in F_2$). The only transitions from a state of $N_1$ to a state of $N_2$ are the $\epsilon$-transitions from the states in $F_1$ to $q_2$, the initial state of $N_2$. Thus, we have $q_0 = q_1 \xrightarrow{w}_N q$ with $q \in F = F_2$ iff

$$
\exists q' \in F_1.\ \exists u, v \in \Sigma^*.\ w = uv \text{ and } q_0 = q_1 \xrightarrow{u}_N q' \xrightarrow{\epsilon}_N q_2 \xrightarrow{v}_N q.
$$

This means that $q_1 \xrightarrow{u}_{N_1} q'$ (with $q' \in F_1$) and $q_2 \xrightarrow{v}_{N_2} q$ (with $q \in F_2$). Hence, $u \in L(N_1)$ and $v \in L(N_2)$, and so $w = uv \in L(N_1) \circ L(N_2)$.

Conversely, if $u \in L(N_1)$ and $v \in L(N_2)$, then for some $q' \in F_1$ and $q \in F_2$, we have $q_1 \xrightarrow{u}_{N_1} q'$ and $q_2 \xrightarrow{v}_{N_2} q$. Then, $q_0 = q_1 \xrightarrow{u}_N q' \xrightarrow{\epsilon}_N q_2$...