Preprint
Article

This version is not peer-reviewed.

A Mathematical Approach to Context-Free Languages

Submitted: 30 May 2025

Posted: 03 June 2025


Abstract
Students become confused and lose interest in theoretical computer science when the subject is poorly taught, and it is often poorly taught because instructors lack a well-organized theory with which to explain the concepts and are reluctant to invest the time needed to write better lecture notes. This paper presents a rigorous mathematical approach to the theory of context-free languages that does not currently exist in the literature of theoretical computer science. Basic definitions are developed in mathematical terms and used as the foundation for constructing proofs of theorems. The paper provides a model for instructors writing lecture notes and for authors writing textbooks for educational purposes. It also corrects some critical errors and erroneous arguments found in many textbooks widely used in the teaching of theoretical computer science. Students can use this paper as supplemental reading.

2.1. Context-Free Grammars (CFG)

In Chapter 1, we used finite automata and regular expressions to describe regular languages. In this chapter, we introduce the Context-Free Grammar, a more powerful tool for describing languages.
A Context-Free Grammar is formally defined as follows.
Definition 2.1. 
A Context-Free Grammar, denoted by CFG, is a 4-tuple G = (V, Σ, R, S), where
(i)
V is a finite set of variables;
(ii)
Σ is a finite set of terminals such that V ∩ Σ = ∅;
(iii)
S ∈ V is the start variable; and
(iv)
R ⊆ V × (V ∪ Σ)* is a finite relation.
For any (A, u) ∈ R, we usually write A → u and call it a rule.
Accordingly, the relation R is also called the set of rules of the CFG.
A is sometimes called the head of the rule, whereas u is called the body of the rule.
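Definition 2.1 translates directly into a data structure. The sketch below is ours, not from the text (the class name `Grammar` and its field names are illustrative); it checks the disjointness and membership conditions of the definition, with rule bodies written as tuples over V ∪ Σ.

```python
# Definition 2.1: a CFG G = (V, Σ, R, S).  Rules are (head, body)
# pairs with bodies written as tuples over V ∪ Σ.
from dataclasses import dataclass

@dataclass(frozen=True)
class Grammar:
    variables: frozenset   # V, a finite set of variables
    terminals: frozenset   # Σ, a finite set of terminals
    rules: frozenset       # R ⊆ V × (V ∪ Σ)*, a finite relation
    start: str             # S, the start variable

    def __post_init__(self):
        assert not (self.variables & self.terminals)   # V ∩ Σ = ∅
        assert self.start in self.variables            # S ∈ V
        for head, body in self.rules:                  # R ⊆ V × (V ∪ Σ)*
            assert head in self.variables
            assert all(s in self.variables | self.terminals for s in body)

# The grammar of Example 2.9: S → 0S11 | 1
G = Grammar(variables=frozenset({"S"}),
            terminals=frozenset({"0", "1"}),
            rules=frozenset({("S", ("0", "S", "1", "1")), ("S", ("1",))}),
            start="S")
```

Constructing a `Grammar` that violates a condition of the definition (for example, a start symbol outside V) fails the corresponding assertion.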
Example 2.2. 
Let V = {⟨SENTENCE⟩, ⟨NOUN PHRASE⟩, ⟨VERB PHRASE⟩, ⟨PREP PHRASE⟩,
⟨CMPLX NOUN⟩, ⟨CMPLX VERB⟩, ⟨PREP⟩, ⟨ARTICLE⟩, ⟨NOUN⟩, ⟨VERB⟩},
Σ = {a, the, boy, girl, flower, touches, likes, sees, with}, and
S = ⟨SENTENCE⟩.
Let R consist of the following rules:
⟨SENTENCE⟩ → ⟨NOUN PHRASE⟩⟨VERB PHRASE⟩
⟨NOUN PHRASE⟩ → ⟨CMPLX NOUN⟩ | ⟨CMPLX NOUN⟩⟨PREP PHRASE⟩
⟨VERB PHRASE⟩ → ⟨CMPLX VERB⟩ | ⟨CMPLX VERB⟩⟨PREP PHRASE⟩
⟨PREP PHRASE⟩ → ⟨PREP⟩⟨CMPLX NOUN⟩
⟨CMPLX NOUN⟩ → ⟨ARTICLE⟩⟨NOUN⟩
⟨CMPLX VERB⟩ → ⟨VERB⟩ | ⟨VERB⟩⟨NOUN PHRASE⟩
⟨ARTICLE⟩ → a | the
⟨NOUN⟩ → boy | girl | flower
⟨VERB⟩ → touches | likes | sees
⟨PREP⟩ → with
Then G = (V, Σ, R, S) is a CFG.
The following are examples of strings in Σ* that can be derived by G.
(i) ⟨SENTENCE⟩
⇒ ⟨NOUN PHRASE⟩⟨VERB PHRASE⟩
⇒ ⟨CMPLX NOUN⟩⟨VERB PHRASE⟩
⇒ ⟨ARTICLE⟩⟨NOUN⟩⟨VERB PHRASE⟩
⇒ a ⟨NOUN⟩⟨VERB PHRASE⟩
⇒ a boy ⟨VERB PHRASE⟩
⇒ a boy ⟨CMPLX VERB⟩
⇒ a boy ⟨VERB⟩
⇒ a boy sees
(ii) ⟨SENTENCE⟩
⇒ ⟨NOUN PHRASE⟩⟨VERB PHRASE⟩
⇒ ⟨CMPLX NOUN⟩⟨VERB PHRASE⟩
⇒ ⟨ARTICLE⟩⟨NOUN⟩⟨VERB PHRASE⟩
⇒ the boy ⟨VERB PHRASE⟩
⇒ the boy ⟨CMPLX VERB⟩
⇒ the boy ⟨VERB⟩⟨NOUN PHRASE⟩
⇒ the boy sees ⟨CMPLX NOUN⟩
⇒ the boy sees ⟨ARTICLE⟩⟨NOUN⟩
⇒ the boy sees a flower
(iii) ⟨SENTENCE⟩
⇒ ⟨NOUN PHRASE⟩⟨VERB PHRASE⟩
⇒ ⟨CMPLX NOUN⟩⟨PREP PHRASE⟩⟨VERB PHRASE⟩
⇒ ⟨ARTICLE⟩⟨NOUN⟩⟨PREP⟩⟨CMPLX NOUN⟩⟨VERB PHRASE⟩
⇒ a girl with ⟨ARTICLE⟩⟨NOUN⟩⟨VERB PHRASE⟩
⇒ a girl with a flower ⟨CMPLX VERB⟩
⇒ a girl with a flower ⟨VERB⟩⟨NOUN PHRASE⟩
⇒ a girl with a flower likes ⟨CMPLX NOUN⟩
⇒ a girl with a flower likes ⟨ARTICLE⟩⟨NOUN⟩
⇒ a girl with a flower likes the boy
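Each line of these derivations replaces one occurrence of a variable by the body of one of its rules, which is easy to replay mechanically. A small sketch, not from the text (the helper `apply_rule` is ours; variables are written as the bracketed names):

```python
# Replaying derivation (i) of Example 2.2: each step replaces one
# occurrence of a variable by the body of one of its rules.
def apply_rule(form, head, body):
    """Replace the first occurrence of `head` in `form` by `body`."""
    i = form.index(head)
    return form[:i] + body + form[i + 1:]

form = ["<SENTENCE>"]
for head, body in [
    ("<SENTENCE>",    ["<NOUN PHRASE>", "<VERB PHRASE>"]),
    ("<NOUN PHRASE>", ["<CMPLX NOUN>"]),
    ("<CMPLX NOUN>",  ["<ARTICLE>", "<NOUN>"]),
    ("<ARTICLE>",     ["a"]),
    ("<NOUN>",        ["boy"]),
    ("<VERB PHRASE>", ["<CMPLX VERB>"]),
    ("<CMPLX VERB>",  ["<VERB>"]),
    ("<VERB>",        ["sees"]),
]:
    form = apply_rule(form, head, body)

print(" ".join(form))  # a boy sees
```

The rule applications above follow derivation (i); reordering the terminal-producing steps gives other derivations of the same string.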
Definition 2.3. 
Let G = (V, Σ, R, S) be a CFG.
For any u, v ∈ (V ∪ Σ)*, we say u yields v (or v is derivable from u) in one step (written as u ⇒^1 v, or simply u ⇒ v) if and only if
∃ A ∈ V, γ, α, β ∈ (V ∪ Σ)* and a rule A → γ such that u = αAβ and v = αγβ.
Note that the process of deriving v from u is basically the replacement of a variable in u by the body of one of that variable's rules to obtain v.
In addition, we define u ⇒^0 v iff u = v.
For any integer n ≥ 0, we say u yields v (or v is derivable from u) in n + 1 steps (written as u ⇒^{n+1} v) iff ∃ w ∈ (V ∪ Σ)* such that u ⇒^n w and w ⇒^1 v.
If there is more than one CFG under consideration (e.g., G and G′) and we need to distinguish derivations in G from derivations in G′, we can write
u ⇒^{n,G} v to mean v is derivable from u in n steps by use of the rules of G; and
u ⇒^{n,G′} v to mean v is derivable from u in n steps by use of the rules of G′.
Furthermore, if we need to specify the rule applied in each step, we can write
u ⇒^{n,G,(R_1,R_2,…,R_n)} v to mean v is derivable from u in n steps in G, with rule R_i applied in the i-th step; and
u ⇒^{n,G′,(R′_1,R′_2,…,R′_n)} v to mean v is derivable from u in n steps in G′, with rule R′_i applied in the i-th step.
Since there can be more than one way of deriving a string, it is sometimes useful to require the derivation to be leftmost. A leftmost derivation is a derivation in which, at every step, the leftmost variable is replaced by the body of one of its rules.
Formally, we define leftmost derivation as follows.
For any u, v ∈ (V ∪ Σ)*, u yields v leftmost in one step (written as u ⇒^1_{lm} v, or simply u ⇒_{lm} v) iff ∃ w ∈ Σ*, w′ ∈ (V ∪ Σ)*, A ∈ V, α ∈ (V ∪ Σ)* and a rule A → α such that u = wAw′ and v = wαw′.
For any integer n ≥ 0, u ⇒^n_{lm} v is defined similarly to u ⇒^n v.
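The one-step yields relation is mechanical enough to compute. A minimal sketch, not from the text (the function names `step` and `leftmost_step` are ours):

```python
# Definition 2.3 computed: u ⇒ v iff u = αAβ and v = αγβ for some
# rule A → γ.  Sentential forms are tuples of symbols over V ∪ Σ.
def step(u, rules):
    """All v such that u yields v in one step (u ⇒ v)."""
    out = []
    for i, sym in enumerate(u):
        for head, body in rules:
            if head == sym:
                out.append(u[:i] + body + u[i + 1:])
    return out

def leftmost_step(u, rules, variables):
    """All v such that u ⇒_lm v: only the leftmost variable is rewritten."""
    for i, sym in enumerate(u):
        if sym in variables:
            return [u[:i] + body + u[i + 1:]
                    for head, body in rules if head == sym]
    return []

# The grammar of Example 2.9: S → 0S11 | 1
rules = {("S", ("0", "S", "1", "1")), ("S", ("1",))}
print(sorted(step(("S",), rules)))   # [('0', 'S', '1', '1'), ('1',)]
```

On a form with a single variable the two functions agree; on a form with several variables, `leftmost_step` returns only the rewrites of the leftmost one.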
Definition 2.4. 
Let G = (V, Σ, R, S) be a CFG, and let ⇒* be a subset of (V ∪ Σ)* × (V ∪ Σ)*.
We define the relation ⇒* as follows:
∀ u, v ∈ (V ∪ Σ)*, u ⇒* v if and only if u ⇒^n v for some integer n ≥ 0.
n is called the length of the derivation of v from u.
Note that whenever there is an n such that u ⇒^n v, there is a minimum n′ such that u ⇒^{n′} v.
If there is more than one CFG under consideration,
u ⇒*_G v if and only if u ⇒^{n,G} v for some integer n ≥ 0.
⇒*_{lm} is defined similarly to ⇒*.
Proposition 2.5. 
⇒* (respectively ⇒*_{lm}) is reflexive and transitive.
Proof. 
Since u ⇒^0 u for all u ∈ (V ∪ Σ)*, u ⇒* u for all u ∈ (V ∪ Σ)*.
Therefore, ⇒* is reflexive.
For transitivity, assume u ⇒* v and v ⇒* w.
There exist integers m ≥ 0 and n ≥ 0 such that
u ⇒^m v and v ⇒^n w.
There are two cases to examine: n = 0 or n ≥ 1.
(i) n = 0
v ⇒^0 w.
By definition, v = w.
Since u ⇒^m v, u ⇒^m w.
Therefore, u ⇒* w.
(ii) n ≥ 1
v ⇒^n w.
v ⇒^{n−1} α_{n−1} ⇒^1 w for some α_{n−1} ∈ (V ∪ Σ)*.
By a backward induction argument, we have
v ⇒ α_1 ⇒ α_2 ⇒ … ⇒ α_{n−1} ⇒ w for some α_1, α_2, …, α_{n−1} ∈ (V ∪ Σ)*.
We now have
u ⇒^m v ⇒ α_1 ⇒ α_2 ⇒ … ⇒ α_{n−1} ⇒ w.
Since (u ⇒^m v ⇒ α_1) implies u ⇒^{m+1} α_1,
(u ⇒^{m+1} α_1 ⇒ α_2) implies u ⇒^{m+2} α_2,
…
By a forward induction argument, we have
u ⇒^{m+n−1} α_{n−1}.
Finally, (u ⇒^{m+n−1} α_{n−1} ⇒ w) implies u ⇒^{m+n} w.
Therefore, u ⇒* w.
Combining (i) and (ii), ⇒* is transitive.
By a similar argument, we can establish that ⇒*_{lm} is also reflexive and transitive.
Definition 2.6. 
Let G = ( V ,   Σ ,   R ,   S ) be a C F G .
The language of G is defined as
L(G) = { w ∈ Σ* | S ⇒* w }.
Note that if S ⇒^n w, then n ≥ 1, because S ⇒^0 w would imply S = w, which is a contradiction (S is a variable and w ∈ Σ*, while V ∩ Σ = ∅).
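Since L(G) is defined by reachability under ⇒, the short members of L(G) can be enumerated by a breadth-first search over sentential forms. A sketch, not from the text (the helper name `language_up_to` and the length cap on sentential forms, which keeps the search finite, are ours); it uses the grammar S → aSb | ϵ of Example 2.30:

```python
# L(G) = { w ∈ Σ* | S ⇒* w }: breadth-first search over sentential
# forms, collecting the derived strings that contain only terminals.
# Sentential forms longer than max_len are pruned to keep the search
# finite, so only sufficiently short members of L(G) are found.
from collections import deque

def language_up_to(rules, variables, start, max_len):
    seen = {(start,)}
    words = set()
    queue = deque(seen)
    while queue:
        u = queue.popleft()
        if all(s not in variables for s in u):   # u ∈ Σ*, so u ∈ L(G)
            words.add("".join(u))
            continue
        for i, sym in enumerate(u):              # expand each variable
            for head, body in rules:
                if head == sym:
                    v = u[:i] + body + u[i + 1:]
                    if len(v) <= max_len and v not in seen:
                        seen.add(v)
                        queue.append(v)
    return words

rules = {("S", ("a", "S", "b")), ("S", ())}      # S → aSb | ϵ
print(sorted(language_up_to(rules, {"S"}, "S", 6), key=len))
```

Note that the cap applies to intermediate sentential forms, so a word is found only if every form in some derivation of it fits under the cap.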
Definition 2.7. 
Let G = ( V ,   Σ ,   R ,   S ) be a C F G .
Let Q represent the rule A → α in R.
∀ u, v ∈ (V ∪ Σ)*, we say u yields v (or v is derivable from u) using the rule Q (written as
u ⇒_Q v) if and only if there exist w_1, w_2 ∈ (V ∪ Σ)* such that u = w_1 A w_2 and v = w_1 α w_2.
Proposition 2.8. 
Let G = ( V ,   Σ ,   R ,   S ) be a C F G .
For any A ∈ V, α ∈ (V ∪ Σ)* and x, y, z ∈ (V ∪ Σ)*:
(i)
A → α is a rule if and only if A ⇒ α.
(ii)
If there is no α ∈ (V ∪ Σ)* such that S → α is a rule, then L(G) = ∅.
(iii)
If there is no α ∈ (V ∪ Σ)* such that A → α is a rule, and x ⇒ y, then (A appears in x) implies (A appears in y).
(iv)
Let S ⇒ u_1 ⇒ u_2 ⇒ … ⇒ u_n ⇒ w, where u_i ∈ (V ∪ Σ)* for all i ∈ {1, 2, 3, …, n}, w ∈ Σ* and n ≥ 1.
If A ∈ V and A appears in u_i for some i ∈ {1, 2, 3, …, n}, then ∃ α ∈ (V ∪ Σ)* such that A → α is a rule.
Proof. 
(i)
If A → α is a rule, then since ϵ ∈ (V ∪ Σ)*, ϵAϵ ⇒ ϵαϵ.
Therefore, A ⇒ α.
Conversely, if A ⇒ α, then ∃ Q ∈ V and β, w_1, w_2 ∈ (V ∪ Σ)* such that A = w_1 Q w_2 and
α = w_1 β w_2 and Q → β is a rule.
Since A = w_1 Q w_2, both w_1 and w_2 must be ϵ and A = Q.
Therefore, α = β and the rule Q → β is A → α.
(ii)
Assume for contradiction that L(G) ≠ ∅.
Then ∃ w ∈ L(G).
∃ k ∈ N ∪ {0} and w_1, w_2, …, w_k ∈ (V ∪ Σ)* such that S ⇒ w_1 ⇒ w_2 ⇒ … ⇒ w_k ⇒ w, or S ⇒ w. (Note that k = 0 gives S ⇒ w.)
By (i), S → w_1 or S → w is a rule.
This contradicts the assumption that there is no α ∈ (V ∪ Σ)* such that S → α is a rule.
(iii)
Since x ⇒ y, ∃ B ∈ V and w_1, w_2, β ∈ (V ∪ Σ)* such that x = w_1 B w_2, y = w_1 β w_2 and B → β is a rule.
Since A → α is not a rule for any α ∈ (V ∪ Σ)*, A ≠ B.
Since A appears in x, A appears either in w_1 or in w_2.
In either case, A appears in y.
(iv)
Assume for contradiction that A appears in u_i for some i ∈ {1, 2, 3, …, n} and that A → α is not a rule for all α ∈ (V ∪ Σ)*.
By (iii), A appears in u_{i+1}.
By repeated application of (iii), we can conclude that A appears in u_{i+2}, u_{i+3}, …, u_n and w, which is a contradiction because w contains no variables.
Example 2.9. 
Let G = ({S}, {0, 1}, R, S) be a CFG.
Create the rules in R so that L(G) = { 0^n 1^{2n+1} | n ∈ N }.
The rule is S → 0S11 | 1, as can be seen from the following applications of the rule.
S ⇒ 0S11 (1st application of S → 0S11)
⇒ 00S1111 (2nd application of S → 0S11)
⇒ 000S111111 (3rd application of S → 0S11)
…
⇒ 0^n S 1^{2n} (n-th application of S → 0S11)
⇒ 0^n 1^{2n+1} (application of S → 1)
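This pattern of rule applications can be checked mechanically for small n. A sketch, not from the text (the helper `derive` is ours):

```python
# Example 2.9 checked: n applications of S → 0S11 followed by one
# application of S → 1 derive the string 0^n 1^(2n+1).
def derive(n):
    form = "S"
    for _ in range(n):
        form = form.replace("S", "0S11", 1)   # S → 0S11, applied n times
    return form.replace("S", "1", 1)          # S → 1

for n in range(6):
    assert derive(n) == "0" * n + "1" * (2 * n + 1)

print(derive(2))  # 0011111
```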
Example 2.10. 
Let G = ({S}, {0, 1}, R, S) be a CFG.
Create the rules in R so that L(G) = { 0^{2n} 1^{3n} | n ∈ N }.
The rule is S → 00S111 | ϵ, as can be seen from the following applications of the rule.
S ⇒ 00S111 (1st application of S → 00S111)
⇒ 0000S111111 (2nd application of S → 00S111)
⇒ 000000S111111111 (3rd application of S → 00S111)
…
⇒ 0^{2n} S 1^{3n} (n-th application of S → 00S111)
⇒ 0^{2n} ϵ 1^{3n} = 0^{2n} 1^{3n} (application of S → ϵ)
Example 2.11. 
Let G = ({S}, {0, 1}, R, S) be a CFG.
Create the rules in R so that L(G) = { 0^{2n+7} 1^{3n+9} | n ∈ N }.
The rule is S → 00S111 | 0^7 1^9, as can be seen from the following applications of the rule.
S ⇒ 00S111 (1st application of S → 00S111)
⇒ 0000S111111 (2nd application of S → 00S111)
⇒ 000000S111111111 (3rd application of S → 00S111)
…
⇒ 0^{2n} S 1^{3n} (n-th application of S → 00S111)
⇒ 0^{2n} 0^7 1^9 1^{3n} = 0^{2n+7} 1^{3n+9} (application of S → 0^7 1^9)
Definition 2.12. Let G = (V, Σ, R, S) be a CFG.
Let R_1, R_2, R_3, …, R_n and Q be rules in R, where n ≥ 1.
R_1, R_2, R_3, …, R_n and Q are equivalent if, ∀ u ∈ (V ∪ Σ)*, there exists v ∈ (V ∪ Σ)* such that u ⇒^{n,(R_1,R_2,…,R_n)} v if and only if u ⇒_Q v.
Proposition 2.13. 
(i)
A ⇒_{A→α} α if and only if A → α is a rule.
(ii)
If A does not appear in α, A does not appear in β, and A → γ is a rule, then
αAβ ⇒_{A→γ} x if and only if x = αγβ.
Proof. 
(i)
If A → α is a rule, then ϵAϵ ⇒_{A→α} ϵαϵ and therefore A ⇒_{A→α} α.
Conversely, if A ⇒_{A→α} α, then by definition A → α is a rule.
(ii)
If x = αγβ, then since A → γ is a rule, by definition αAβ ⇒_{A→γ} αγβ. Therefore, αAβ ⇒_{A→γ} x.
Conversely, if αAβ ⇒_{A→γ} x, then ∃ u_1, u_2 ∈ (V ∪ Σ)* such that αAβ = u_1 A u_2 and x = u_1 γ u_2.
Since A does not appear in α and A does not appear in β, there is only one appearance of A in αAβ.
Therefore, there is only one appearance of A in u_1 A u_2.
Therefore, αAβ = u_1 A u_2 implies α = u_1 and β = u_2.
Therefore, x = αγβ.
Proposition 2.14. 
The rules (A → α) and (B → β) are equivalent if and only if (A = B) and (α = β).
Proof. 
Suppose (A → α) and (B → β) are equivalent.
Since A → α is a rule, A ⇒_{A→α} α (Proposition 2.13).
Since (A → α) and (B → β) are equivalent, A ⇒_{B→β} α.
Then there exist w_1, w_2 ∈ (V ∪ Σ)* such that A = w_1 B w_2 and α = w_1 β w_2.
A = w_1 B w_2 implies A = B, since A and B are both variables (single symbols).
A = w_1 A w_2 implies w_1 = w_2 = ϵ.
Therefore, α = β.
Conversely, if (A = B) and (α = β), then (A → α) and (B → β) are the same rule, and hence they are equivalent.
Proposition 2.15. 
Let G = ( V ,   Σ ,   R ,   S ) be a C F G .
∀ A, B ∈ V and x, y, z ∈ (V ∪ Σ)*, the rules (A → xBz) and (B → y), applied in that order, are equivalent to A → xyz.
Proof. ∀ u ∈ (V ∪ Σ)*, let
R_1 be A → xBz,
R_2 be B → y, and
R_3 be A → xyz.
If u ⇒^{2,(R_1,R_2)} v, then ∃ w_1 ∈ (V ∪ Σ)* such that u ⇒_{R_1} w_1 ⇒_{R_2} v. Since u ⇒_{R_1} w_1, u = α_1 A α_2 and w_1 = α_1 xBz α_2 for some α_1, α_2 ∈ (V ∪ Σ)*.
Since R_2 = (B → y), α_1 xBz α_2 ⇒_{R_2} α_1 xyz α_2.
That is, w_1 ⇒_{R_2} α_1 xyz α_2.
Therefore, v = α_1 xyz α_2 is such that u ⇒^{2,(R_1,R_2)} v.
Since R_3 = (A → xyz), α_1 A α_2 ⇒_{R_3} α_1 xyz α_2.
That is, u ⇒_{R_3} α_1 xyz α_2.
Therefore, u ⇒_{R_3} v.
Conversely, if u ⇒_{R_3} v, then u = α_1 A α_2 and v = α_1 xyz α_2 for some α_1, α_2 ∈ (V ∪ Σ)*.
Let w_1 = α_1 xBz α_2.
Since R_1 = (A → xBz), α_1 A α_2 ⇒_{R_1} α_1 xBz α_2.
Since R_2 = (B → y), α_1 xBz α_2 ⇒_{R_2} α_1 xyz α_2.
Therefore, α_1 A α_2 ⇒_{R_1} α_1 xBz α_2 ⇒_{R_2} α_1 xyz α_2.
That is, u ⇒_{R_1} w_1 ⇒_{R_2} v.
That is, u ⇒^{2,(R_1,R_2)} v. Combining both directions, (R_1, R_2) and R_3 are equivalent.
Proposition 2.16. Let G = (V, Σ, R, S) be a CFG.
(a)
∀ A, B ∈ V and x, y, z ∈ (V ∪ Σ)*, if A → xBz and B → y are rules, then A ⇒* xyz.
(b)
∀ α, β, γ, β′ ∈ (V ∪ Σ)*, if β ⇒ β′ then αβγ ⇒ αβ′γ.
(c)
∀ α, β, γ, β′ ∈ (V ∪ Σ)*, if β ⇒* β′ then αβγ ⇒* αβ′γ.
(d)
Let α_1, α_2, …, α_n, β_1, β_2, …, β_n, γ_1, γ_2, …, γ_n ∈ (V ∪ Σ)*.
If β_i ⇒* γ_i for i ∈ {1, 2, 3, …, n}, then α_1 β_1 α_2 β_2 α_3 β_3 … α_n β_n ⇒* α_1 γ_1 α_2 γ_2 α_3 γ_3 … α_n γ_n.
In the special case of α_1 = α_2 = … = α_n = ϵ, β_1 β_2 β_3 … β_n ⇒* γ_1 γ_2 γ_3 … γ_n.
Proof. 
(a)
By Proposition 2.8 (i), (B → y) implies B ⇒ y.
By the definition of derivation, xBz ⇒ xyz.
Therefore, A ⇒ xBz ⇒ xyz.
Therefore, A ⇒* xyz.
(b)
Since β ⇒ β′, ∃ β_1, β_2 ∈ (V ∪ Σ)*, A ∈ V, η ∈ (V ∪ Σ)* and a rule A → η such that
β = β_1 A β_2 and β′ = β_1 η β_2.
Therefore, αβγ = α β_1 A β_2 γ and αβ′γ = α β_1 η β_2 γ.
Therefore, αβγ ⇒ αβ′γ.
(c)
Since β ⇒* β′, ∃ u_1, u_2, …, u_n ∈ (V ∪ Σ)*, where n ≥ 0, such that
β ⇒ u_1 ⇒ u_2 ⇒ u_3 ⇒ … ⇒ u_{n−1} ⇒ u_n ⇒ β′.
αβγ ⇒ α u_1 γ (β ⇒ u_1 and (b))
⇒ α u_2 γ (u_1 ⇒ u_2 and (b))
…
⇒ α u_n γ (u_{n−1} ⇒ u_n and (b))
⇒ αβ′γ (u_n ⇒ β′ and (b))
Therefore, αβγ ⇒* αβ′γ.
(d)
α_1 β_1 α_2 β_2 α_3 β_3 … α_n β_n ⇒* α_1 γ_1 α_2 β_2 α_3 β_3 … α_n β_n (β_1 ⇒* γ_1 and (c))
⇒* α_1 γ_1 α_2 γ_2 α_3 β_3 … α_n β_n (β_2 ⇒* γ_2 and (c))
⇒* α_1 γ_1 α_2 γ_2 α_3 γ_3 … α_n β_n (β_3 ⇒* γ_3 and (c))
…
⇒* α_1 γ_1 α_2 γ_2 α_3 γ_3 … α_n γ_n (β_n ⇒* γ_n and (c))
Therefore, α_1 β_1 α_2 β_2 α_3 β_3 … α_n β_n ⇒* α_1 γ_1 α_2 γ_2 α_3 γ_3 … α_n γ_n.
This completes the proof of Proposition 2.16.
By replacing ⇒ with ⇒_{lm} and ⇒* with ⇒*_{lm}, we have the following proposition.
Proposition 2.17. 
Let G = ( V ,   Σ ,   R ,   S ) be a C F G .
(a)
∀ A, B ∈ V and x, y, z ∈ (V ∪ Σ)*, if A ⇒_{lm} xBz and B ⇒_{lm} y then A ⇒*_{lm} xyz.
(b)
∀ α, β, γ, β′ ∈ (V ∪ Σ)*, if β ⇒_{lm} β′ then αβγ ⇒_{lm} αβ′γ.
(c)
∀ α, β, γ, β′ ∈ (V ∪ Σ)*, if β ⇒*_{lm} β′ then αβγ ⇒*_{lm} αβ′γ.
(d)
Let α_1, α_2, …, α_n, β_1, β_2, …, β_n, γ_1, γ_2, …, γ_n ∈ (V ∪ Σ)*.
If β_i ⇒*_{lm} γ_i for i ∈ {1, 2, 3, …, n}, then α_1 β_1 α_2 β_2 α_3 β_3 … α_n β_n ⇒*_{lm} α_1 γ_1 α_2 γ_2 α_3 γ_3 … α_n γ_n.
In the special case of α_1 = α_2 = … = α_n = ϵ, β_1 β_2 β_3 … β_n ⇒*_{lm} γ_1 γ_2 γ_3 … γ_n.
Proposition 2.18. 
Let G = ( V ,   Σ ,   R ,   S ) be a C F G .
Let A, A_1, A_2, …, A_n ∈ V and α_1, α_2, …, α_n, β_1, β_2, …, β_n ∈ (V ∪ Σ)*.
If A → α_1 A_1 β_1, A_1 → α_2 A_2 β_2, …, A_{n−1} → α_n A_n β_n are rules, then A ⇒* α_1 … α_{n−1} α_n A_n β_n β_{n−1} … β_1.
Proof. 
The proof is by induction on n.
(n = 1)
If A → α_1 A_1 β_1 is a rule, then by Proposition 2.8 (i), A ⇒ α_1 A_1 β_1.
Therefore, A ⇒* α_1 A_1 β_1.
(n = k + 1, k ≥ 1)
Assume A → α_1 A_1 β_1, A_1 → α_2 A_2 β_2, …, A_{k−1} → α_k A_k β_k, A_k → α_{k+1} A_{k+1} β_{k+1} are rules.
By the induction hypothesis, A ⇒* α_1 … α_{k−1} α_k A_k β_k β_{k−1} … β_1.
Since A_k → α_{k+1} A_{k+1} β_{k+1} is a rule, by the definition of derivation,
α_1 … α_{k−1} α_k A_k β_k β_{k−1} … β_1 ⇒ α_1 … α_{k−1} α_k α_{k+1} A_{k+1} β_{k+1} β_k β_{k−1} … β_1.
Therefore, A ⇒* α_1 … α_{k−1} α_k A_k β_k β_{k−1} … β_1 ⇒ α_1 … α_{k−1} α_k α_{k+1} A_{k+1} β_{k+1} β_k β_{k−1} … β_1.
Therefore, A ⇒* α_1 … α_{k−1} α_k α_{k+1} A_{k+1} β_{k+1} β_k β_{k−1} … β_1.
This completes the proof of Proposition 2.18.
Proposition 2.19. 
Let G = (V, Σ, R, S) be a CFG, B ∈ V, and R_1, R_2, R_3, …, R_n be rules in R, where n ≥ 0.
Let α_1, α_2, α′_1, α′_2 ∈ (V ∪ Σ)*.
If α_1 B α_2 ⇒^{n,(R_1,R_2,…,R_n)} α′_1 B α′_2, then α_1 x α_2 ⇒^{n,(R_1,R_2,…,R_n)} α′_1 x α′_2 ∀ x ∈ (V ∪ Σ)*, where the two Bs in the two strings are the same occurrence of B (note that there can be more than one B in the string α_1 B α_2) and B is not the head of any rule R_i (i = 1, 2, …, n).
(Note that when n = 0, the statement becomes
(α_1 B α_2 = α′_1 B α′_2) implies (α_1 x α_2 = α′_1 x α′_2 ∀ x ∈ (V ∪ Σ)*).)
Proof. 
For n = 0, α_1 B α_2 = α′_1 B α′_2.
Since the two Bs in the two strings are the same occurrence of B, replacing them with x must yield two equal strings.
Therefore, α_1 x α_2 = α′_1 x α′_2.
Therefore, the statement is true for n = 0.
For n = 1, suppose α_1 B α_2 ⇒_{R_1} α′_1 B α′_2.
Let A → α be the rule represented by R_1.
By the definition of yielding, ∃ u_1, u_2 ∈ (V ∪ Σ)* such that
α_1 B α_2 = u_1 A u_2 and α′_1 B α′_2 = u_1 α u_2.
The B that appears in α_1 B α_2 must also appear in u_1 A u_2.
Since R_1 does not rewrite this particular B, A and B cannot be the same occurrence in the string α_1 B α_2 = u_1 A u_2, and hence there are only two cases to examine: B appears in u_1, or B appears in u_2.
(i)
If B appears in u_1:
Let u′_1 be the string obtained by replacing this B in u_1 with x.
Since α_1 B α_2 = u_1 A u_2, replacing B with x on both sides yields two equal strings.
That is, α_1 x α_2 = u′_1 A u_2.
Since α′_1 B α′_2 = u_1 α u_2, replacing B with x on both sides yields two equal strings.
That is, α′_1 x α′_2 = u′_1 α u_2.
Moreover, u′_1 A u_2 ⇒_{R_1} u′_1 α u_2, since A → α is a rule.
Therefore, α_1 x α_2 ⇒_{R_1} α′_1 x α′_2.
Therefore, the statement is true for n = 1.
(ii)
If B appears in u_2, a similar argument shows that the statement is also true for n = 1.
With the results established for n = 0 and n = 1 and an induction argument, we can conclude that for n ≥ 0,
(α_1 B α_2 ⇒^{n,(R_1,R_2,…,R_n)} α′_1 B α′_2) implies (α_1 x α_2 ⇒^{n,(R_1,R_2,…,R_n)} α′_1 x α′_2 ∀ x ∈ (V ∪ Σ)*).
Proposition 2.20. 
If G = (V, Σ, R, S) is a CFG and there exist u_1, u_2, …, u_k ∈ (V ∪ Σ)*, w ∈ Σ* such that
S ⇒ u_1 ⇒ u_2 ⇒ u_3 ⇒ … ⇒ u_r ⇒ … ⇒ u_k ⇒ w, then
the number of variables in u_r ≤ the number of steps remaining from u_r to w.
Proof. Let n be the number of steps remaining from u_r to w.
n = k + 1 − r.
We will prove this proposition by induction on n.
(For n = 1)
r = k.
Therefore, u_r = u_k ⇒ w.
Since u_k ⇒ w, ∃ α, β, γ ∈ (V ∪ Σ)* and a rule A → γ such that
u_k = αAβ and w = αγβ.
Since w ∈ Σ*, α, β ∈ Σ*.
Therefore, u_k has only one variable.
Therefore, u_r has only one variable.
Therefore, the number of variables in u_r ≤ the number of steps remaining from u_r to w.
(For induction)
The number of steps remaining from u_{r−1} to w is n + 1 = k + 2 − r.
Since u_{r−1} ⇒ u_r, ∃ α, β, γ ∈ (V ∪ Σ)* and a rule A → γ such that
u_{r−1} = αAβ and u_r = αγβ.
Let m be the number of variables in u_{r−1}.
The number of variables in u_r = m − 1 + (the number of variables in γ) ≥ m − 1.
By the induction hypothesis, the number of variables in u_r ≤ n = k + 1 − r.
Therefore, m − 1 ≤ the number of variables in u_r ≤ n = k + 1 − r.
Therefore, m − 1 ≤ k + 1 − r.
Therefore, m ≤ k + 2 − r.
Therefore, the number of variables in u_{r−1} ≤ the number of steps remaining from u_{r−1} to w.
This completes the proof of Proposition 2.20.
Example 2.21. 
Let G = (V, Σ, R, S) be a CFG and suppose there exist u_1, u_2, …, u_k ∈ (V ∪ Σ)*, w ∈ Σ* such that
S ⇒ u_1 ⇒ u_2 ⇒ u_3 ⇒ … ⇒ u_r ⇒ … ⇒ u_k ⇒ w. Show that the statement
(the number of variables in u_r = the number of steps remaining from u_r to w) is not always true.
(Hint: Consider V = {S, A, B, C}, Σ = {a, b, c}, R = {S → AB, A → C, C → c, B → b} and
S ⇒_{S→AB} AB ⇒_{A→C} CB ⇒_{C→c} cB ⇒_{B→b} cb.)
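The hint's derivation can be tabulated to see where equality fails while Proposition 2.20's inequality still holds. A small sketch, not from the text; the variable/step counts below are computed from the hint's derivation:

```python
# The hint of Example 2.21 replayed: S ⇒ AB ⇒ CB ⇒ cB ⇒ cb.
# For each sentential form u_r we pair the number of variables in u_r
# with the number of steps remaining from u_r to the final string w.
variables = {"S", "A", "B", "C"}
derivation = ["S", "AB", "CB", "cB", "cb"]   # S ⇒ u_1 ⇒ u_2 ⇒ u_3 ⇒ w

counts = [(sum(ch in variables for ch in u), len(derivation) - 1 - r)
          for r, u in enumerate(derivation)]
print(counts)  # [(1, 4), (2, 3), (2, 2), (1, 1), (0, 0)]

# Proposition 2.20's inequality (# variables ≤ # steps remaining)
# holds everywhere, but at u_1 = AB it is strict: 2 variables, 3 steps.
for n_vars, steps_left in counts:
    assert n_vars <= steps_left
```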
Definition 2.22. 
∀ α, β ∈ (V ∪ Σ)*, α is a substring of β (written as α ⊑ β) if ∃ α′, α″ ∈ (V ∪ Σ)* such that
β = α′ α α″. α′ is called the left complement of α in β, written as LC(α). α″ is called the right complement of α in β, written as RC(α).
Proposition 2.23. 
For any strings α_1, α_2, u such that α_1, α_2 ⊑ u, if α_1 ⊑ α_2, then
(i)
LC(α_2) ⊑ LC(α_1), with LC(α_2) r = LC(α_1) for some string r; and
(ii)
RC(α_2) ⊑ RC(α_1), with l RC(α_2) = RC(α_1) for some string l.
Proof. 
α_1 ⊑ u means u = x_1 α_1 y_1 for some strings x_1, y_1.
α_2 ⊑ u means u = x_2 α_2 y_2 for some strings x_2, y_2.
x_1 = LC(α_1); y_1 = RC(α_1).
x_2 = LC(α_2); y_2 = RC(α_2).
α_1 ⊑ α_2 means α_2 = r α_1 l for some strings r and l.
Therefore, u = x_2 α_2 y_2 = x_2 r α_1 l y_2.
Since u = x_1 α_1 y_1, x_1 α_1 y_1 = x_2 r α_1 l y_2.
Therefore, x_1 = x_2 r and y_1 = l y_2.
Therefore, LC(α_1) = LC(α_2) r and RC(α_1) = l RC(α_2).
Therefore, LC(α_2) ⊑ LC(α_1) and RC(α_2) ⊑ RC(α_1).
This completes the proof of Proposition 2.23.
Definition 2.24. 
For any strings α_1, α_2, u such that α_1, α_2 ⊑ u, α_1 is said to be to the left of α_2 if there exist strings x, y, z such that u = x α_1 y α_2 z.
Proposition 2.25. 
Let G = (V, Σ, R, S) be a CFG.
Let u_0 ⇒ u_1 ⇒ u_2 ⇒ u_3 ⇒ … ⇒ u_n, where u_0, u_1, u_2, u_3, …, u_n ∈ (V ∪ Σ)*.
Let 0 ≤ i < j ≤ n.
If α_i ⊑ u_i, then ∃ α_{i+1}, α_{i+2}, …, α_j with α_{i+1} ⊑ u_{i+1}, α_{i+2} ⊑ u_{i+2}, …, α_j ⊑ u_j such that
α_i ⇒^{λ_1} α_{i+1} ⇒^{λ_2} α_{i+2} ⇒^{λ_3} α_{i+3} … ⇒^{λ_{j−i}} α_j, where λ_1, λ_2, …, λ_{j−i} ∈ {0, 1}.
Hence, α_i ⇒* α_j in no more than j − i steps.
α_j is called the (j − i)-step expansion of α_i within the derivation u_0 ⇒* u_n, and it is written as α_j = Expan(α_i, j − i).
Proof. Let k = j − i.
1 ≤ k ≤ n. This proposition can be proved by induction on k.
(k = 1):
j = i + 1.
Since u_i ⇒ u_{i+1}, ∃ α, β, γ ∈ (V ∪ Σ)* and a rule A → γ such that
u_i = αAβ and u_{i+1} = αγβ.
Since α_i ⊑ u_i, ∃ α′, β′ ∈ (V ∪ Σ)* such that u_i = α′ α_i β′.
(i)
If A ⊑ α_i:
∃ α″, β″ ∈ (V ∪ Σ)* such that α_i = α″ A β″.
u_i = α′ α_i β′ = α′ α″ A β″ β′.
Also, u_i = αAβ.
Therefore, αAβ = α′ α″ A β″ β′.
Therefore, α = α′ α″ and β = β″ β′.
Since u_{i+1} = αγβ, u_{i+1} = α′ α″ γ β″ β′.
Take α_{i+1} = α″ γ β″.
Since α_i = α″ A β″ and A → γ is a rule, α_i ⇒ α_{i+1}.
(ii)
If A is not a substring of α_i: since u_i = αAβ and α_i ⊑ u_i, either α_i ⊑ α or α_i ⊑ β.
Since u_{i+1} = αγβ, α ⊑ u_{i+1} and β ⊑ u_{i+1}.
Therefore, (α_i ⊑ α or α_i ⊑ β) implies α_i ⊑ u_{i+1}.
Take α_{i+1} = α_i.
Therefore, α_i ⇒^0 α_{i+1}.
Combining (i) and (ii), α_i ⇒^{λ_1} α_{i+1}, where λ_1 ∈ {0, 1}.
(Induction):
By the induction assumption,
α_i ⇒^{λ_1} α_{i+1} ⇒^{λ_2} α_{i+2} ⇒^{λ_3} α_{i+3} … ⇒^{λ_{j−i}} α_j, where λ_1, λ_2, …, λ_{j−i} ∈ {0, 1} and
α_{i+1} ⊑ u_{i+1}, α_{i+2} ⊑ u_{i+2}, …, α_j ⊑ u_j.
Since α_j ⊑ u_j and u_j ⇒ u_{j+1}, by applying the same argument as in the case (k = 1), we can find α_{j+1} ⊑ u_{j+1} such that α_j ⇒^{λ_{j−i+1}} α_{j+1}, where λ_{j−i+1} ∈ {0, 1}.
We now have α_i ⇒^{λ_1} α_{i+1} ⇒^{λ_2} α_{i+2} ⇒^{λ_3} α_{i+3} … ⇒^{λ_{j−i}} α_j ⇒^{λ_{j−i+1}} α_{j+1}, where
λ_1, λ_2, …, λ_{j−i}, λ_{j−i+1} ∈ {0, 1} and α_{i+1} ⊑ u_{i+1}, α_{i+2} ⊑ u_{i+2}, …, α_j ⊑ u_j, α_{j+1} ⊑ u_{j+1}.
This completes the proof of Proposition 2.25.
Proposition 2.26. 
Let G = (V, Σ, R, S) be a CFG.
Let u_0 ⇒ u_1 ⇒ u_2 ⇒ u_3 ⇒ … ⇒ u_n, where u_0, u_1, u_2, u_3, …, u_n ∈ (V ∪ Σ)*.
Let 0 ≤ i ≤ j ≤ n, α_i ⊑ u_i, α′_i ⊑ u_i, α_j = Expan(α_i, j − i) and α′_j = Expan(α′_i, j − i).
If α_i is to the left of α′_i within u_i, then α_j is to the left of α′_j within u_j.
Proof. 
Let k = j − i.
0 ≤ k ≤ n.
We can prove this proposition by induction on k.
(k = 0)
j = i, so α_i = α_j and α′_i = α′_j.
Therefore, α_i ⇒^0 α_j and α′_i ⇒^0 α′_j.
α_j = Expan(α_i, 0) and α′_j = Expan(α′_i, 0).
(α_i is to the left of α′_i) implies (α_j is to the left of α′_j).
The statement is true for k = 0.
(Induction)
Induction Hypothesis:
(α_i is to the left of α′_i) implies (α_{i+k} is to the left of α′_{i+k}).
Since α_{i+k} is to the left of α′_{i+k}, there exist x, y, z ∈ (V ∪ Σ)* such that
u_{i+k} = x α_{i+k} y α′_{i+k} z.
Since u_{i+k} ⇒ u_{i+k+1}, there exists a rule A → γ such that
u_{i+k} = αAβ and u_{i+k+1} = αγβ.
We now have five situations to examine: A ⊑ x, A ⊑ α_{i+k}, A ⊑ y, A ⊑ α′_{i+k}, A ⊑ z.
(i)
A ⊑ x
RC(A) = l RC(x) for some string l (Proposition 2.23).
β = l α_{i+k} y α′_{i+k} z (RC(A) = β, RC(x) = α_{i+k} y α′_{i+k} z).
u_{i+k+1} = αγβ = αγ l α_{i+k} y α′_{i+k} z.
Take α_{i+k+1} = α_{i+k} and α′_{i+k+1} = α′_{i+k}.
Therefore, α_{i+k} ⇒^0 α_{i+k+1} and α′_{i+k} ⇒^0 α′_{i+k+1}.
In addition, u_{i+k+1} = αγ l α_{i+k+1} y α′_{i+k+1} z.
Therefore, α_{i+k+1} is to the left of α′_{i+k+1}.
(ii)
A ⊑ α_{i+k}
∃ α′, β′ ∈ (V ∪ Σ)* such that
α_{i+k} = α′ A β′.
u_{i+k} = x α_{i+k} y α′_{i+k} z = x α′ A β′ y α′_{i+k} z. Since u_{i+k} = αAβ, α = x α′ and β = β′ y α′_{i+k} z.
Therefore, u_{i+k+1} = αγβ = x α′ γ β′ y α′_{i+k} z.
Take α_{i+k+1} = α′ γ β′ and α′_{i+k+1} = α′_{i+k}.
Now, u_{i+k+1} = x α_{i+k+1} y α′_{i+k+1} z.
So, α_{i+k+1} is to the left of α′_{i+k+1}.
In addition, α_{i+k} ⇒ α_{i+k+1}, because α_{i+k} = α′ A β′ and α_{i+k+1} = α′ γ β′.
Also, α′_{i+k} ⇒^0 α′_{i+k+1}, because α′_{i+k+1} = α′_{i+k}.
(iii)
A ⊑ y
With a similar argument as in (i), we can show that ∃ α_{i+k+1}, α′_{i+k+1} in u_{i+k+1} such that α_{i+k} ⇒^λ α_{i+k+1} and α′_{i+k} ⇒^{λ′} α′_{i+k+1}, where λ, λ′ ∈ {0, 1}, and
α_{i+k+1} is to the left of α′_{i+k+1}.
(iv)
A ⊑ α′_{i+k}
With a similar argument as in (ii), we can show that ∃ α_{i+k+1}, α′_{i+k+1} in u_{i+k+1} such that α_{i+k} ⇒^λ α_{i+k+1} and α′_{i+k} ⇒^{λ′} α′_{i+k+1}, where λ, λ′ ∈ {0, 1}, and
α_{i+k+1} is to the left of α′_{i+k+1}.
(v)
A ⊑ z
With a similar argument as in (i), we can show that ∃ α_{i+k+1}, α′_{i+k+1} in u_{i+k+1} such that α_{i+k} ⇒^λ α_{i+k+1} and α′_{i+k} ⇒^{λ′} α′_{i+k+1}, where λ, λ′ ∈ {0, 1}, and
α_{i+k+1} is to the left of α′_{i+k+1}.
Combining (i) to (v) and the induction hypothesis, we now have:
If α_i is to the left of α′_i within u_i, then α_{i+k+1} is to the left of α′_{i+k+1} within u_{i+k+1}.
This completes the proof of Proposition 2.26.
Proposition 2.27. 
Let G = (V, Σ, R, S) be a CFG.
Let u_0 ⇒ u_1 ⇒ … ⇒ u_i ⇒ u_{i+1} ⇒ … ⇒ u_n, where u_0, u_1, …, u_i, u_{i+1}, …, u_n ∈ (V ∪ Σ)* and 0 ≤ i < n.
Let α_i, β_i ∈ (V ∪ Σ)*.
If α_i β_i ⊑ u_i, then Expan(α_i β_i, 1) = Expan(α_i, 1) Expan(β_i, 1).
Proof. 
Since α_i β_i ⊑ u_i,
u_i = x α_i β_i y for some x, y ∈ (V ∪ Σ)*.
Since u_i ⇒ u_{i+1}, ∃ α, β, γ ∈ (V ∪ Σ)* and a rule A → γ such that
u_i = αAβ and u_{i+1} = αγβ.
Since A ⊑ u_i and u_i = x α_i β_i y, we have four cases to examine:
A ⊑ x, A ⊑ α_i, A ⊑ β_i, A ⊑ y.
(i)
A ⊑ x
x = x′ A y′ for some x′, y′ ∈ (V ∪ Σ)*.
u_i = x α_i β_i y = x′ A y′ α_i β_i y.
Since u_i is also equal to αAβ, αAβ = x′ A y′ α_i β_i y.
Therefore, α = x′ and β = y′ α_i β_i y.
Since u_{i+1} = αγβ, u_{i+1} = x′ γ y′ α_i β_i y.
Now we have α_i, β_i, α_i β_i ⊑ u_i and α_i, β_i, α_i β_i ⊑ u_{i+1}.
In addition, α_i ⇒^0 α_i, β_i ⇒^0 β_i, α_i β_i ⇒^0 α_i β_i.
Therefore, Expan(α_i β_i, 1) = α_i β_i, Expan(α_i, 1) = α_i and Expan(β_i, 1) = β_i.
Therefore, Expan(α_i β_i, 1) = Expan(α_i, 1) Expan(β_i, 1).
(ii)
A ⊑ α_i
α_i = x′ A y′ for some x′, y′ ∈ (V ∪ Σ)*.
Since u_i = x α_i β_i y, u_i = x x′ A y′ β_i y.
Since u_i = αAβ, αAβ = x x′ A y′ β_i y.
Therefore, α = x x′ and β = y′ β_i y.
Since u_{i+1} = αγβ, u_{i+1} = x x′ γ y′ β_i y.
Let α_{i+1} = x′ γ y′. Then u_{i+1} = x α_{i+1} β_i y.
Since α_i = x′ A y′ and A → γ is a rule, α_i ⇒ α_{i+1}.
Since α_i ⊑ u_i and α_{i+1} ⊑ u_{i+1}, Expan(α_i, 1) = α_{i+1}.
Since β_i ⊑ u_i, β_i ⊑ u_{i+1} and β_i ⇒^0 β_i, Expan(β_i, 1) = β_i.
Since α_i = x′ A y′, α_i β_i = x′ A y′ β_i.
x′ A y′ β_i ⇒ x′ γ y′ β_i, because A → γ is a rule.
Therefore, α_i β_i ⇒ x′ γ y′ β_i.
Therefore, α_i β_i ⇒ α_{i+1} β_i (α_{i+1} = x′ γ y′).
Since α_i β_i ⊑ u_i and α_{i+1} β_i ⊑ u_{i+1}, Expan(α_i β_i, 1) = α_{i+1} β_i.
Therefore, Expan(α_i β_i, 1) = Expan(α_i, 1) Expan(β_i, 1).
(iii)
A ⊑ β_i
With a similar argument as in (ii), we can show that Expan(α_i β_i, 1) = Expan(α_i, 1) Expan(β_i, 1).
(iv)
A ⊑ y
With a similar argument as in (i), we can show that
Expan(α_i β_i, 1) = Expan(α_i, 1) Expan(β_i, 1).
This completes the proof of Proposition 2.27.
Proposition 2.28. 
Let G = (V, Σ, R, S) be a CFG.
Let u_0 ⇒ u_1 ⇒ … ⇒ u_i ⇒ … ⇒ u_n, where u_0, u_1, …, u_i, …, u_n ∈ (V ∪ Σ)* and 0 ≤ i ≤ n.
(i)
For 0 ≤ k ≤ n − i, Expan(α_i^1 α_i^2 … α_i^m, k) = Expan(α_i^1, k) Expan(α_i^2, k) … Expan(α_i^m, k), where
α_i^1 α_i^2 … α_i^m ⊑ u_i.
(ii)
If u_0 = X_1 X_2 … X_m, where X_1, X_2, …, X_m ∈ V ∪ Σ, and u_n = w ∈ Σ*, then ∃ w_1, w_2, …, w_m ∈ Σ* such that X_i ⇒* w_i in no more than n steps and
w = w_1 w_2 … w_m.
Proof. 
Claim.
∀ α_i, β_i ∈ (V ∪ Σ)* such that α_i β_i ⊑ u_i and 0 ≤ k ≤ n − i,
Expan(α_i β_i, k) = Expan(α_i, k) Expan(β_i, k).
This Claim can be proved by induction on k.
(k = 0)
Expan(α_i β_i, 0) = α_i β_i.
Expan(α_i, 0) = α_i and Expan(β_i, 0) = β_i.
Therefore, Expan(α_i β_i, 0) = Expan(α_i, 0) Expan(β_i, 0).
The statement is true for k = 0.
(Induction)
Induction Hypothesis:
Expan(α_i β_i, k) = Expan(α_i, k) Expan(β_i, k), where
Expan(α_i β_i, k), Expan(α_i, k), Expan(β_i, k) ⊑ u_{i+k}.
Expan(α_i β_i, k + 1) = Expan(Expan(α_i β_i, k), 1)
= Expan(Expan(α_i, k) Expan(β_i, k), 1) (Induction Hypothesis)
= Expan(Expan(α_i, k), 1) Expan(Expan(β_i, k), 1) (Proposition 2.27)
= Expan(α_i, k + 1) Expan(β_i, k + 1). This completes the proof of the Claim.
The proof of (i) is by induction on m.
(m = 1)
LHS = Expan(α_i^1, k).
RHS = Expan(α_i^1, k).
Therefore, the statement is true for m = 1.
(Induction)
Induction Hypothesis: Expan(α_i^1 α_i^2 … α_i^m, k) = Expan(α_i^1, k) Expan(α_i^2, k) … Expan(α_i^m, k).
Expan(α_i^1 α_i^2 … α_i^m α_i^{m+1}, k) = Expan(α_i^1 α_i^2 … α_i^m, k) Expan(α_i^{m+1}, k) (Claim)
= Expan(α_i^1, k) Expan(α_i^2, k) … Expan(α_i^m, k) Expan(α_i^{m+1}, k) (Induction Hypothesis)
This completes the proof of (i).
(ii)
Set i = 0 and k = n in the result of (i).
Let α_0^1 = X_1, α_0^2 = X_2, …, α_0^m = X_m.
u_0 = X_1 X_2 … X_m = α_0^1 α_0^2 … α_0^m. Therefore, α_0^1 α_0^2 … α_0^m ⊑ u_0.
By (i), Expan(α_0^1 α_0^2 … α_0^m, n) = Expan(α_0^1, n) Expan(α_0^2, n) … Expan(α_0^m, n). Therefore, Expan(X_1 X_2 … X_m, n) = Expan(X_1, n) Expan(X_2, n) … Expan(X_m, n).
Expan(X_1 X_2 … X_m, n) = Expan(u_0, n) = u_n = w.
Therefore, Expan(X_1, n) Expan(X_2, n) … Expan(X_m, n) = w.
Therefore, Expan(X_i, n) = w_i for some w_i ∈ Σ*, i ∈ {1, 2, …, m}.
Therefore, w = w_1 w_2 … w_m and
X_i ⇒* w_i in no more than n steps.
Proposition 2.29. 
Let G = (V, Σ, R, S) be a CFG, α, β ∈ (V ∪ Σ)*, X ∈ V and w ∈ Σ*.
If αXβ ⇒* w, then X ⇒* w′ for some w′ ∈ Σ*.
Proof. 
∃ n ≥ 1 such that αXβ ⇒^n w.
Expan(αXβ, n) = w.
Expan(α, n) Expan(X, n) Expan(β, n) = w (Proposition 2.28)
Expan(X, n) = w′ for some w′ ∈ Σ*.
X ⇒* w′ for some w′ ∈ Σ* (Proposition 2.25)
This completes the proof of Proposition 2.29.
Example 2.30. Prove that the non-regular set A = {a^n b^n | n ≥ 0} is a CFL.
Proof. 
Let G = (V, Σ, R, S) be a CFG such that
V = {S}, Σ = {a, b}, R = {S → aSb, S → ϵ}.
In short form, S → aSb | ϵ.
Claim 1. 
If S ⇒^{n+1} α, n ≥ 0, where α ∈ (V ∪ Σ)*, then
∃ γ ∈ (V ∪ Σ)* such that S ⇒^n γ and γ ⇒^1 α and γ = a^n S b^n.
Claim 1 can be proved by induction on n.
For n = 0, if S ⇒^1 α, by definition, ∃ γ ∈ (V ∪ Σ)* such that S ⇒^0 γ and γ ⇒^1 α.
Therefore, S = γ.
Therefore, γ = a^0 S b^0.
Therefore, the statement is true for n = 0.
Assume the statement is true for n = k, for k ≥ 0.
That is, S ⇒^{k+1} α ⟹ ∃ γ ∈ (V ∪ Σ)* such that S ⇒^k γ & γ ⇒^1 α & γ = a^k S b^k, for k ≥ 0 and α ∈ (V ∪ Σ)*.
For n = k + 1, assume S ⇒^{k+2} α.
By definition, ∃ γ′ ∈ (V ∪ Σ)* such that
S ⇒^{k+1} γ′ and γ′ ⇒^1 α.
By the induction assumption, ∃ γ ∈ (V ∪ Σ)* such that S ⇒^k γ and γ ⇒^1 γ′ and γ = a^k S b^k.
There are only two rules in R, namely S → aSb and S → ϵ.
If we use S → ϵ on γ ⇒^1 γ′, then a^k S b^k ⇒_{S→ϵ} γ′.
By Proposition 2.13 (ii), γ′ = a^k ϵ b^k = a^k b^k.
This contradicts the conclusion γ′ ⇒^1 α derived above, because a^k b^k does not contain a variable.
Therefore, we must use the rule S → aSb.
Therefore, γ ⇒_{S→aSb} γ′.
Therefore, a^k S b^k ⇒_{S→aSb} γ′.
Again by Proposition 2.13 (ii), γ′ = a^k (aSb) b^k = a^{k+1} S b^{k+1}.
This completes the proof of Claim 1.
Claim 2. 
S ⇒^{n+1} a^n b^n ∀ n ≥ 0.
For n = 0, S ⇒_{S→ϵ} ϵ by Proposition 2.13 (i).
Therefore, S ⇒^1 a^0 b^0 and hence the statement is true for n = 0.
For n ≥ 1, by Proposition 2.13 (i) & (ii),
S ⇒_{S→aSb} aSb ⇒_{S→aSb} a^2 S b^2 ⇒_{S→aSb} a^3 S b^3 ⋯ ⇒_{S→aSb} a^n S b^n.
Therefore, S ⇒^n a^n S b^n.
In addition, a^n S b^n ⇒_{S→ϵ} a^n b^n by Proposition 2.13 (ii).
Therefore, S ⇒^{n+1} a^n b^n.
This completes the proof of Claim 2.
It remains to show that L(G) = A.
u ∈ A ⟹ u = a^n b^n for some n ≥ 0
⟹ S ⇒^{n+1} u (by Claim 2)
⟹ u ∈ L(G).
Conversely, if u ∈ L(G), then u ∈ Σ* and
S ⇒^{n+1} u for some n ≥ 0.
∃ γ ∈ (V ∪ Σ)* such that S ⇒^n γ and γ ⇒^1 u and γ = a^n S b^n, by Claim 1.
Since there are only two rules in R, either γ ⇒_{S→aSb} u or γ ⇒_{S→ϵ} u.
γ ⇒_{S→aSb} u
⟹ a^n S b^n ⇒_{S→aSb} u
⟹ a^{n+1} S b^{n+1} = u (Proposition 2.13 (ii)),
a contradiction to u ∈ Σ*.
Therefore, we must use γ ⇒_{S→ϵ} u.
Therefore, a^n S b^n ⇒_{S→ϵ} u.
a^n ϵ b^n = u by Proposition 2.13 (ii).
Therefore, u = a^n b^n and hence u ∈ A.
Combining both directions, L(G) = A.
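The grammar of Example 2.30 is small enough to check mechanically. The sketch below (illustrative only, not part of the paper's formal development; the dictionary encoding and function name are my own) enumerates every sentential form derivable from S, pruning forms whose terminals already exceed a length bound, and collects the terminal strings that remain. For this grammar each form contains at most one variable, so the enumeration terminates.

```python
def language_up_to(rules, start, max_len):
    """All terminal strings of length <= max_len derivable from start.
    Variables are the uppercase symbols; rules maps a variable to the
    list of its rule bodies (the empty string stands for epsilon)."""
    forms = {start}              # sentential forms discovered so far
    terminal_strings = set()
    changed = True
    while changed:
        changed = False
        for form in list(forms):
            # leftmost variable, if any
            idx = next((i for i, c in enumerate(form) if c.isupper()), None)
            if idx is None:      # no variable left: a terminal string
                terminal_strings.add(form)
                continue
            for body in rules[form[idx]]:
                new = form[:idx] + body + form[idx + 1:]
                # prune forms that already carry too many terminals
                if sum(1 for c in new if not c.isupper()) <= max_len and new not in forms:
                    forms.add(new)
                    changed = True
    return terminal_strings
```

Running it with `rules = {"S": ["aSb", ""]}` and bound 6 yields exactly {ϵ, ab, aabb, aaabbb}, in agreement with L(G) = A.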
Before proceeding to the proof of some important theorems in CFG, we need to review some tree terminology from graph theory. Readers are assumed to have some background in the subject matter, and the following facts are stated without proof.
T1. A tree is a directed acyclic graph (DAG).
T2. Trees are collections of nodes and edges.
T3. If ( A ,   B ) is the directed edge from node A to node B , A is called the parent and B is called the child.
T4. A node has at most one parent, drawn above the node and zero or more children, drawn below.
T5. There is one node that has no parent. This node is called the root and appears at the top of the tree. Nodes that have no children are called leaves. Nodes that are not leaves are called interior nodes.
T6. A simple directed path from v_0 to v_n is represented by (v_0, v_1, v_2, …, v_n), where the (v_i, v_{i+1}) with i ∈ {0, 1, 2, …, n−1} are directed edges joining the nodes v_0, v_1, v_2, …, v_n of the tree and v_i ≠ v_j for i ≠ j. The length of the simple directed path is equal to the number of directed edges connecting the nodes v_0, v_1, v_2, …, v_n and is equal to n in this case.
T7. For any two nodes A and B, if there is a simple directed path from A to B, B is a descendant of A and A is an ancestor of B. Since every simple directed path from A to B must pass through a child of A, there is a simple directed path from one of A's children to B.
T8. There is a unique simple directed path from the root to any other node.
T9. Let d(r, l) = the length of the path from the root r to a leaf l. The height of the tree is defined as h = Max{d(r, l) | r = root; l = a leaf}. Therefore, the height of a tree is the length of the longest path from the root to a leaf.
T10. The length of the path from the root to a node v is called the level of v .
T11. The simple directed path from an interior node to a leaf is called a branch. The combination of all branches is the largest subtree with the interior node as the root. The length of any branch is no longer than the height of the subtree which in turn is no longer than the height of the parent tree.
T12. The children of a node are ordered from left to right. If node A is to the left of node B, then all the descendants of A are to the left of all the descendants of B at the same level.
T13. A subtree is a tree of which the vertices and edges are also the vertices and edges of the parent tree. If a subtree has a leaf, the leaf is also a leaf of the parent tree.
Definition 2.31. 
For any context-free grammar, G = ( V ,   Σ ,   R ,   S ) , a parse tree for G is a tree that satisfies the following conditions:
(i)
Each interior node is labeled as a variable in V .
(ii)
Each leaf is labeled either as a variable in V, a terminal in Σ, or ϵ.
(iii)
If an interior node labeled A (a variable) has children X_1, X_2, X_3, …, X_n, where X_i ∈ V ∪ Σ for i ∈ {1, 2, …, n}, then
A → X_1 X_2 X_3 ⋯ X_n is a rule in R.
(iv)
If an interior node labeled A (a variable) has ϵ as a child, then ϵ is the only child of A and A → ϵ is a rule in R.
Note that any subtree of a parse tree is also a parse tree.
Definition 2.32. The yield of a parse tree is the concatenation of all the leaves of the tree from left to right.
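Definition 2.32 can be read off directly from a recursive traversal. In the sketch below (an illustrative encoding, not the paper's notation), a parse tree is a (label, children) pair, a leaf is a bare string, and the empty string stands for ϵ:

```python
def tree_yield(tree):
    """Yield of a parse tree (Definition 2.32): the concatenation of
    the leaf labels from left to right. A tree is a (label, children)
    pair; a leaf is a bare string, and '' stands for epsilon."""
    if isinstance(tree, str):
        return tree
    _label, children = tree
    return "".join(tree_yield(child) for child in children)

# a parse tree for aabb in the grammar S -> aSb | epsilon of Example 2.30
t = ("S", ["a", ("S", ["a", ("S", [""]), "b"]), "b"])
```

Here `tree_yield(t)` returns the string aabb, matching the yield one reads off the leaves of the tree.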
Theorem 2.33. Let G = (V, Σ, R, S) be a CFG. The following statements are equivalent.
(i)
∃ a parse tree with root A ∈ V and yield w ∈ Σ*.
(ii)
A ⇒*_{lm} w, w ∈ Σ*.
(iii)
A ⇒* w, w ∈ Σ*.
Proof. “(i) ⟹ (ii)”
This can be proved by an induction on the height of the tree in statement (i).
Let h (≥ 1) be the height of the parse tree in statement (i).
(h = 1)
The parse tree looks like the following figure.
Figure 2.1. Caption.
By definition of parse tree, A → X_1 X_2 X_3 ⋯ X_n is a rule in R.
By Proposition 2.8(i), A ⇒ X_1 X_2 X_3 ⋯ X_n.
Therefore, A ⇒* X_1 X_2 X_3 ⋯ X_n.
The yield of this tree is X_1 X_2 X_3 ⋯ X_n, which is equal to w by statement (i).
Therefore, A ⇒* w.
Since A is the only variable in the string A, it is therefore also the leftmost variable in the string A.
Therefore, A ⇒*_{lm} w.
Hence, the statement “(i) ⟹ (ii)” is true for h = 1.
“Induction”
Let k be an integer such that k 1 .
Induction Hypothesis:
The statement “(i) ⟹ (ii)” is true for any parse tree with height h if h ≤ k.
Consider now a parse tree Pt(A, w, k+1) that has root A, yield w and a height of k + 1.
This parse tree looks like the following figure.
Figure 2.2. Caption.
∀ i ∈ {1, 2, …, n}, X_i ∈ V ∪ Σ.
There are two cases to examine.
(a)
X_i ∈ Σ
⟹ X_i = w_i for some w_i ∈ Σ.
⟹ X_i ⇒^0 w_i.
⟹ X_i ⇒* w_i.
⟹ X_i ⇒*_{lm} w_i (trivially, since the derivation has zero steps)
Furthermore, since X_i ∈ Σ, X_i = w_i is a leaf.
Therefore, w_i ⊑ w.
(b)
X_i ∈ V
By T11 and T13, the combination of all branches of X_i forms a subtree of Pt(A, w, k+1), and every leaf of the subtree is also a leaf of the parent tree.
Let w_i be the yield of X_i.
By definition of yield, every symbol in w_i is a leaf and therefore a symbol in w.
Therefore, w_i ⊑ w.
Since w ∈ Σ*, w_i ∈ Σ*.
Claim: w = w_1 w_2 ⋯ w_n.
By T12, w_i is to the left of w_j for i < j, since X_i is to the left of X_j.
Therefore, w = x_0 w_1 x_1 w_2 ⋯ w_n x_n where x_0, x_1, …, x_n ∈ Σ*.
Let l be a symbol in w.
l is a leaf in Pt(A, w, k+1) because w is the yield.
By T8, there is a simple directed path from A to l.
By T7, there is a simple directed path from X_i to l for some i ∈ {1, 2, …, n}.
Since l has no children, l must be a leaf descendant of X_i.
Therefore, l is a symbol in w_i because w_i is the yield of the subtree with root X_i.
Therefore, l is a symbol in w ⟹ l is a symbol in w_i for some i ∈ {1, 2, …, n}.
Therefore, w ⊑ w_1 w_2 ⋯ w_n.
Therefore, x_0 w_1 x_1 w_2 ⋯ w_n x_n ⊑ w_1 w_2 ⋯ w_n.
This means that x_0 = x_1 = ⋯ = x_n = ϵ.
Therefore, w = w_1 w_2 ⋯ w_n.
Now, back to the subtree with root X_i and yield w_i.
The height of this subtree = the length of the longest branch in the subtree
= the length of a simple directed path in the parent tree from X_i to a leaf l
= (the length of a simple directed path in the parent tree from A to a leaf l) − 1 (by T7 and the fact that X_i is a child of A)
≤ the height of the parent tree − 1 = k + 1 − 1 = k.
By the induction hypothesis, X_i ⇒*_{lm} w_i.
Combining (a) & (b), we now have X_i ⇒*_{lm} w_i for all i ∈ {1, 2, …, n} and w = w_1 w_2 ⋯ w_n.
For the parent tree Pt(A, w, k+1),
A ⇒ X_1 X_2 X_3 ⋯ X_n (Proposition 2.8(i))
A ⇒_{lm} X_1 X_2 X_3 ⋯ X_n (A is the only variable in the head)
Since X_i ⇒*_{lm} w_i and by Proposition 2.17,
X_1 X_2 X_3 ⋯ X_n ⇒*_{lm} w_1 w_2 ⋯ w_n.
Therefore, A ⇒_{lm} X_1 X_2 X_3 ⋯ X_n ⇒*_{lm} w_1 w_2 ⋯ w_n.
⟹ A ⇒*_{lm} w_1 w_2 ⋯ w_n.
Since w = w_1 w_2 ⋯ w_n, A ⇒*_{lm} w.
The statement “(i) ⟹ (ii)” is true for h = k + 1.
This completes the proof of “(i) ⟹ (ii)”.
“(ii) ⟹ (iii)”
The proof of this statement is trivial because every leftmost derivation is a derivation.
“(iii) ⟹ (i)”
Since A ⇒* w, ∃ n ≥ 1 such that A ⇒^n w. (Note that n ≠ 0 because A ∈ V and w ∈ Σ*.)
The proof of this statement, “(iii) ⟹ (i)”, is by induction on n.
(n = 1)
∃ w_1, w_2, …, w_m ∈ Σ such that w = w_1 w_2 ⋯ w_m & A ⇒ w_1 w_2 ⋯ w_m.
By Proposition 2.8(i), A → w_1 w_2 ⋯ w_m is a rule in R.
The following is a parse tree with root A and yield w .
Figure 2.3. Caption.
Therefore, the statement is true for n = 1 .
(Induction)
Induction Hypothesis:
Let k be an integer such that k 1 .
For any n ≤ k, if A ⇒^n w, then ∃ a parse tree with root A and yield w.
Now, consider n = k + 1.
If A ⇒^{k+1} w,
∃ u_1, u_2, …, u_k ∈ (V ∪ Σ)* such that
A ⇒ u_1 ⇒ u_2 ⇒ ⋯ ⇒ u_k ⇒ w.
∃ X_1, X_2, …, X_m ∈ V ∪ Σ such that u_1 = X_1 X_2 ⋯ X_m.
Therefore, X_1 X_2 ⋯ X_m ⇒ u_2 ⇒ ⋯ ⇒ u_k ⇒ w.
By Proposition 2.28(ii),
X_i ⇒^{n_i} w_i with n_i ≤ k and w_1 w_2 ⋯ w_m = w.
By the induction hypothesis, ∃ a parse tree with root X_i and yield w_i, which looks like the following figure.
Figure 2.4. Caption.
We can now construct a parse tree Pt(A, w, k+1) as follows.
(1)
Start with a one-level parse tree that has root A and yield X_1 X_2 ⋯ X_m, as shown in the following figure.
Figure 2.5. Caption.
(2)
For each i ∈ {1, 2, …, m}, if X_i ∈ Σ, then X_i = w_i for some w_i ∈ Σ.
If X_i ∈ V, attach the parse tree shown in Figure 2.4 to the parse tree shown in Figure 2.5. The resulting tree would look like the following figure.
P t ( A , w , k + 1 )
Figure 2.6. Caption.
Clearly, this tree (Pt(A, w, k+1)) with root A is a parse tree, since the one-level tree and all the subtrees with root X_i and yield w_i are parse trees.
In addition, since w_1 w_2 ⋯ w_m = w, the yield of this parse tree is w.
Therefore, the statement “(iii) ⟹ (i)” is true for n = k + 1.
This completes the proof of “(iii) ⟹ (i)” and also the proof of Theorem 2.33.

2.2. Chomsky Normal Form (CNF)

Definition 2.34. 
Let G = ( V ,   Σ ,   R ,   S ) be a C F G .
G is in Chomsky normal form if every rule of G is of one of the following forms:
A → BC where A ∈ V and B, C ∈ V \ {S}
A → a where a ∈ Σ
S → ϵ where S = the start variable
Lemma 2.35. 
For every CFG G = (V, Σ, R, S), there is a CFG G′ with no ϵ-rule (A → ϵ where A ≠ S) or unit rule (A → B where A, B ∈ V) such that L(G) = L(G′).
Proof. 
We can inductively construct a new set of rules R′ using the following procedure:
(i)
Copy all the rules in R to R ' .
(ii)
If B ≠ S, A → αBβ and B → ϵ are in R′, create A → αβ in R′.
(iii)
If A → B and B → γ are in R′, create A → γ in R′.
We can further assume that R′ is the smallest of all the sets that can be thus created, because we can always rename the smallest one to R′, knowing that the minimum exists.
Let G ' = ( V , Σ , R ' , S ) .
It is clear from the construction that R ⊆ R′.
Therefore, every derivation in G is a derivation in G′ and hence L(G) ⊆ L(G′).
On the other hand, every new rule created in G′ is equivalent to the two rules it is created from, by Proposition 2.15; therefore, every derivation in G′ can be simulated by either the same rules or equivalent rules in G.
Hence, L(G′) ⊆ L(G).
It remains to show that all the ϵ-rules and unit rules in G′ are redundant for the production of any x ∈ L(G′).
Since L(G′) = {x ∈ Σ* | S ⇒*_{G′} x}, knowing that minimum derivations exist, we can assume every derivation of x ∈ L(G′) is one of minimum length.
Claim 1. 
Any derivation S ⇒*_{G′} x does not use an ϵ-rule.
Proof of Claim 1. 
Assume for contradiction that B → ϵ, where B ≠ S, is used at some point of the derivation.
S ⇒*_{G′} x can be rewritten as
S ⇒*_{G′} γBδ ⇒^1_{G′} γδ ⇒*_{G′} x where γ, δ ∈ (V ∪ Σ)*.
This B must have been generated at an earlier point of the derivation, in the form of
ηAθ ⇒^1_{G′} ηαBβθ where η, α, β, θ ∈ (V ∪ Σ)*.
Therefore, S ⇒*_{G′} x can be further rewritten as
S ⇒^m_{G′} ηAθ ⇒^1_{G′} ηαBβθ ⇒^n_{G′} γBδ ⇒^1_{G′} γδ ⇒^k_{G′} x where k, m, n ≥ 0.
(Note that ηαBβθ ⇒^n_{G′} γBδ is a derivation in which the rule in each step does not originate from this particular B.)
Since A → αBβ and B → ϵ are in R′, by construction (ii), A → αβ is in R′.
Therefore, ηAθ ⇒^1_{G′} ηαβθ is a valid production in G′.
Furthermore, since ηαBβθ ⇒^n_{G′} γBδ, by Proposition 2.19, we can substitute ϵ for B to obtain the following valid production in G′:
ηαβθ ⇒^n_{G′} γδ.
If we apply these two new productions at the corresponding points of the original derivation of x, we have the following valid derivation:
S ⇒^m_{G′} ηAθ ⇒^1_{G′} ηαβθ ⇒^n_{G′} γδ ⇒^k_{G′} x.
We note that this new derivation of x has a length of k + m + n + 1, which is shorter than the original one of k + m + n + 2.
This contradicts the assumption that the original derivation is of minimum length.
Claim 2. Any derivation S ⇒*_{G′} x does not use a unit rule.
Proof of Claim 2. Assume for contradiction that a unit rule A → B is used at some point of the derivation S ⇒*_{G′} x.
We can rewrite this derivation as
S ⇒*_{G′} αAβ ⇒^1_{G′} αBβ ⇒*_{G′} x.
This B must eventually be eliminated before reaching the final product x ∈ Σ*, and the production needed for eliminating B is:
ηBθ ⇒^1_{G′} ηγθ where B → γ is a rule in G′.
We can now rewrite S ⇒*_{G′} x as
S ⇒^m_{G′} αAβ ⇒^1_{G′} αBβ ⇒^n_{G′} ηBθ ⇒^1_{G′} ηγθ ⇒^k_{G′} x.
Since A → B and B → γ are rules in R′, A → γ is a rule in R′ by construction (iii).
αAβ ⇒^1_{G′} αγβ is a valid production in G′.
Furthermore, since αBβ ⇒^n_{G′} ηBθ, by Proposition 2.19, we can substitute γ for B to obtain the following valid production:
αγβ ⇒^n_{G′} ηγθ.
By applying these two new productions at the corresponding points of the derivation of x, we have the following derivation:
S ⇒^m_{G′} αAβ ⇒^1_{G′} αγβ ⇒^n_{G′} ηγθ ⇒^k_{G′} x.
This new derivation has a length of k + m + n + 1, which is shorter than the original one of k + m + n + 2.
This contradicts the assumption that the original given derivation of x is of minimum length.
Combining Claim 1 and Claim 2, we can conclude Lemma 2.35.
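The ϵ-rule part of the construction in Lemma 2.35 can be sketched as a fixed-point computation: keep adding rule bodies with occurrences of nullable variables deleted (construction (ii)) until nothing new appears, then drop the ϵ-rules themselves. The encoding (bodies as tuples, ϵ as the empty tuple) and the function name are illustrative, not the paper's notation:

```python
from itertools import combinations

def eliminate_epsilon_rules(rules, start):
    """For every body containing occurrences of a nullable variable B
    (one with B -> epsilon, B != start), add every variant of the body
    with some of those occurrences deleted; finally drop the epsilon
    rules themselves (cf. construction (ii) of Lemma 2.35)."""
    rules = {head: set(bodies) for head, bodies in rules.items()}
    changed = True
    while changed:
        changed = False
        nullable = {h for h, bs in rules.items() if () in bs and h != start}
        for head in rules:
            for body in list(rules[head]):
                spots = [i for i, s in enumerate(body) if s in nullable]
                for r in range(1, len(spots) + 1):
                    for drop in combinations(spots, r):
                        new = tuple(s for i, s in enumerate(body)
                                    if i not in drop)
                        if new not in rules[head]:
                            rules[head].add(new)
                            changed = True
    for head in rules:          # no epsilon rule except for the start variable
        if head != start:
            rules[head].discard(())
    return {head: sorted(bodies) for head, bodies in rules.items()}
```

Applied to the grammar of Example 2.38 (after adding S_0 → S), this reproduces the rule set obtained there in Steps 2 and 3, including the redundant S → S that the example removes afterwards.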
We now examine a method for converting a CFG into one in Chomsky normal form.
Definition 2.36 (The Method (M)).
From every CFG, G = (V, Σ, R, S), that doesn't have ϵ-rules (except possibly S → ϵ) or unit rules, we can construct a CFG, G′ = (V′, Σ, R′, S), using a method called Method (M) as described in the following steps:
Step 1
For every a ∈ Σ, create a variable U_a and a rule U_a → a. Note that U_a is a newly and uniquely created variable such that U_a ∉ V and U_a ≠ U_b for any a, b ∈ Σ such that a ≠ b.
Step 2
∀ r ∈ R, r can be expressed as A → u_1 u_2 ⋯ u_k where A ∈ V, u_1, u_2, …, u_k ∈ V ∪ Σ & k ≥ 0. Create a set of rules (called P(r)) and a set of variables (called V(r)) according to the following steps:
(i)
For k = 0
r becomes A → ϵ.
Since R doesn't have any ϵ-rule except S → ϵ, A must be equal to S and r becomes S → ϵ.
Copy S → ϵ into P(r).
In this case, P(r) = {S → ϵ} = {r} and V(r) = ∅.
(ii)
For k = 1
r becomes A → u_1.
Since R doesn't have any unit rule, u_1 ∈ Σ.
Copy r into P(r).
In this case, P(r) = {A → u_1} = {r} and V(r) = ∅.
(iii)
For k = 2
r becomes A → u_1 u_2.
If u_1, u_2 ∈ V, copy r into P(r). In this case, P(r) = {A → u_1 u_2} = {r} and V(r) = ∅.
If u_1 ∈ Σ & u_2 ∈ V, create A → U_{u_1} u_2 and add this rule and U_{u_1} → u_1 to P(r). Add U_{u_1} to V(r).
(Note that U_{u_1} → u_1 was created in Step 1 above.)
In this case, P(r) = {A → U_{u_1} u_2, U_{u_1} → u_1} and V(r) = {U_{u_1}}.
If u_1 ∈ V & u_2 ∈ Σ, create A → u_1 U_{u_2} and add this rule and U_{u_2} → u_2 to P(r). Add U_{u_2} to V(r).
(Note that U_{u_2} → u_2 was created in Step 1 above.)
In this case, P(r) = {A → u_1 U_{u_2}, U_{u_2} → u_2} and V(r) = {U_{u_2}}.
If both u_1, u_2 ∈ Σ, create A → U_{u_1} U_{u_2} and add it along with U_{u_1} → u_1, U_{u_2} → u_2 to P(r). Add U_{u_1}, U_{u_2} to V(r).
(Note that U_{u_1} → u_1 and U_{u_2} → u_2 were created in Step 1 above.)
In this case, P(r) = {A → U_{u_1} U_{u_2}, U_{u_1} → u_1, U_{u_2} → u_2} and V(r) = {U_{u_1}, U_{u_2}}.
(iv)
For k ≥ 3
Figure 2.7. Caption.
As depicted by the above figure, create the following rules and add them to P(r).
A → U_1 A_1
A_1 → U_2 A_2
A_2 → U_3 A_3
⋮
A_i → U_{i+1} A_{i+1}
⋮
A_{k−2} → U_{k−1} U_k, where A_1, A_2, …, A_{k−2} are variables newly and uniquely created for each r and, therefore, are not in V.
For any i ∈ {1, 2, 3, …, k}, if u_i ∈ V, set U_i = u_i; if u_i ∈ Σ, set U_i = U_{u_i} and add U_{u_i} → u_i to P(r). Add U_{u_i} to V(r).
(Note that U_{u_i} → u_i for each u_i ∈ Σ was created in Step 1 above.)
In this case, P(r) includes all the rules:
A → U_1 A_1
A_1 → U_2 A_2
A_2 → U_3 A_3
⋮
A_i → U_{i+1} A_{i+1}
⋮
A_{k−2} → U_{k−1} U_k, together with the rules U_{u_i} → u_i for any u_i ∈ Σ, whereas
V(r) = {U_{u_i} | u_i ∈ Σ} ∪ {A_i | i = 1, 2, …, k−2}.
Step 3
Set
V′ = V ∪ ⋃_{r∈R} V(r)
and
R′ = ⋃_{r∈R} P(r).
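Step 2(iv) of Method (M), which breaks a body of length k ≥ 3 into k − 1 binary rules through the fresh variables A_1, …, A_{k−2}, can be sketched as follows (the `fresh` name supplier is a hypothetical helper, and the tuple encoding of rules is my own):

```python
def split_long_rule(head, body, fresh):
    """Step 2(iv) of Method (M): turn A -> U1 U2 ... Uk (k >= 3) into
    A -> U1 A1, A1 -> U2 A2, ..., A_{k-2} -> U_{k-1} U_k,
    where fresh() supplies a new variable name on each call."""
    rules = []
    left = head
    for i in range(len(body) - 2):
        new_var = fresh()
        rules.append((left, (body[i], new_var)))
        left = new_var
    # the last created variable takes the final two symbols as its body
    rules.append((left, (body[-2], body[-1])))
    return rules
```

A body of length k produces exactly k − 1 rules, each with a two-symbol body, which is what Chomsky normal form requires once the terminal symbols have been replaced by their U_{u_i} variables.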
We note the following properties of the rules created by Method (M):
N1. All the rules in R′ are in Chomsky normal form.
N2. For any r′ ∈ R′, there exists r ∈ R such that r′ ∈ P(r). Furthermore, P(r_1) ≠ P(r_2) for any r_1, r_2 ∈ R such that r_1 ≠ r_2.
N3. For any r ∈ R, r is equivalent to the rules in P(r) by Proposition 2.15.
N4. V and ⋃_{r∈R} V(r) are disjoint. That is, V ∩ ⋃_{r∈R} V(r) = ∅.
N5. For any r′ ∈ P(r), either Head(r′) = Head(r) or Head(r′) ∉ V. Or equivalently, Head(r′) ∈ V ⟹ Head(r′) = Head(r).
N6. ∀ r′ ∈ P(r), if |Body(r′)| = 2, then r′ is unique to P(r). That is,
r′ ∉ P(r_1) for any r_1 ∈ R such that r ≠ r_1.
N7. If k = 0, i.e., |Body(r)| = 0, P(r) = {S → ϵ} = {r} and V(r) = ∅.
N8. If k = 1, i.e., |Body(r)| = 1, P(r) = {A → u_1} = {r} and V(r) = ∅, where u_1 ∈ Σ.
We now have the following theorem.
Theorem 2.37. Every context-free language is generated by a CFG in Chomsky normal form (CNF).
Proof. Since every context-free language is generated by a CFG, we need to show that every CFG can be converted to an equivalent CFG in Chomsky normal form.
Also, because of Lemma 2.35, we can start with a CFG that has no ϵ-rule (A → ϵ where A ≠ S) or unit rule (A → B where A, B ∈ V).
Let G = (V, Σ, R, S) be the CFG that has no ϵ-rule or unit rule except S → ϵ.
Let G′ = (V′, Σ, R′, S) be a CFG constructed from G by use of Method (M).
In the following, we shall show L(G) = L(G′) by showing x ∈ L(G) ⟺ x ∈ L(G′) ∀ x ∈ Σ*.
“⟹” (If x ∈ L(G))
S ⇒*_G x.
∃ r_1, r_2, …, r_i, …, r_n, r_{n+1} ∈ R and u_1, u_2, …, u_i, …, u_n ∈ (V ∪ Σ)* such that
S ⇒_{r_1, G} u_1 ⇒_{r_2, G} u_2 ⋯ u_{i−1} ⇒_{r_i, G} u_i ⋯ ⇒_{r_n, G} u_n ⇒_{r_{n+1}, G} x.
By N3, for any i ∈ {1, 2, …, n+1}, r_i is equivalent to a sequence of rules from P(r_i), which is a subset of R′.
Therefore, S ⇒*_{P(r_1), G′} u_1 ⇒*_{P(r_2), G′} u_2 ⋯ u_{i−1} ⇒*_{P(r_i), G′} u_i ⋯ ⇒*_{P(r_n), G′} u_n ⇒*_{P(r_{n+1}), G′} x.
Note that u_1, u_2, …, u_i, …, u_n ∈ (V′ ∪ Σ)* because V ⊆ V′.
Therefore, S ⇒*_{G′} x.
Therefore, x ∈ L(G′).
“⟸” (If x ∈ L(G′))
S ⇒*_{G′} x.
By Theorem 2.33, ∃ a parse tree (in G′) with root S ∈ V′ and yield x ∈ Σ*.
Let's call this parse tree (T′).
By definition of parse tree, S and its children must be the head and body of a rule in R′.
Let's call this rule r′; hence Head(r′) = S.
By N1, r′ must be in one of the following forms:
  • S → ϵ
  • A → a where a ∈ Σ, A ∈ V′
  • A → U_1 U_2 where A ∈ V′, U_1, U_2 ∈ V′ \ {S}
If r′ is S → ϵ, ϵ is the only child of S.
Since ϵ has no children and x is a descendant of S, this is possible only if ϵ = x.
Furthermore, by the construction of (M), S → ϵ in R′ is created from S → ϵ in R.
Therefore, S → ϵ is also a rule in R.
Therefore, S ⇒^1_G ϵ. (Proposition 2.8(i))
Therefore, S ⇒^1_G x.
Therefore, S ⇒*_G x.
Therefore, x ∈ L(G).
If r′ is A → a, since S = Head(r′), S = A.
Therefore, r′ is S → a and S has only one child, which is a.
Since x is a descendant of S and a has no children, a = x.
By the construction of (M), A → a in R′ is created from A → a in R.
Therefore, A → a is also a rule in R.
Therefore, S → x is a rule in R.
Therefore, S ⇒^1_G x. (Proposition 2.8(i))
Therefore, S ⇒*_G x.
Therefore, x ∈ L(G).
If r′ is A → U_1 U_2 where A ∈ V′, U_1, U_2 ∈ V′ \ {S}:
Since Head(r′) = S and S ∈ V, Head(r′) ∈ V.
Since Head(r′) = A, A = S.
Therefore, r′ becomes S → U_1 U_2 where U_1, U_2 ∈ V′ \ {S}.
By N2, ∃ r ∈ R such that r′ ∈ P(r).
Let r be A′ → u_1 u_2 ⋯ u_k where A′ ∈ V, u_1, u_2, …, u_k ∈ V ∪ Σ.
By N5, Head(r′) ∈ V ⟹ Head(r′) = Head(r).
Therefore, S = A′.
Therefore, r becomes S → u_1 u_2 ⋯ u_k.
We now analyze the different situations for different values of k .
If k = 0, r becomes S → ϵ.
By the construction of (M), P(r) = {S → ϵ}.
Since r′ ∈ P(r), r′ is S → ϵ.
This contradicts the underlying assumption that r′ is A → U_1 U_2 where A ∈ V′, U_1, U_2 ∈ V′ \ {S}.
Therefore, k cannot be 0.
If k = 1, r becomes S → u_1.
Since R doesn't have any unit rule, u_1 ∈ Σ.
By the construction of (M), P(r) = {S → u_1}.
Therefore, r′ is S → u_1 where u_1 ∈ Σ.
This contradicts the underlying assumption that r′ is A → U_1 U_2 where A ∈ V′, U_1, U_2 ∈ V′ \ {S}.
Therefore, k cannot be 1.
Therefore, we can exclude the cases of k ∈ {0, 1} under the assumption that r′ is A → U_1 U_2 where A ∈ V′, U_1, U_2 ∈ V′ \ {S}.
If k = 2, r becomes S → u_1 u_2.
By the construction of (M), P(r) is one of the following:
(i)
P(r) = {S → u_1 u_2} if u_1, u_2 ∈ V
(ii)
P(r) = {S → U_{u_1} u_2, U_{u_1} → u_1} if u_1 ∈ Σ & u_2 ∈ V
(iii)
P(r) = {S → u_1 U_{u_2}, U_{u_2} → u_2} if u_1 ∈ V & u_2 ∈ Σ
(iv)
P(r) = {S → U_{u_1} U_{u_2}, U_{u_1} → u_1, U_{u_2} → u_2} if u_1, u_2 ∈ Σ
For (i), r′ is S → u_1 u_2.
In this case, r and r′ are the same, and the sub parse tree in (T′) with root S and children u_1, u_2, as shown on the right of the following figure, can be replaced by a parse tree in G with the same root and children, as shown on the left.
Figure 2.8. Caption.
For (ii), r′ is either S → U_{u_1} u_2 or U_{u_1} → u_1.
However, since Head(r′) = S, which is in V, and U_{u_1} ∉ V, r′ cannot be U_{u_1} → u_1.
⟹ r′ must be S → U_{u_1} u_2.
By N3, S → u_1 u_2 is equivalent to S → U_{u_1} u_2 and U_{u_1} → u_1.
We have the following equivalent parse trees with the same root and yield.
Figure 2.9. Caption.
The one on the left is a parse tree in G whose root and its children are the head and body of a rule in R whereas the one on the right is a sub parse tree of ( T ' ) .
Therefore we can replace a sub parse tree of ( T ' ) with an equivalent parse tree in G whose root and yield are the head and body of a rule in R .
For (iii), by a similar argument, we have the following equivalent parse trees with the same root and yield.
Figure 2.10. Caption.
The one on the left is a parse tree in G whose root and its children are the head and body of a rule in R whereas the one on the right is a sub parse tree of ( T ' ) .
Therefore we can replace a sub parse tree of ( T ' ) with an equivalent parse tree in G whose root and yield are the head and body of a rule in R .
For (iv), by a similar argument, we have the following equivalent parse trees with the same root and yield.
Figure 2.11. Caption.
The one on the left is a parse tree in G whose root and its children are the head and body of a rule in R whereas the one on the right is a sub parse tree of ( T ' ) .
Therefore we can replace a sub parse tree of ( T ' ) with an equivalent parse tree in G whose root and yield are the head and body of a rule in R .
If k ≥ 3, r is S → u_1 u_2 ⋯ u_k.
P(r) consists of the following rules:
S → U_1 A_1
A_1 → U_2 A_2
A_2 → U_3 A_3
⋮
A_i → U_{i+1} A_{i+1}
⋮
A_{k−2} → U_{k−1} U_k
U_{u_i} → u_i if u_i ∈ Σ, ∀ i ∈ {1, 2, 3, …, k}, where U_i = u_i if u_i ∈ V and U_i = U_{u_i} if u_i ∈ Σ.
Since Head(r′) = S, r′ is S → U_1 A_1.
Since (T′) is a parse tree of G′, by the definition of parse tree,
U_1, A_1 are children of S.
U_2, A_2 are children of A_1.
U_3, A_3 are children of A_2.
⋮
U_{k−1}, U_k are children of A_{k−2}.
By N3, r is equivalent to the sequence of rules contained in P ( r ) .
Therefore, we have the following equivalent parse trees with the same root and yield.
Figure 2.12. Caption.
The one on the left is a parse tree in G whose root and its children are the head and body of a rule in R whereas the one on the right is a sub parse tree of ( T ' ) .
Therefore we can replace a sub parse tree of ( T ' ) with an equivalent parse tree in G whose root and yield are the head and body of a rule in R .
Combining all cases, we conclude that there is a sub parse tree in (T′) with root S that can be replaced by an equivalent parse tree in G whose root and yield are the head and body of a rule in R.
We can write this rule in R as S → u_1 u_2 ⋯ u_k where k ≥ 0 and u_i ∈ V ∪ Σ for i ∈ {1, 2, …, k}.
(a) If all u_i's are terminals
In this case, u_1 u_2 ⋯ u_k = x, the yield of the parent tree (T′).
The reason is that a leaf of a subtree is also a leaf of the parent tree.
Therefore, u_i ⊑ x ∀ i ∈ {1, 2, …, k}.
On the other hand, if l is a leaf in x, there is a simple directed path from S to l. This simple directed path must pass through one of the nodes u_1, u_2, …, u_k, because u_1 u_2 ⋯ u_k is the yield of a sub parse tree in (T′), which is obtained by branching out from S in all possible directions.
Therefore, l must be one of the nodes u_1, u_2, …, u_k.
After replacement, we now have a new tree which is a parse tree in G , and furthermore, the root and yield of this tree are respectively S and x .
By Theorem 2.33, S ⇒*_G x.
Therefore, x L ( G ) .
(b) If some u_i's are variables
For each u_i that is a variable, we can repeat the above replacement process to replace the sub parse tree (with root u_i) in (T′) with a parse tree in G whose root (u_i) and the root's children are the head and body of a rule in R.
Since every time we do a replacement, we get down to a lower level of ( T ' ) and since the height of ( T ' ) and the number of subtrees of ( T ' ) are both finite, this process of replacement must come to a stop after a finite number of operations. When this happens, we have a new tree in which every interior node and its children are the head and body of a rule in R . This means that the new tree thus created is a parse tree in G .
Furthermore, this replacement process only affects the nodes which are variables. Therefore, the yield of ( T ' ) , namely x , is untouched and remains at the bottom after the replacement is complete.
This means that x is also the yield of the newly created tree.
We now have a new tree with root S and yield x and the tree is also a parse tree in G .
By Theorem 2.33, S ⇒*_G x.
Therefore, x L ( G ) .
Combining (a) and (b), we complete the proof of Theorem 2.37.
On the basis of Theorem 2.37 and the results proved in Lemma 2.35, we can now develop a set of operational rules for the conversion of a CFG to one in CNF.
Let G = (V, Σ, R, S) be the CFG to be converted.
Let G′ = (V′, Σ, R′, S_0) be the CFG to be created in CNF.
CR1.
Create S_0 → S and add it to R′.
(Note that this creation will ensure that the start variable does not occur on the right-hand side of a rule.)
CR2.
(Elimination of ϵ-rules)
If a rule B → ϵ is in R, do the following:
(i)
For every rule in R of the form A → u_1 B u_2 B u_3 B u_4 ⋯ u_{n−1} B u_n B u_{n+1}:
(1)
For each single occurrence of B on the RHS, create a rule with that occurrence deleted and add it to R′.
For example, A → u_1 u_2 B u_3 B u_4 ⋯ u_{n−1} B u_n B u_{n+1},
A → u_1 B u_2 u_3 B u_4 ⋯ u_{n−1} B u_n B u_{n+1},
A → u_1 B u_2 B u_3 B u_4 ⋯ u_{n−1} B u_n u_{n+1}.
(2)
For each group occurrence of 2 B's on the RHS, create a rule with that group occurrence deleted and add it to R′.
For example, A → u_1 u_2 u_3 B u_4 ⋯ u_{n−1} B u_n B u_{n+1},
A → u_1 u_2 B u_3 u_4 ⋯ u_{n−1} B u_n B u_{n+1},
A → u_1 B u_2 B u_3 B u_4 ⋯ u_{n−1} u_n u_{n+1}.
⋮
(n) For each group occurrence of n B's on the RHS, create a rule with that group occurrence deleted and add it to R′.
For example, A → u_1 u_2 u_3 u_4 ⋯ u_{n−1} u_n u_{n+1}.
(ii)
Repeat (i) until all rules of the form B → ϵ are eliminated.
CR3. (Elimination of unit rules)
If rules A → B and B → u are in R, do the following:
(i)
Create A → u and add it to R′.
(ii)
Copy B → u to R′.
(iii)
Do not copy A → B to R′.
(iv)
Repeat (i) and (ii) until all unit rules of the form A → B are eliminated.
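CR3 can be sketched as a small fixed-point loop: while some rule A → B with B a variable remains, delete it and copy B's bodies over to A. The tuple encoding of rule bodies and the function name are illustrative only:

```python
def eliminate_unit_rules(rules, variables):
    """CR3 as a fixed point: while some unit rule A -> B remains, delete
    it and add every body of B to A (skipping the trivial A -> A)."""
    rules = {head: set(bodies) for head, bodies in rules.items()}
    changed = True
    while changed:
        changed = False
        for head in list(rules):
            for body in list(rules[head]):
                if len(body) == 1 and body[0] in variables:  # unit rule
                    rules[head].discard(body)
                    for b in list(rules.get(body[0], ())):
                        if b != (head,) and b not in rules[head]:
                            rules[head].add(b)
                    changed = True
    return {head: sorted(bodies) for head, bodies in rules.items()}
```

Chains such as A → B, B → C, C → c are resolved over several passes of the loop, ending with A → c, which mirrors the repeated application of CR3 in Steps 5 through 7 of Example 2.38.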
CR4. (Conversion of remaining rules)
Every remaining rule in R has the form A → u_1 u_2 ⋯ u_k, where each u_i ∈ V ∪ Σ for i ∈ {1, 2, …, k}.
Create in R′ the following sequence of rules and add the corresponding created variables to V′:
A → U_1 A_1
A_1 → U_2 A_2
A_2 → U_3 A_3
⋮
A_{k−2} → U_{k−1} U_k
where U_i = u_i if u_i ∈ V; if u_i ∈ Σ, add U_i → u_i to R′.
Example 2.38. Let G = ( V , Σ , R , S ) be the C F G consisting of the following rules:
S → ASA | aB
A → B | S
B → b | ϵ
Convert G to G′ = (V′, Σ, R′, S_0) in CNF.
Step 1. (Applying CR1.)
S_0 → S
S → ASA | aB
A → B | S
B → b | ϵ
Step 2. (Removing B → ϵ using CR2)
S_0 → S
S → ASA | aB | a
A → B | S | ϵ
B → b
Step 3. (Removing A → ϵ using CR2)
S_0 → S
S → ASA | aB | a | SA | AS | S
A → B | S
B → b
Step 4. (Removing S → S because of redundancy)
S_0 → S
S → ASA | aB | a | SA | AS
A → B | S
B → b
Step 5. (Removing S_0 → S using CR3)
S_0 → ASA | aB | a | SA | AS
S → ASA | aB | a | SA | AS
A → B | S
B → b
Step 6. (Removing A → B using CR3)
S_0 → ASA | aB | a | SA | AS
S → ASA | aB | a | SA | AS
A → b | S
B → b
Step 7. (Removing A → S using CR3)
S_0 → ASA | aB | a | SA | AS
S → ASA | aB | a | SA | AS
A → b | ASA | aB | a | SA | AS
B → b
Step 8. (Conversion of remaining rules into CNF)
Since S_0 → ASA becomes S_0 → AA_1, A_1 → SA, and S_0 → aB becomes S_0 → UB, U → a;
S → ASA becomes S → AA_1, A_1 → SA, and S → aB becomes S → UB, U → a; and
A → ASA becomes A → AA_1, A_1 → SA, and A → aB becomes A → UB, U → a,
the rules in R′ now become
S_0 → AA_1 | UB | a | SA | AS
S → AA_1 | UB | a | SA | AS
A → b | AA_1 | UB | a | SA | AS
B → b
A_1 → SA
U → a
Example 2.39. 
Convert S → aSb | ϵ to CNF, where S ∈ V and a, b ∈ Σ, and show that there is more than one way of deriving the string a^2 b^2 using the rules in CNF.
Conversion of rules.
S → aSb | ϵ
S → aSb | ab
S → ASB | AB; A → a; B → b
S → AC | AB; C → SB; A → a; B → b
Derivation of a^2 b^2
There is more than one way of deriving the string a^2 b^2. Below are a few examples.
(i)
S S A C A C C S B A S B A a a S B B b a S b S A B a A B b A a a a B b B b a a b b .
(ii)
S S A C A C C S B A S B S A B A A B B A a a A B B A a a a B B B b a a b B B b a a b b .
(iii)
S S A C A C C S B A S B S A B A A B B B b A A B b B b A A b b A a A a b b A a a a b b .
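Membership of a²b² in the converted grammar can also be checked mechanically. Below is a minimal CYK membership test (an algorithm not introduced in this section, used here only to verify the example) for the CNF rules S → AC | AB, C → SB, A → a, B → b:

```python
from itertools import product

# CNF grammar of Example 2.39: S -> AC | AB, C -> SB, A -> a, B -> b
RULES = {("A", "C"): {"S"}, ("A", "B"): {"S"}, ("S", "B"): {"C"}}
TERM = {"a": {"A"}, "b": {"B"}}

def cyk(word, start="S"):
    """CYK membership test for a grammar in Chomsky Normal Form."""
    n = len(word)
    if n == 0:
        return False
    # table[i][j] = set of variables deriving word[i : i + j + 1]
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, ch in enumerate(word):
        table[i][0] = set(TERM.get(ch, set()))
    for length in range(2, n + 1):          # substring length
        for i in range(n - length + 1):     # start position
            for split in range(1, length):  # split point
                for x, y in product(table[i][split - 1],
                                    table[i + split][length - split - 1]):
                    table[i][length - 1] |= RULES.get((x, y), set())
    return start in table[0][n - 1]

print(cyk("aabb"))   # True: a^2 b^2 is derivable
print(cyk("aab"))    # False
```

Note that ϵ is correctly rejected: the conversion above loses the rule S → ϵ, so the CNF grammar generates {aⁿbⁿ | n ≥ 1}.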

2.3. Pushdown Automata ( P D A )

Pushdown automata are another kind of nondeterministic computation model. They are similar to nondeterministic finite automata except that they have an extra component called a stack, which provides additional memory beyond what is available in a finite automaton.
Pushdown automata are equivalent in power to context-free grammars, which will be proved later. In addition to reading symbols from the input alphabet Σ, a PDA also reads and writes symbols on the stack. Writing and reading on the stack must be done at the top. Either the input symbol or the stack symbol can be ϵ, thereby allowing the machine to move without actually reading or writing. Upon reading a symbol from the input, the PDA makes one of the following moves on the stack before entering the next state:
(i) Replace
Replace the symbol at the top of the stack with another symbol. This move is referred to as the “Replace” move.
(ii) Push
Add a symbol to the top of the stack. This move is referred to as the “Push” move.
(iii) Pop
Erase or remove a symbol from the top of the stack. This move is referred to as the “Pop” move.
(iv) Untouched
Do nothing to change the stack. This move is referred to as the “Untouched” move.
A P D A is formally defined as follows.
Definition 2.40. 
A PDA is a 7-tuple M = (Q, Σ, Γ, δ, q₀, ⊥, F), where Q, Σ, Γ and F are finite sets such that
(a) Q is the set of states;
(b) Σ is the input alphabet;
(c) Γ is the stack alphabet;
(d) δ : Q × Σ_ϵ × Γ_ϵ → P(Q × Γ_ϵ) is the transition function;
(e) q₀ ∈ Q is the start state;
(f) ⊥ ∈ Γ is the initial stack symbol, signaling an empty stack; and
(g) F ⊆ Q is the set of accept states.
M computes as follows.
Let w = w₁w₂⋯w_m where w_i ∈ Σ_ϵ for 1 ≤ i ≤ m.
M accepts w iff there exist r₀, r₁, …, r_m ∈ Q and s₀, s₁, …, s_m ∈ Γ* such that the following conditions are satisfied:
(i) r₀ = q₀ and s₀ = ⊥;
(ii) (r_{i+1}, b_i) ∈ δ(r_i, w_{i+1}, a_i) for 0 ≤ i ≤ m−1, where a_i, b_i ∈ Γ_ϵ and s_i = a_i t_i, s_{i+1} = b_i t_i for some t_i ∈ Γ*;
(iii) r_m ∈ F.
When m = 0, w = ϵ and only conditions (i) and (iii) apply, which become r₀ = q₀, s₀ = ⊥ and r₀ ∈ F.
Therefore, we define a PDA to accept ϵ whenever the start state is also an accept state and the stack is signaled to be empty.
If we write r_i →[w_{i+1}, a_i→b_i]_δ r_{i+1} for (r_{i+1}, b_i) ∈ δ(r_i, w_{i+1}, a_i), conditions (i), (ii) and (iii) can be written as follows:
q₀ = r₀ →[w₁, a₀→b₀]_δ r₁ →[w₂, a₁→b₁]_δ r₂ ⋯ r_i →[w_{i+1}, a_i→b_i]_δ r_{i+1} ⋯ r_{m−1} →[w_m, a_{m−1}→b_{m−1}]_δ r_m, with r_m ∈ F.
When there is only one transition function under consideration, the δ in the computation is usually omitted and the following shorthand is used instead:
q₀ = r₀ →[w₁, a₀→b₀] r₁ →[w₂, a₁→b₁] r₂ ⋯ r_i →[w_{i+1}, a_i→b_i] r_{i+1} ⋯ r_{m−1} →[w_m, a_{m−1}→b_{m−1}] r_m, with r_m ∈ F.
For simplicity, we sometimes use the notation q₀ →[w, *]_δ r_m to represent a computation of w from q₀ to r_m without showing the intermediate states.
We can now use the transition function to describe the four basic moves of the PDA mentioned above:
(i) Replace
r →[a, b→c] r′ signifies replacing b by c at the top of the stack upon reading symbol a from the input.
(ii) Push
r →[a, ϵ→c] r′ signifies adding the symbol c to the top of the stack upon reading symbol a from the input.
(iii) Pop
r →[a, b→ϵ] r′ signifies removing the symbol b from the top of the stack upon reading symbol a from the input.
(iv) Untouched
r →[a, ϵ→ϵ] r′ signifies that nothing is done to change the stack upon reading symbol a from the input.
We further note that when a = ϵ, r →[ϵ, ϵ→ϵ] r′ signifies a change of state from r to r′ with no input read and no change made to the stack.
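The four moves can be illustrated on a Python list used as a stack (top of stack at index 0). The helper `apply_move` is a hypothetical illustration, not part of the formal definition; ϵ is modeled as the empty string:

```python
EPS = ""  # ϵ is modeled as the empty string

def apply_move(stack, pop_sym, push_sym):
    """Return the stack after one PDA move (a, pop_sym -> push_sym),
    or None if the move is illegal (top of stack does not match)."""
    if pop_sym != EPS:
        if not stack or stack[0] != pop_sym:
            return None
        stack = stack[1:]               # Pop
    if push_sym != EPS:
        stack = [push_sym] + stack      # Push
    return stack

s = ["$"]
print(apply_move(s, EPS, "0"))   # Push:      ['0', '$']
print(apply_move(s, "$", "X"))   # Replace:   ['X']
print(apply_move(s, "$", EPS))   # Pop:       []
print(apply_move(s, EPS, EPS))   # Untouched: ['$']
```

A Replace is thus a Pop followed by a Push in one step, and Untouched is the move with both symbols equal to ϵ.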
Example 2.41. 
Let M = (Q, Σ, Γ, δ, q₁, ⊥, F) be a PDA where
Q = {q₁, q₂, q₃, q₄}, Σ = {0, 1}, Γ = {0, ⊥, $}, F = {q₁, q₄}, with the following state diagram:
Preprints 161810 i001
M recognizes the language {0ⁿ1ⁿ | n ≥ 0}.
If the stack is signaled to be empty at the beginning, M accepts the empty string (ϵ = 0⁰1⁰), because q₁ is both a start state and an accept state. Furthermore, if the input string is not empty, the PDA reads nothing from the input at the start state; it only pushes $ onto the stack.
M accepts the string 0³1³ with the following computation:
q₁ →[ϵ, ϵ→$] q₂ →[0, ϵ→0] q₂ →[0, ϵ→0] q₂ →[0, ϵ→0] q₂ →[1, 0→ϵ] q₃ →[1, 0→ϵ] q₃ →[1, 0→ϵ] q₃ →[ϵ, $→ϵ] q₄, q₄ ∈ F.
Note that the above illustration is not a proof that M recognizes the language {0ⁿ1ⁿ | n ≥ 0}. For such a proof, one must argue that every string of the form 0ⁿ1ⁿ is accepted by M and that every string accepted by M is of the form 0ⁿ1ⁿ.
Note also that the steps q₁ →[ϵ, ϵ→$] q₂ and q₃ →[ϵ, $→ϵ] q₄ can be replaced by q₁ →[ϵ, ϵ→ϵ] q₂ and q₃ →[ϵ, ϵ→ϵ] q₄ to transition to another state without making a change to the stack.
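Since the state diagram is only available as an image, the transition relation below is a sketch reconstructed from the described behavior of M (push $, push the 0s, pop a 0 per 1, pop $); a small breadth-first search over configurations then simulates the nondeterministic computation:

```python
from collections import deque

EPS = ""  # ϵ

# Sketch of delta for Example 2.41 (reconstructed, not copied from the
# diagram): (state, input sym or ϵ, popped sym or ϵ) -> {(next, pushed)}
DELTA = {
    ("q1", EPS, EPS): {("q2", "$")},
    ("q2", "0", EPS): {("q2", "0")},
    ("q2", "1", "0"): {("q3", EPS)},
    ("q3", "1", "0"): {("q3", EPS)},
    ("q3", EPS, "$"): {("q4", EPS)},
}
ACCEPT = {"q1", "q4"}

def accepts(w, start="q1"):
    """BFS over configurations (state, unread input, stack as a string,
    top at the left); accept when the input is exhausted in an accept state."""
    seen = set()
    queue = deque([(start, w, "")])
    while queue:
        config = queue.popleft()
        if config in seen:
            continue
        seen.add(config)
        state, rest, stack = config
        if rest == "" and state in ACCEPT:
            return True
        for (q, a, b), moves in DELTA.items():
            if q != state:
                continue
            if a != EPS and not rest.startswith(a):
                continue
            if b != EPS and not stack.startswith(b):
                continue
            for nxt, c in moves:
                queue.append((nxt, rest[len(a):], c + stack[len(b):]))
    return False

print(accepts("000111"))  # True
print(accepts(""))        # True  (q1 is both a start and an accept state)
print(accepts("001"))     # False
```

The search terminates here because every cycle in DELTA consumes input; a general simulator would need a bound on ϵ-push loops.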
Example 2.42. 
Let M = (Q, Σ, Γ, δ, q₁, ⊥, F) be a PDA where
Q = {q₁, q₂, q₃, q₄, q₅, q₆, q₇}, Σ = {a, b, c}, Γ = {a, ⊥, $}, F = {q₄, q₇}, with the following state diagram:
Preprints 161810 i002
M recognizes the language {aⁱbʲcᵏ | i, j, k ≥ 0 and i = j or i = k}.
M accepts the empty string (ϵ = a⁰b⁰c⁰) with the following computation:
q₁ →[ϵ, ϵ→$] q₂ →[ϵ, ϵ→ϵ] q₃ →[ϵ, $→ϵ] q₄, q₄ ∈ F.
M accepts the string a²b²c³ with the following computation:
q₁ →[ϵ, ϵ→$] q₂ →[a, ϵ→a] q₂ →[a, ϵ→a] q₂ →[ϵ, ϵ→ϵ] q₃ →[b, a→ϵ] q₃ →[b, a→ϵ] q₃ →[ϵ, $→ϵ] q₄ →[c, ϵ→ϵ] q₄ →[c, ϵ→ϵ] q₄ →[c, ϵ→ϵ] q₄, q₄ ∈ F.
M accepts the string a²b³c² with the following computation:
q₁ →[ϵ, ϵ→$] q₂ →[a, ϵ→a] q₂ →[a, ϵ→a] q₂ →[ϵ, ϵ→ϵ] q₅ →[b, ϵ→ϵ] q₅ →[b, ϵ→ϵ] q₅ →[b, ϵ→ϵ] q₅ →[ϵ, ϵ→ϵ] q₆ →[c, a→ϵ] q₆ →[c, a→ϵ] q₆ →[ϵ, $→ϵ] q₇, q₇ ∈ F.
Note also that the steps q₁ →[ϵ, ϵ→$] q₂, q₃ →[ϵ, $→ϵ] q₄, and q₆ →[ϵ, $→ϵ] q₇ can be replaced by q₁ →[ϵ, ϵ→ϵ] q₂, q₃ →[ϵ, ϵ→ϵ] q₄, and q₆ →[ϵ, ϵ→ϵ] q₇ to transition to another state without making a change to the stack.
Example 2.43. 
Let M = (Q, Σ, Γ, δ, q₁, ⊥, F) be a PDA where
Q = {q₁, q₂, q₃, q₄}, Σ = {0, 1}, Γ = {0, 1, ⊥, $}, F = {q₁, q₄}, with the following state diagram:
Preprints 161810 i003
M recognizes the language {wwᴿ | w ∈ {0, 1}*}.
If the stack is signaled to be empty at the beginning, M accepts the empty string (ϵ = ϵϵᴿ), because q₁ is both a start state and an accept state.
M accepts the string 001100 with the following computation:
q₁ →[ϵ, ϵ→$] q₂ →[0, ϵ→0] q₂ →[0, ϵ→0] q₂ →[1, ϵ→1] q₂ →[ϵ, ϵ→ϵ] q₃ →[1, 1→ϵ] q₃ →[0, 0→ϵ] q₃ →[0, 0→ϵ] q₃ →[ϵ, $→ϵ] q₄, q₄ ∈ F.
Note also that the steps q₁ →[ϵ, ϵ→$] q₂ and q₃ →[ϵ, $→ϵ] q₄ can be replaced by q₁ →[ϵ, ϵ→ϵ] q₂ and q₃ →[ϵ, ϵ→ϵ] q₄ to transition to another state without making a change to the stack.
Instead of writing symbols one at a time to the stack, we can actually design P D A s which can write a string of symbols to the stack in one step. These P D A s are called extended P D A s. It turns out that the two kinds of P D A s are equivalent in power in that given one, we can construct the other such that the two recognize the same language. The equivalence of these two kinds of P D A s will be proved later.
Definition 2.44. 
An extended PDA is a 7-tuple M_E = (Q, Σ, Γ, δ̂, q₀, ⊥, F), where Q, Σ, Γ and F are finite sets such that
(a) Q is the set of states;
(b) Σ is the input alphabet;
(c) Γ is the stack alphabet;
(d) δ̂ : Q × Σ_ϵ × Γ_ϵ → P(Q × Γ*) is the transition function;
(e) q₀ ∈ Q is the start state;
(f) ⊥ ∈ Γ is the initial stack symbol, signaling an empty stack; and
(g) F ⊆ Q is the set of accept states.
M_E computes as follows.
Let w = w₁w₂⋯w_m where w_i ∈ Σ_ϵ for 1 ≤ i ≤ m.
M_E accepts w iff there exist r₀, r₁, …, r_m ∈ Q and s₀, s₁, …, s_m ∈ Γ* such that the following conditions are satisfied:
(i) r₀ = q₀ and s₀ = ⊥;
(ii) (r_{i+1}, b_i) ∈ δ̂(r_i, w_{i+1}, a_i) for 0 ≤ i ≤ m−1, where a_i ∈ Γ_ϵ, b_i ∈ Γ*, and s_i = a_i t_i, s_{i+1} = b_i t_i for some t_i ∈ Γ*;
(iii) r_m ∈ F.
When m = 0, w = ϵ and only conditions (i) and (iii) apply, which become r₀ = q₀, s₀ = ⊥ and r₀ ∈ F.
Therefore, we define the extended PDA to accept ϵ whenever the start state is also an accept state and the stack is signaled to be empty.
If we write r_i →[w_{i+1}, a_i→b_i]_{δ̂} r_{i+1} for (r_{i+1}, b_i) ∈ δ̂(r_i, w_{i+1}, a_i), conditions (i), (ii) and (iii) can be written as follows:
q₀ = r₀ →[w₁, a₀→b₀]_{δ̂} r₁ →[w₂, a₁→b₁]_{δ̂} r₂ ⋯ r_i →[w_{i+1}, a_i→b_i]_{δ̂} r_{i+1} ⋯ r_{m−1} →[w_m, a_{m−1}→b_{m−1}]_{δ̂} r_m, with r_m ∈ F.
When there is only one transition function under consideration, the δ̂ in the computation is usually omitted and the following shorthand is used instead:
q₀ = r₀ →[w₁, a₀→b₀] r₁ →[w₂, a₁→b₁] r₂ ⋯ r_i →[w_{i+1}, a_i→b_i] r_{i+1} ⋯ r_{m−1} →[w_m, a_{m−1}→b_{m−1}] r_m, with r_m ∈ F.
For simplicity, we sometimes use the notation q₀ →[w, *]_{δ̂} r_m to represent a computation of w from q₀ to r_m without showing the intermediate states.
Theorem 2.45. 
For any extended PDA M_E, there is a PDA M such that L(M_E) = L(M), and vice versa.
Proof. 
Construction of M from M E .
Let M E = ( Q E , Σ , Γ , δ ^ , q 0 , , F ) be an extended P D A .
Construct P D A , M = ( Q , Σ , Γ , δ , q 0 , , F ) where Q and δ are to be defined as follows.
For every ( q , a , s ) Q E × Σ ϵ × Γ ϵ , we define δ ( q , a , s ) as follows.
If δ ^ q , a , s = , δ q , a , s = .
If δ ^ q , a , s , at least one ( r , u ) δ ^ q , a , s .
Let δ 1 q , a , s = { ( r , ϵ ) | ( r , ϵ ) δ ^ q , a , s } .
For every (r, u) ∈ δ̂(q, a, s) with (r, u) ∈ Q_E × Γ* and u ≠ ϵ, there exist u₁, u₂, …, u_l ∈ Γ with l ≥ 1 such that u = u₁u₂⋯u_l.
(Note that none of u 1 , u 2 u l is ϵ .)
Create new states q 1 , q 2 , q l 1 that satisfy the following conditions:
q →[a, s→u_l]_δ q₁ (by making (q₁, u_l) ∈ δ(q, a, s))
q₁ →[ϵ, ϵ→u_{l−1}]_δ q₂ (by making δ(q₁, ϵ, ϵ) = {(q₂, u_{l−1})})
q₂ →[ϵ, ϵ→u_{l−2}]_δ q₃ (by making δ(q₂, ϵ, ϵ) = {(q₃, u_{l−2})})
⋮
q_{l−1} →[ϵ, ϵ→u₁]_δ r (by making δ(q_{l−1}, ϵ, ϵ) = {(r, u₁)})
Note that the states q 1 , q 2 , q l 1 thus created are not in Q E and that
δ q i , a , s = for any other combinations of a , s ϵ , ϵ and i { 1,2 , l 1 } .
Note also that there can be more than one set of states q 1 , q 2 , q l 1 and stack symbols u 1 , u 2 u l to be created from each combination of q , a , s because there can be more than one ( r , u ) δ ^ q , a , s based on which the states and the stack symbols are created.
Let δ 2 q , a , s = ( r , u ) δ ^ q , a , s { q 1 , u l } where u = u 1 u 2 u l , l 1 , u i Γ and
q 1 is created from (ii) above.
Let δ q , a , s = δ 1 q , a , s δ 2 q , a , s .
For each q , a , s , r , u Q E × Σ ϵ × Γ ϵ × Q E × Γ * , where ( r , u ) δ ^ q , a , s and u ϵ , define
Q q , a , s , r , u = { q i | 1 i l 1 ; l   &   q i   a r e   c r e a t e d   f r o m   i i ; ( r , u ) δ ^ q , a , s ; u ϵ } (Note that q i Q q , a , s , r , u q i Q E .)
Let P q , a , s = r , u δ ^ q , a , s ; u ϵ Q q , a , s , r , u Set Q = Q E ( q , a , s ) Q E × Σ ϵ × Γ ϵ P q , a , s .
So, M = Q E ( q , a , s ) Q E × Σ ϵ × Γ ϵ P q , a , s , Σ , Γ , δ , q 0 , , F .
The construction is now complete and it remains to show that L M E = L ( M ) .
Suppose w L ( M ) .
w 1 , w 2 w n Σ ϵ such that w = w 1 , w 2 w n where n 1 .
r 0 , r 1 r n Q , a i , b i Γ ϵ such that
q 0 = r 0 w 1 , a 0 b 0 , δ r 1 w 2 , a 1 b 1 , δ r 2 r i w i + 1 , a i b i , δ r i + 1 r n 1 w n , a n 1 b n 1 , δ r n , r n F .
Claim: 
0 i n 1 ,
if r i Q E , then
j and u Γ * such that i < j n , r i w i + 1 , a i u , δ ^ r j and w k = ϵ for i + 2 k j .
Proof of Claim. 
From r i w i + 1 , a i b i , δ r i + 1 in the given computation, it follows that ( r i + 1 , b i , ) δ r i , w i + 1 , a i .
By assumption, r i Q E .
By construction (ii), δ r i , w i + 1 , a i = δ 1 r i , w i + 1 , a i δ 2 r i , w i + 1 , a i .
Either ( r i + 1 , b i , ) δ 1 r i , w i + 1 , a i or ( r i + 1 , b i , ) δ 2 r i , w i + 1 , a i .
If ( r i + 1 , b i , ) δ 1 r i , w i + 1 , a i Since δ 1 r i , w i + 1 , a i = { ( r , ϵ ) | ( r , ϵ ) δ ^ r i , w i + 1 , a i } , b i , = ϵ and
( r i + 1 , ϵ ) δ ^ r i , w i + 1 , a i .
Therefore, r i w i + 1 , a i ϵ , δ ^ r i + 1 .
Since i < i + 1 n and ϵ Γ * , Claim is proved by taking j = i + 1 and u = ϵ .
If ( r i + 1 , b i , ) δ 2 r i , w i + 1 , a i Since δ 2 r i , w i + 1 , a i = ( r , u ) δ ^ r i , w i + 1 , a i { q 1 , u l } , r i + 1 , b i , = ( q 1 , u l ) for some ( r , u ) δ ^ r i , w i + 1 , a i where u = u 1 u 2 u l , l 1 , u i Γ and q 1 is created from construction (ii) above.
Therefore r i + 1 = q 1 and b i , = u l .
r i w i + 1 , a i b i , δ r i + 1 now becomes r i w i + 1 , a i u l , δ r i + 1 .
Furthermore, from r i + 1 w i + 2 , a i + 1 b i + 1 , δ r i + 2 in the given computation, we have
r i + 2 , b i + 1 δ r i + 1 , w i + 2 , a i + 1 = δ q 1 , w i + 2 , a i + 1 .
Since δ q 1 , a , s = for all ( a , s ) ( ϵ , ϵ ), we must have w i + 2 = a i + 1 = ϵ .
Therefore, r i + 2 , b i + 1 δ q 1 , ϵ , ϵ = { q 2 , u l 1 } .
Therefore, r i + 2 = q 2 and b i + 1 = u l 1 .
r i + 1 w i + 2 , a i + 1 b i + 1 , δ r i + 2 becomes r i + 1 ϵ , ϵ u l 1 , δ r i + 2 .
By repeating the above argument, we can obtain the following computation:
r i w i + 1 , a i u l , δ r i + 1 ϵ , ϵ u l 1 , δ r i + 2 ϵ , ϵ u l 2 , δ r i + 3 r i + l 1 ϵ , ϵ u 1 , δ r i + l .
where r i + 1 = q 1 , r i + 2 = q 2 ,   r i + l 1 = q l 1 , r i + l = r and w i + 2 = w i + 3 = w i + l = ϵ .
Let j = i + l .
r_j = r_{i+l} = r, and r ∈ Q_E ⟹ r_j ∈ Q_E.
Also, w i + 2 = w i + 3 = w j = ϵ .
r , u = ( r j , u ) .
Since ( r , u ) δ ^ r i , w i + 1 , a i , ( r j , u ) δ ^ r i , w i + 1 , a i .
Therefore, r i w i + 1 , a i u , δ ^ r j .
l 1 i + l > i j > i .
Assume for contradiction that j > n .
i < n j 1 .
i < n i + l 1 .
Therefore r n r i + 1 , r i + 2 , r i + 3 r i + l 1 = { q 1 , q 2 , q 3 , q l 1 } .
This implies r n Q E , which is a contradiction because r n F and F Q E .
Therefore, i < j n .
Claim is also true under condition (b).
Combining (a) and (b), we conclude the proof of Claim.
Since r 0 = q 0 Q E , we can apply Claim on r 0 to obtain j 0 such that
0 < j 0 n ; w 2 = w 3 = = w j 0 = ϵ ; r 0 w 1 , a 0 u 0 , δ ^ r j 0 with r j 0 also in Q E and u 0 Γ * .
Since r j 0 Q E , we can again apply Claim on r j 0 to get r j 1 such that
0 < j 0 < j 1 n ; w j 0 + 2 = w j 0 + 3 = w j 1 = ϵ ; r j 0 w j 0 + 1 , a j 0 u j 0 , δ ^ r j 1 with r j 1 also in Q E & u j 0 Γ * .
By repeating this process a number of times, we will obtain 0 < j 0 < j 1 < < j m 1 < j m n such that
r 0 w 1 , a 0 u 0 , δ ^ r j 0 w j 0 + 1 , a j 0 u j 0 , δ ^ r j 1 r j m 1 w j m 1 + 1 , a j m 1 u j m 1 , δ ^ r j m , where u 0 , u j 0 u j m 1 Γ * .
Since n is finite, this process of creation must stop at some point and at this point, j m = n .
Therefore, M E accepts w 1 w j 0 + 1 w j 1 + 1 w j m 1 + 1 .
By Claim, we have
w 2 = w 3 = = w j 0 = ϵ
w j 0 + 2 = w j 0 + 3 = w j 1 = ϵ
             
w j m 1 + 2 = w j m 1 + 3 = w j m = ϵ where j m = n .
Therefore, w 1 w j 0 + 1 w j 1 + 1 w j m 1 + 1 = w 1 w 2 w n = w .
Therefore, M E accepts w 1 , w 2 w n = w .
Therefore, w L ( M E ) and hence L ( M ) L ( M E ) .
Conversely, assume w L ( M E ) .
r 0 , r 1 r n Q E ;   a i Γ ϵ , b i Γ * for 0 i n 1 ; w 1 , w 2 w n Σ ϵ such that
w = w 1 w 2 w n and
q 0 = r 0 w 1 , a 0 b 0 , δ ^ r 1 w 2 , a 1 b 1 , δ ^ r 2 r i w i + 1 , a i b i , δ ^ r i + 1 r n 1 w n , a n 1 b n 1 , δ ^ r n , r n F .
Since Q E Q , r 0 , r 1 r n Q .
For all 0 i n 1 , ( r i + 1 , b i ) δ ^ r i , w i + 1 , a i δ ^ r i , w i + 1 , a i .
[If b i = ϵ ]
( r i + 1 , ϵ ) δ ^ r i , w i + 1 , a i By construction (ii), δ 1 r i , w i + 1 , a i = { ( r , ϵ ) | ( r , ϵ ) δ ^ r i , w i + 1 , a i } .
Therefore, ( r i + 1 , ϵ ) δ 1 r i , w i + 1 , a i .
Also by construction (ii), δ r i , w i + 1 , a i = δ 1 r i , w i + 1 , a i δ 2 r i , w i + 1 , a i .
Therefore, ( r i + 1 , ϵ ) δ r i , w i + 1 , a i .
Therefore, r i w i + 1 , a i ϵ , δ r i + 1 .
Therefore, r i w i + 1 , * , δ r i + 1 .
[If b i ϵ ]
b i ( 1 ) , b i ( 2 ) b i ( l ) Γ , l 1 such that b i = b i ( 1 ) b i ( 2 ) b i ( l ) .
By construction (ii), q 1 , q 2 , q l 1 Q such that
r i w i + 1 , a i b i ( l ) , δ q 1
q 1 ϵ , ϵ b i ( l 1 ) , δ q 2
q 2 ϵ , ϵ b i ( l 2 ) , δ q 3
             
q l 1 ϵ , ϵ b i ( 1 ) , δ r i + 1 .
Therefore, r i w i + 1 , * , δ r i + 1 .
Combining both cases of [ b i = ϵ ] and [ b i ϵ ], we have
r i w i + 1 , * , δ r i + 1 for all 0 i n 1 .
Therefore,
q 0 = r 0 w 1 , * , δ r 1 w 2 , * , δ r 2 r i w i + 1 , * δ r i + 1 r n 1 w n , * δ r n , r n F .
Therefore, M accepts w 1 w 2 w n = w .
w L ( M ) .
L ( M E ) L ( M ) .
This completes the proof of L M E = L ( M ) for the construction of M from M E .
Construction of M E from M .
Let M = ( Q , Σ , Γ , δ , q 0 , , F ) be a P D A .
Construct M E = ( Q , Σ , Γ , δ ^ , q 0 , , F ) where
δ ^ : Q × Σ ϵ × Γ ϵ ( Q × Γ * ) such that
q , a , s Q × Σ ϵ × Γ ϵ , δ ^ q , a , s = δ ( q , a , s ) .
(Note that this is possible because Γ ϵ Γ * .)
It remains to show that L M E = L ( M ) .
Let w = w 1 w 2 w n where w i Σ ϵ for 1 i n & n 1 .
Suppose w L ( M ) .
r 0 , r 1 r n Q , a i Γ ϵ , b i Γ ϵ for 0 i n 1 such that
q 0 = r 0 w 1 , a 0 b 0 , δ r 1 w 2 , a 1 b 1 , δ r 2 r i w i + 1 , a i b i , δ r i + 1 r n 1 w n , a n 1 b n 1 , δ r n , r n F .
For 0 i n 1 ,
since r i w i + 1 , a i b i , δ r i + 1 , ( r i + 1 , b i ) δ r i , w i + 1 , a i .
since δ ^ r i , w i + 1 , a i = δ ( r i , w i + 1 , a i ) , ( r i + 1 , b i ) δ ^ r i , w i + 1 , a i .
Therefore, r i w i + 1 , a i b i , δ ^ r i + 1 for 0 i n 1 .
Therefore,
q 0 = r 0 w 1 , a 0 b 0 , δ ^ r 1 w 2 , a 1 b 1 , δ ^ r 2 r i w i + 1 , a i b i , δ ^ r i + 1 r n 1 w n , a n 1 b n 1 , δ ^ r n , r n F .
M E accepts w .
w L ( M E ) .
L ( M ) L ( M E ) .
Conversely, suppose w L ( M E ) .
q 0 = r 0 w 1 , a 0 b 0 , δ ^ r 1 w 2 , a 1 b 1 , δ ^ r 2 r i w i + 1 , a i b i , δ ^ r i + 1 r n 1 w n , a n 1 b n 1 , δ ^ r n , r n F ,
where a i Γ ϵ , b i Γ * for 0 i n 1 .
For 0 i n 1 , r i w i + 1 , a i b i , δ ^ r i + 1 ( r i + 1 , b i ) δ ^ r i , w i + 1 , a i .
Since δ ^ r i , w i + 1 , a i = δ ( r i , w i + 1 , a i ) , ( r i + 1 , b i ) δ r i , w i + 1 , a i .
Therefore, r i w i + 1 , a i b i , δ r i + 1 for 0 i n 1 .
Therefore,
q 0 = r 0 w 1 , a 0 b 0 , δ r 1 w 2 , a 1 b 1 , δ r 2 r i w i + 1 , a i b i , δ r i + 1 r n 1 w n , a n 1 b n 1 , δ r n , r n F .
Therefore M accepts w and hence w L ( M ) .
Therefore, L ( M E ) L ( M ) .
This completes the proof of L M E = L ( M ) for the construction of M E from M .
Combining (A) and (B), we conclude the proof of Theorem 2.45.
Now that we have proved the equivalence of P D A and extended P D A , we shall no longer distinguish between M and M E or between δ and δ ^ . From here on, we shall be using extended P D A exclusively because it is a much more convenient tool for solving problems. We shall be using M and δ for all P D A s with the understanding that the P D A s that we are dealing with can write a string to the stack in one single step.
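The core of part (A) of the construction — splitting one extended transition that pushes u = u₁u₂⋯u_l into a chain of single-symbol pushes through newly created states — can be sketched as follows. The state names p1, p2, … stand in for the created states q₁, …, q_{l−1} and are illustrative only:

```python
EPS = ""  # ϵ

def split_push(q, a, s, r, u, fresh):
    """Turn one extended-PDA transition (r, u) ∈ δ̂(q, a, s) with |u| ≥ 2
    into a chain of ordinary transitions pushing one symbol per step.

    u[0] must end up on top of the stack, so the chain pushes u[-1]
    first and u[0] last.  `fresh` yields state names not in Q_E.
    Returns ((state, input sym, popped sym), (next state, pushed sym)) pairs.
    """
    chain, prev, trigger = [], q, (a, s)
    for sym in u[:0:-1]:                   # u[-1], ..., u[1]
        nxt = next(fresh)
        chain.append(((prev, *trigger), (nxt, sym)))
        prev, trigger = nxt, (EPS, EPS)    # later steps read nothing
    chain.append(((prev, *trigger), (r, u[0])))
    return chain

# (r, u1u2u3) ∈ δ̂(q, a, s) becomes q →[a, s→u3] p1 →[ϵ, ϵ→u2] p2 →[ϵ, ϵ→u1] r
print(split_push("q", "a", "s", "r", "XYZ", iter(["p1", "p2"])))
# [(('q', 'a', 's'), ('p1', 'Z')), (('p1', '', ''), ('p2', 'Y')), (('p2', '', ''), ('r', 'X'))]
```

As in the proof, the intermediate states read no input and pop nothing, which is what forces w_{i+2} = ⋯ = w_j = ϵ in the Claim.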
Definition 2.46 (Configurations of a P D A ). A configuration of a P D A ,   M = ( Q , Σ , Γ , δ , q 0 , , F ) is an element of Q × Σ * × Γ * describing the current state, the portion of the input still unread and the current stack contents at some point of a computation. For example, the configuration
( p , b a a a b b a , A B A C ) describes the situation as shown in the following diagram.
Preprints 161810 i004
Note that the portion of the input to the left of the input head, namely a b a b , has been read and cannot affect the computation hereon.
The start configuration on input w is defined as (q₀, w, ⊥). That is, the PDA always starts in its start state q₀, with the input head pointing to the leftmost input symbol and the stack containing only the initial stack symbol ⊥.
The next-configuration relation (denoted by ⊢_{1,M}, or simply ⊢_M) describes how the PDA moves from one configuration to another in one step. It is formally defined as follows.
Definition 2.47. Let M = (Q, Σ, Γ, δ, q₀, ⊥, F) be a PDA.
For all p, q ∈ Q, a ∈ Σ_ϵ, A ∈ Γ_ϵ, y ∈ Σ*, β ∈ Γ*, and γ ∈ Γ*,
p →[a, A→γ]_δ q ⇔_def (p, ay, Aβ) ⊢_{1,M} (q, y, γβ).
For any configurations C, D of M:
C ⊢_{0,M} D ⇔_def C = D.
C ⊢_{n+1,M} D ⇔_def there exists a configuration E such that C ⊢_{n,M} E and E ⊢_{1,M} D.
C ⊢_{*,M} D ⇔_def there exists n ≥ 0 such that C ⊢_{n,M} D.
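The relations ⊢_{1,M} and ⊢_{*,M} can be implemented directly on configurations. The sketch below writes stacks as strings with the top at the left and assumes the transition function is given as a Python dict (a representation of my choosing, not prescribed by the text); string pushes are allowed, so it covers extended PDAs as well:

```python
EPS = ""  # ϵ

def step(delta, config):
    """One application of ⊢_{1,M}: yield every configuration reachable
    from (state, unread input, stack) in a single move."""
    state, inp, stack = config
    for (q, a, b), moves in delta.items():
        if q != state or (a != EPS and not inp.startswith(a)):
            continue
        if b != EPS and not stack.startswith(b):
            continue
        for nxt, c in moves:
            yield (nxt, inp[len(a):], c + stack[len(b):])

def reachable(delta, config, limit=10_000):
    """⊢_{*,M}: all configurations reachable in any number of steps,
    bounded by `limit` since the configuration space can be infinite."""
    seen, frontier = set(), [config]
    while frontier and len(seen) < limit:
        c = frontier.pop()
        if c in seen:
            continue
        seen.add(c)
        frontier.extend(step(delta, c))
    return seen

# Hypothetical one-state machine pushing A on each a:
DELTA = {("p", "a", EPS): {("p", "A")}}
print(sorted(reachable(DELTA, ("p", "aa", ""))))
# [('p', '', 'AA'), ('p', 'a', 'A'), ('p', 'aa', '')]
```

Note that the start configuration itself is in the result, reflecting the reflexive case C ⊢_{0,M} C.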
Proposition 2.48. 
Let M = (Q, Σ, Γ, δ, q₀, ⊥, F) be a PDA.
For any w = w₁w₂⋯w_m where w₁, w₂, …, w_m ∈ Σ_ϵ, M accepts w iff (q₀, w, ⊥) ⊢_{*,M} (q, ϵ, γ) for some q ∈ F and γ ∈ Γ*.
Proof. 
By Definition 2.44, M accepts w = w 1 w 2 w m iff r 0 ,   r 1 r m Q , a i Γ ϵ , b i Γ * for 0 i m 1 such that
q 0 = r 0 w 1 , a 0 b 0 r 1 w 2 , a 1 b 1 r 2 r i w i + 1 , a i b i r i + 1 r m 1 w m , a m 1 b m 1 r m , r m F .
By Definition 2.47, each one-step transitional movement is equivalent to one step of configuration movement.
For all 0 i m 1 , configurations C i and C i + 1 such that
r i w i + 1 , a i b i r i + 1 C i 1 , M C i + 1 The above transitional computation is equivalent to
C 0 1 , M C 1 1 , M C 2 C i 1 , M C i + 1 C m 1 1 , M C m , r m F .
That is, C 0 n , M C m , r m F .
That is, C 0 * , M C m , r m F .
Since C 0 = ( q 0 , w , ) and C m = ( r m , ϵ , γ ) where γ Γ * is the final stack content, we have
( q 0 , w , ) * , M ( r m , ϵ , γ ) , r m F .
Therefore, q 0 , w , * , M ( q , ϵ , γ ) for some q F and γ Γ * .
This completes the proof of Proposition 2.48.
Proposition 2.49. 
Let M = (Q, Σ, Γ, δ, q₀, ⊥, F) and M′ = (Q′, Σ, Γ′, δ′, q₀′, ⊥′, F′) be two PDAs such that
Q ⊆ Q′, Γ ⊆ Γ′, and
δ′(q, a, A) = δ(q, a, A) for all (q, a, A) ∈ Q × Σ_ϵ × Γ_ϵ.
For all p, q ∈ Q, u, v ∈ Σ*, and α, β ∈ Γ*, the following statements hold:
(a) (p, u, α) ⊢_{1,M} (q, v, β) ⟺ (p, u, α) ⊢_{1,M′} (q, v, β)
(b) (p, u, α) ⊢_{n,M} (q, v, β) ⟺ (p, u, α) ⊢_{n,M′} (q, v, β) for any n ≥ 0
Proof.
(a)
p , u , α 1 , M q , v , β p a , A γ , δ q where u = a v , a Σ ϵ , α = A η , β = γ η , A Γ ϵ , γ , η Γ * .
Since Q Q ' , p , q Q p , q Q ' .
Since Γ Γ ' , A Γ ϵ A Γ ϵ ' .
Since Γ Γ ' , Γ * Γ ' * and hence γ Γ * γ Γ ' * .
Since δ ' p , a , A = δ p , a , A for all p , a , A Q × Σ ϵ × Γ ϵ , we have
p a , A γ , δ q p a , A γ , δ ' q
p , a v , A η , 1 , M ' q , v , γ η
p , u , α 1 , M ' q , v , β Conversely,
p , u , α 1 , M ' q , v , β where p , q Q , u , v Σ * and α , β Γ *
p a , A γ , δ ' q where p , q Q , u , v Σ * , α , β Γ * , a Σ ϵ , A Γ ϵ ' , γ Γ ' * , α = A η , β = γ η .
Since α Γ * & α = A η , A Γ ϵ and η Γ * .
Since β Γ * & β = γ η , γ Γ * and η Γ * .
Therefore p , a , A Q × Σ ϵ × Γ ϵ and hence δ ' p , a , A = δ p , a , A .
Therefore,
p a , A γ , δ ' q p a , A γ , δ q p , u , α 1 , M q , v , β .
(b)
This part can be proved by using the result of (a) along with an induction argument on the number of steps.
This completes the proof of Proposition 2.49.
Proposition 2.50. 
Let M = (Q, Σ, Γ, δ, q₀, ⊥, F) be a PDA. For all p, q ∈ Q, x, y, w ∈ Σ*, α, β, γ ∈ Γ*, and every integer n ≥ 1,
(p, x, α) ⊢_{n,M} (q, y, β) ⟹ (p, xw, αγ) ⊢_{n,M} (q, yw, βγ).
Proof. 
The proof is by induction on n .
For n = 1 , assume p , x , α 1 , M ( q , y , β ) .
a Σ ϵ , A Γ ϵ and η , θ Γ * such that
x = a y , α = A η , β = θ η and p a , A θ , δ q .
Since p a , A θ , δ q , and x w = a y w , α γ = A η γ , and β γ = θ η γ ,
p , x w , α γ 1 , M ( q , y w , β γ ) .
Therefore, the statement is true for n = 1 .
For induction hypothesis,
p , x , α k , M ( q , y , β ) p , x w , α γ k , M ( q , y w , β γ ) for any integer k 1 .
For n = k + 1 , assume p , x , α k + 1 , M ( q , y , β ) .
p ' Q ,   x ' Σ * , α ' Γ * such that
p , x , α k , M ( p ' , x ' , α ' ) and ( p ' , x ' , α ' ) 1 , M ( q , y , β ) .
By induction hypothesis, we have
p , x w , α γ k , M ( p ' , x ' w , α ' γ ) .
Since the statement is true for n = 1 , we also have
( p ' , x ' w , α ' γ ) 1 , M ( q , y w , β γ ) .
Combining the two computations, we have
p , x w , α γ k + 1 , M ( q , y w , β γ ) .
This completes the proof of Proposition 2.50.
The P D A s that we have dealt with thus far accept an input by entering an accept state upon reading the entire input. We call this kind of P D A a P D A that accepts by final state. There is another kind of P D A that accepts an input by popping the last symbol off the stack (without pushing any other symbol back on) upon reading the entire input. We call this kind of P D A a P D A that accepts by empty stack. It turns out that the two kinds of P D A s are equivalent in that given one, we can construct the other such that the two recognize the same language. Before we prove the equivalence of these two kinds of P D A s, we need a formal definition for P D A s that accept by empty stack.
Definition 2.51. 
A PDA that accepts by empty stack is a 6-tuple M_e = (Q, Σ, Γ, δ, q₀, ⊥_e), where Q, Σ, Γ, δ, q₀, ⊥_e are defined as in a PDA that accepts by final state.
M_e computes as follows:
Let w = w₁w₂⋯w_m where w_i ∈ Σ_ϵ for 1 ≤ i ≤ m and m ≥ 1.
M_e accepts w iff (q₀, w, ⊥_e) ⊢_{*,M_e} (q, ϵ, ϵ) for some q ∈ Q.
(Note that the set of accept states, namely F, is not needed in the definition of acceptance by empty stack.)
Lemma 2.52. For any P D A , M e , that accepts by empty stack, there is a P D A , M f , that accepts by final state such that L M e = L ( M f ) .
Proof. 
Let M_e = (Q, Σ, Γ, δ, q₀, ⊥_e), where ⊥_e ∈ Γ is the initial stack symbol of M_e.
Construct M_f = (Q_f, Σ, Γ_f, δ_f, q_start, ⊥_f, {q_accept}), where q_start and q_accept are newly created states (not in Q), with q_start serving as the start state of M_f and q_accept as its sole accept state.
⊥_f is a newly created stack symbol (not in Γ) serving as the initial stack symbol of M_f.
Q_f = Q ∪ {q_start, q_accept}
Γ_f = Γ ∪ {⊥_f}
The transition function δ_f of M_f is defined as follows:
T1: δ_f(q_start, ϵ, ⊥_f) = {(q₀, ⊥_e⊥_f)}, i.e., q_start →[ϵ, ⊥_f→⊥_e⊥_f]_{δ_f} q₀.
T2: δ_f(q, ϵ, ⊥_f) = {(q_accept, ϵ)} for every q ∈ Q, i.e., q →[ϵ, ⊥_f→ϵ]_{δ_f} q_accept.
T3: δ_f(q, a, A) = δ(q, a, A) for every (q, a, A) ∈ Q × Σ_ϵ × Γ_ϵ, where δ : Q × Σ_ϵ × Γ_ϵ → P(Q × Γ*).
T4: δ_f(q, a, A) = ∅ for every other (q, a, A) ∈ Q_f × Σ_ϵ × (Γ_f)_ϵ.
The construction is now complete. It remains to show L M e = L ( M f ) .
Suppose w L M e .
q 0 , w , e n , M e ( q , ϵ , ϵ ) for some n 0 & q Q .
By T1, δ f q s t a r t , ϵ , f = q 0 , e f .
Therefore, q s t a r t , w , f 1 , M f q 0 , w , e f .
By Proposition 2.49, we have
q 0 , w , e n , M e q , ϵ , ϵ q 0 , w , e n , M f q , ϵ , ϵ .
By Proposition 2.50, we have
q 0 , w , e n , M f q , ϵ , ϵ q 0 , w ϵ , e f n , M f q , ϵ ϵ , ϵ f .
That is, q 0 , w , e n , M f q , ϵ , ϵ q 0 , w , e f n , M f q , ϵ , f .
Also by T2, δ f q , ϵ , f = q a c c e p t , ϵ .
Therefore, q , ϵ , f 1 , M f q a c c e p t , ϵ , ϵ .
Combining, we have
q s t a r t , w , f 1 , M f q 0 , w , e f n , M f q , ϵ , f 1 , M f q a c c e p t , ϵ , ϵ .
Therefore, q s t a r t , w , f * , M f q a c c e p t , ϵ , ϵ .
Therefore, M f accepts w .
Therefore, w ∈ L(M_f), and hence L(M_e) ⊆ L(M_f).
Conversely, assume w L ( M f ) .
q s t a r t , w , f * , M f q a c c e p t , ϵ , γ for some γ Γ f * .
Since there exists no transition in one step to go from q s t a r t to q a c c e p t , there must exist configurations ( q 1 , u 1 , γ 1 ) , ( q 2 , u 2 , γ 2 ) , ( q i , u i , γ i ) , q n , u n , γ n where n 1 , u i Σ * , γ i   Γ f * for 1 i n , such that
q s t a r t , w , f 1 , M f ( q 1 , u 1 , γ 1 ) 1 , M f 1 , M f ( q i , u i , γ i ) 1 , M f 1 , M f q n , u n , γ n 1 , M f   q a c c e p t , ϵ , γ .
Note that q i q s t a r t because each q i has both incoming and outgoing arrows whereas q s t a r t has only outgoing arrows and q i q a c c e p t because q a c c e p t has only incoming arrows.
Therefore, for 1 i n , q i Q .
Claim 1. q 1 , u 1 , γ 1 = q 0 , w , e f .
δ f q s t a r t , ϵ , f = q 0 , e f by T1.
Therefore, q s t a r t , w , f 1 , M f q 0 , w , e f .
Since q s t a r t , w , f 1 , M f ( q 1 , u 1 , γ 1 ) , and by T4, δ f q s t a r t , a , A = for any other combination of a , A ( ϵ , f ) , we must have
q 1 , u 1 , γ 1 = q 0 , w , e f .
Claim 2. For 1 i n , γ i ' Γ * such that γ i = γ i ' f .
Claim 2 can be proved by induction on i .
For i = 1 ,
q 1 , u 1 , γ 1 = q 0 , w , e f (By Claim 1)
Therefore, γ 1 = e f .
Take γ 1 ' = e .
γ 1 = γ 1 ' f .
e Γ e Γ * γ 1 ' Γ * .
The statement is true for i = 1 .
For induction hypothesis ( i = k ) , assume γ k = γ k ' f for 1 k n 1 , γ k ' Γ * .
Consider configuration move of
q k , u k , γ k 1 , M f q k + 1 , u k + 1 , γ k + 1 which is equivalent to q k a , b c , δ f q k + 1 where
a Σ ϵ , b Γ f ϵ , c Γ f * , u k = a u k + 1 , γ k = b γ k " , γ k + 1 = c γ k " , γ k " Γ f * .
Since 1 k < k + 1 n , q k , q k + 1 Q .
This configuration move could not have come from T1 because q k q s t a r t .
By induction hypothesis, γ k = γ k ' f & γ k ' Γ * .
We examine two situations: (i) γ k ' = ϵ and (ii) γ k ' ϵ .
(i) If γ k ' = ϵ
γ k = f
b = ϵ or b = f .
If b = f , q k a , f c , δ f q k + 1 .
This transition must have come from T2 where
δ f q k , ϵ , f = q a c c e p t , ϵ , a = ϵ , c = ϵ .
Therefore, q k + 1 = q a c c e p t , which contradicts q k + 1 Q .
Therefore, b = ϵ .
Therefore, q k , a , b = q k , a , ϵ Q × Σ ϵ × Γ ϵ .
By T3, δ f q k , a , b = δ q k , a , b .
Therefore, q k a , ϵ c , δ q k + 1 .
Therefore, c Γ * .
γ k = f = b γ k " .
Since b = ϵ , γ k " = f .
Therefore, γ k + 1 = c γ k " = c f .
The statement is true for i = k + 1 .
(ii) If γ k ' ϵ Since γ k ' Γ * , f is not a symbol in γ k ' .
Since γ k = γ k ' f , the leftmost symbol of γ k cannot be f .
Therefore, the configuration move of q k , u k , γ k 1 , M f q k + 1 , u k + 1 , γ k + 1 could not have come from T2.
Therefore, it must have come from T3 where δ = δ f .
Therefore, q k , u k , γ k 1 , M e q k + 1 , u k + 1 , γ k + 1 by Proposition 2.49.
Therefore, q k a , b c , δ q k + 1 where
a Σ ϵ , b Γ ϵ , c Γ * , u k = a u k + 1 , γ k = b γ k " , γ k + 1 = c γ k " , γ k " Γ f * .
Note that b f because f Γ ϵ .
By induction hypothesis, γ k = γ k ' f & γ k ' Γ * .
Therefore, γ k = γ k ' f = b γ k " .
γ k " = ϵ γ k ' f = b γ k ' f Γ ϵ , which is a contradiction because f Γ ϵ .
Therefore, γ k " ϵ .
The rightmost symbol of γ k " must be f .
γ k ' ' ' Γ f * such that γ k " = γ k ' ' ' f .
Therefore, γ k ' f = b γ k ' ' ' f .
Therefore, γ k ' = b γ k ' ' ' .
Since γ k ' Γ * by induction hypothesis, γ k ' ' ' Γ * .
γ k + 1 = c γ k " = c γ k ' ' ' f = γ k + 1 ' f where γ k + 1 ' = c γ k ' ' ' .
Since c Γ * & γ k ' ' ' Γ * , γ k + 1 ' Γ * .
The statement is also true for i = k + 1 .
Therefore, the statement is true for i = k + 1 whether or not γ k ' = ϵ .
This completes the proof of Claim 2.
Claim 3. 
q n , u n , γ n = q n , ϵ , f .
We know from above that q n , u n , γ n 1 , M f   q a c c e p t , ϵ , γ .
The only way to transition from a state in Q to q a c c e p t is via T2 where
δ f q n , ϵ , f = q a c c e p t , ϵ .
Equivalently, q n ϵ , f ϵ , δ f q a c c e p t .
Therefore, q n , u n , γ n 1 , M f   q a c c e p t , u n , γ n " where γ n = f γ n " & γ n " Γ f * .
Therefore, q a c c e p t , u n , γ n " = q a c c e p t , ϵ , γ .
Therefore, u n = ϵ & γ = γ n " .
By Claim 2, γ n = γ n ' f where γ n ' Γ * .
Therefore, γ n = γ n ' f = f γ n " .
If γ n " ϵ , its rightmost symbol must be f .
Let γ n " = γ n ' ' ' f for some γ n ' ' ' Γ f * .
Therefore, γ n ' f = f γ n ' ' ' f .
Therefore, γ n ' = f γ n ' ' ' , which is a contradiction because by Claim 2, γ n ' Γ * but f is not in Γ * .
Therefore, γ n " = ϵ .
γ = γ n " = ϵ .
γ n = f γ n " = f .
q n , u n , γ n = q n , ϵ , f and Claim 3 is proved.
Claim 4. 
$(q_i, u_i, \gamma_i') \vdash^1_{M_e} (q_{i+1}, u_{i+1}, \gamma_{i+1}')$ for $1 \le i \le n-1$.
From above, we have $(q_i, u_i, \gamma_i) \vdash^1_{M_f} (q_{i+1}, u_{i+1}, \gamma_{i+1})$.
By Claim 2, $(q_i, u_i, \gamma_i' f) \vdash^1_{M_f} (q_{i+1}, u_{i+1}, \gamma_{i+1}' f)$ where
$\gamma_i', \gamma_{i+1}' \in \Gamma^*$ and, for some $a \in \Sigma_\epsilon$, $u_i = au_{i+1}$ with $u_i, u_{i+1} \in \Sigma^*$.
Equivalently, $q_i \xrightarrow{a,\; b \to c}_{\delta_f} q_{i+1}$ where $\gamma_i' f = b\eta$ and $\gamma_{i+1}' f = c\eta$, with $b \in (\Gamma_f)_\epsilon$, $c \in \Gamma_f^*$, $\eta \in \Gamma_f^*$.
Since $q_i \neq q_{start}$, the above move could not have come from T1.
Since $q_{i+1} \neq q_{accept}$, it could not have come from T2.
Therefore, it must have come from T3, where $\delta_f(q_i, a, b) = \delta(q_i, a, b)$.
Therefore, $b \in \Gamma_\epsilon$, $c \in \Gamma^*$, and $q_i \xrightarrow{a,\; b \to c}_{\delta} q_{i+1}$.
Since $\gamma_i' f = b\eta$ and $b \neq f$, the rightmost symbol of $\eta$ must be $f$.
Therefore, $\eta = \theta f$ for some $\theta \in \Gamma_f^*$.
Therefore, $\gamma_i' f = b\theta f$ and $\gamma_{i+1}' f = c\theta f$.
Therefore, $\gamma_i' = b\theta$ and $\gamma_{i+1}' = c\theta$.
Since $\gamma_i' \in \Gamma^*$ by Claim 2, $\theta \in \Gamma^*$.
Therefore, $(q_i, u_i, \gamma_i') \vdash^1_{M_e} (q_{i+1}, u_{i+1}, \gamma_{i+1}')$.
This completes the proof of Claim 4.
By Claim 4, we now have
$(q_1, u_1, \gamma_1') \vdash^1_{M_e} (q_2, u_2, \gamma_2') \vdash^1_{M_e} \cdots \vdash^1_{M_e} (q_{n-1}, u_{n-1}, \gamma_{n-1}') \vdash^1_{M_e} (q_n, u_n, \gamma_n')$.
By Claim 1, $(q_1, u_1, \gamma_1) = (q_0, w, ef)$.
Therefore, $q_1 = q_0$, $u_1 = w$, $\gamma_1 = ef$.
By Claim 2, $\gamma_1 = \gamma_1' f$.
Therefore, $\gamma_1' f = ef$, and hence $\gamma_1' = e$.
Therefore, $(q_1, u_1, \gamma_1') = (q_0, w, e)$.
By Claim 3, $(q_n, u_n, \gamma_n) = (q_n, \epsilon, f)$.
Therefore, $u_n = \epsilon$ and $\gamma_n = f$.
By Claim 2, $\gamma_n = \gamma_n' f$.
Therefore, $\gamma_n' f = f$, and hence $\gamma_n' = \epsilon$.
Therefore, $(q_n, u_n, \gamma_n') = (q_n, \epsilon, \epsilon)$.
Therefore,
$(q_0, w, e) \vdash^1_{M_e} (q_2, u_2, \gamma_2') \vdash^1_{M_e} \cdots \vdash^1_{M_e} (q_{n-1}, u_{n-1}, \gamma_{n-1}') \vdash^1_{M_e} (q_n, \epsilon, \epsilon)$.
That is, $(q_0, w, e) \vdash^*_{M_e} (q_n, \epsilon, \epsilon)$.
Therefore, $M_e$ accepts $w$, i.e., $w \in L(M_e)$.
Therefore, $w \in L(M_f) \Rightarrow w \in L(M_e)$, so $L(M_f) \subseteq L(M_e)$.
This completes the proof of Lemma 2.52.
Lemma 2.53. For any $PDA$, $M_f$, that accepts by final state, there is a $PDA$, $M_e$, that accepts by empty stack such that $L(M_e) = L(M_f)$.
Proof. Let $M_f = (Q, \Sigma, \Gamma, \delta, q_0, f, F)$ where $f \in \Gamma$ is the initial stack symbol of $M_f$.
Construct $M_e = (Q_e, \Sigma, \Gamma_e, \delta_e, q_{start}, e)$ where
$Q_e = Q \cup \{q_{start}, q_{empty}\}$; $\Gamma_e = \Gamma \cup \{e\}$;
$q_{start}$ and $q_{empty}$ are newly created states (not in $Q$), with $q_{start}$ serving as the start state of $M_e$ and $q_{empty}$ serving as the state in which $M_e$ begins the process of emptying the stack (without further consuming input); and
$e$ is a newly created stack symbol (not in $\Gamma$) serving as the initial stack symbol of $M_e$.
The transition function $\delta_e$ of $M_e$ is defined as follows.
T1: $\delta_e(q_{start}, \epsilon, e) = \{(q_0, fe)\}$, i.e., $q_{start} \xrightarrow{\epsilon,\; e \to fe}_{\delta_e} q_0$.
T2: $\forall (q, a, A) \in Q \times \Sigma_\epsilon \times \Gamma_\epsilon$, $\delta_e(q, a, A) = \delta(q, a, A)$, where $\delta : Q \times \Sigma_\epsilon \times \Gamma_\epsilon \to \mathcal{P}(Q \times \Gamma^*)$.
T3: $\forall q \in F$, $\delta_e(q, \epsilon, \epsilon) = \{(q_{empty}, \epsilon)\}$, i.e., $q \xrightarrow{\epsilon,\; \epsilon \to \epsilon}_{\delta_e} q_{empty}$.
T4: $\forall A \in (\Gamma_e)_\epsilon$, $\delta_e(q_{empty}, \epsilon, A) = \{(q_{empty}, \epsilon)\}$, i.e., $q_{empty} \xrightarrow{\epsilon,\; A \to \epsilon}_{\delta_e} q_{empty}$.
T5: For any other $(q, a, A) \in Q_e \times \Sigma_\epsilon \times (\Gamma_e)_\epsilon$, $\delta_e(q, a, A) = \emptyset$.
The construction is now complete. It remains to show $L(M_e) = L(M_f)$.
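The transitions T1–T4 are mechanical enough to exercise in code. The following is a minimal Python sketch of our own (not notation from the text): `to_empty_stack` builds $\delta_e$ from $\delta$ per T1–T3, the simulator handles T4 directly and tests acceptance by empty stack with a bounded breadth-first search, and the example $M_f$ for $\{a^nb^n : n \ge 1\}$ is a hypothetical illustration.

```python
from collections import deque

def to_empty_stack(delta, q0, f, F):
    """Build delta_e of M_e from M_f per T1-T3.  A transition table maps
    (state, input, top) -> set of (state, push); '' stands for epsilon.
    The stack is a string whose leftmost character is the top."""
    d = {('q_start', '', 'e'): {(q0, f + 'e')}}        # T1: push f on top of e
    for key, moves in delta.items():                   # T2: copy M_f's moves
        d.setdefault(key, set()).update(moves)
    for q in F:                                        # T3: accept states may
        d.setdefault((q, '', ''), set()).add(('q_empty', ''))  # enter q_empty
    return d                                           # T4 is handled below

def accepts_by_empty_stack(delta_e, w, limit=10000):
    """Bounded BFS over configurations (state, unread input, stack)."""
    seen, todo = set(), deque([('q_start', w, 'e')])
    while todo and len(seen) < limit:
        cfg = todo.popleft()
        if cfg in seen:
            continue
        seen.add(cfg)
        q, u, g = cfg
        if u == '' and g == '':
            return True                                # input read, stack empty
        if q == 'q_empty' and g:
            todo.append((q, u, g[1:]))                 # T4: pop unconditionally
            continue
        for (p, a, top), moves in delta_e.items():
            if p != q or (a and not u.startswith(a)) or (top and not g.startswith(top)):
                continue
            for (r, push) in moves:
                todo.append((r, u[len(a):], push + g[len(top):]))
    return False

# Hypothetical M_f accepting {a^n b^n : n >= 1} by final state (q2 in F):
delta_f = {
    ('q0', 'a', ''):  {('q0', 'A')},    # push one A per a
    ('q0', 'b', 'A'): {('q1', '')},     # first b pops an A
    ('q1', 'b', 'A'): {('q1', '')},     # remaining b's pop A's
    ('q1', '', 'Z'):  {('q2', 'Z')},    # bottom marker reached: accept state
}
delta_e = to_empty_stack(delta_f, 'q0', 'Z', {'q2'})
```

The bounded search is only a demonstration aid: a real PDA simulator must guard against unproductive $\epsilon$-loops, which the `seen` set and the `limit` do here.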
Suppose $w \in L(M_f)$.
Then $(q_0, w, f) \vdash^n_{M_f} (q, \epsilon, \gamma)$ for some $n \ge 0$, $q \in F$, $\gamma \in \Gamma^*$.
By T1, $\delta_e(q_{start}, \epsilon, e) = \{(q_0, fe)\}$.
Therefore, $(q_{start}, w, e) \vdash^1_{M_e} (q_0, w, fe)$.
By Proposition 2.49, we have
$(q_0, w, f) \vdash^n_{M_f} (q, \epsilon, \gamma) \Rightarrow (q_0, w, f) \vdash^n_{M_e} (q, \epsilon, \gamma)$ (because $Q \subseteq Q_e$ and $\delta_e(q, a, A) = \delta(q, a, A)$).
By Proposition 2.50, we have
$(q_0, w, f) \vdash^n_{M_e} (q, \epsilon, \gamma) \Rightarrow (q_0, w\epsilon, fe) \vdash^n_{M_e} (q, \epsilon\epsilon, \gamma e)$.
That is, $(q_0, w, f) \vdash^n_{M_e} (q, \epsilon, \gamma) \Rightarrow (q_0, w, fe) \vdash^n_{M_e} (q, \epsilon, \gamma e)$.
Therefore, $(q_{start}, w, e) \vdash^1_{M_e} (q_0, w, fe) \vdash^n_{M_e} (q, \epsilon, \gamma e)$.
Therefore, $(q_{start}, w, e) \vdash^*_{M_e} (q, \epsilon, \gamma e)$.
By T3, $(q, \epsilon, \gamma e) \vdash^1_{M_e} (q_{empty}, \epsilon, \gamma e)$.
By repeated application of T4, $(q_{empty}, \epsilon, \gamma e) \vdash^*_{M_e} (q_{empty}, \epsilon, \epsilon)$.
Combined, $(q_{start}, w, e) \vdash^*_{M_e} (q, \epsilon, \gamma e) \vdash^1_{M_e} (q_{empty}, \epsilon, \gamma e) \vdash^*_{M_e} (q_{empty}, \epsilon, \epsilon)$.
Therefore, $(q_{start}, w, e) \vdash^*_{M_e} (q_{empty}, \epsilon, \epsilon)$.
Therefore, $M_e$ accepts $w$ and $w \in L(M_e)$.
Therefore, $L(M_f) \subseteq L(M_e)$.
Conversely, assume $w \in L(M_e)$.
There exist configurations $(q_1, u_1, \gamma_1), (q_2, u_2, \gamma_2), \ldots, (q_n, u_n, \gamma_n)$ where
$n \ge 0$, $u_i \in \Sigma^*$, $\gamma_i \in \Gamma_e^*$ for $1 \le i \le n$, and $q_1, q_2, \ldots, q_n \in Q_e$ such that
$(q_{start}, w, e) \vdash^1_{M_e} (q_1, u_1, \gamma_1) \vdash^1_{M_e} \cdots \vdash^1_{M_e} (q_i, u_i, \gamma_i) \vdash^1_{M_e} \cdots \vdash^1_{M_e} (q_n, u_n, \gamma_n) \vdash^1_{M_e} (q, \epsilon, \epsilon)$.
Note that $q_{start} \notin \{q_1, q_2, \ldots, q_n\}$ because $q_{start}$ has no incoming arrows.
If $n = 0$, then $(q_{start}, w, e) \vdash^1_{M_e} (q, \epsilon, \epsilon)$.
By T1, $(q_{start}, w, e) \vdash^1_{M_e} (q_0, w, fe)$, and this is the only configuration move out of $(q_{start}, w, e)$ because $\delta_e(q_{start}, \epsilon, e) = \{(q_0, fe)\}$.
Therefore, $(q, \epsilon, \epsilon) = (q_0, w, fe)$, which is a contradiction because $fe \neq \epsilon$.
Therefore, $n \ge 1$.
In addition, we have $(q_1, u_1, \gamma_1) = (q_0, w, fe)$.
To move from $(q_0, w, fe)$ to the final configuration $(q, \epsilon, \epsilon)$, $M_e$ must pop $e$ at some point, which can only be done by T4, because T1 does not pop $e$ and T2 and T3 cannot move on $e$.
Therefore, we must have $q_{empty}$ somewhere between $(q_0, w, fe)$ and $(q, \epsilon, \epsilon)$.
However, we can only transition into $q_{empty}$ from a state $p \in F$ (T3).
Therefore, there must be a $p \in F$ somewhere between $(q_0, w, fe)$ and $(q, \epsilon, \epsilon)$.
Let $m = \mathrm{Max}\{\, i \mid 1 \le i \le n,\; q_i \in F \,\}$.
Claim 1. For $1 \le i \le m$, $q_i \neq q_{empty}$, and hence $\gamma_i$ has $e$ as its rightmost symbol.
To prove Claim 1, assume for contradiction that $\exists k$ such that $1 \le k \le m$ and $q_k = q_{empty}$.
By T4, $q_{k+1} = q_{k+2} = \cdots = q_m = q_{empty}$.
This contradicts $q_m \in F$.
Therefore, $q_i \neq q_{empty}$ for $1 \le i \le m$.
As mentioned above, $(q_1, u_1, \gamma_1) = (q_0, w, fe)$.
Therefore, $\gamma_1 = fe$.
Since the only way to pop $e$ is by T4, which requires being in $q_{empty}$, and no $q_i$ with $1 \le i \le m$ is $q_{empty}$, we conclude that $e$ remains sitting at the bottom of the stack as the machine moves from $q_1$ to $q_m$. Therefore, $\gamma_i$ has $e$ as its rightmost symbol for $1 \le i \le m$.
Claim 2. For the configuration $(q_m, u_m, \gamma_m)$, $u_m = \epsilon$.
To prove Claim 2, assume for contradiction that $u_m \neq \epsilon$.
At this configuration, there are two possible ways for the machine to move: (i) continue to simulate $M_f$ using T2, or (ii) enter $q_{empty}$ using T3.
For (i), the machine continues to read the input but never enters an accept state again, because it has passed $q_m$, the last accept state in this computation. By the time the machine finishes reading the entire input, it comes to a stop without ever having entered $q_{empty}$. Thus $e$ remains in the stack when everything stops, contradicting the assumption that the computation ends at $(q, \epsilon, \epsilon)$.
For (ii), the machine enters $q_{empty}$ via the T3 transition
$q_m \xrightarrow{\epsilon,\; \epsilon \to \epsilon}_{\delta_e} q_{empty}$.
Then $q_{m+1} = q_{empty}$, $u_{m+1} = u_m$, $\gamma_{m+1} = \gamma_m$.
That is, $(q_m, u_m, \gamma_m) \vdash^1_{M_e} (q_{empty}, u_m, \gamma_m)$. Once the machine has entered $q_{empty}$, it follows T4, which is $q_{empty} \xrightarrow{\epsilon,\; A \to \epsilon}_{\delta_e} q_{empty}$, popping symbols from the stack while remaining in $q_{empty}$ and reading no input.
Therefore, $q_{m+1} = q_{m+2} = \cdots = q_n = q = q_{empty}$ and
$u_m = u_{m+1} = u_{m+2} = \cdots = u_n = \epsilon$ (the last configuration is $(q, \epsilon, \epsilon)$).
Therefore, both (i) and (ii) contradict the original assumption that $u_m \neq \epsilon$.
Therefore, $u_m = \epsilon$.
Claim 3. 
For $1 \le i \le m$, $\exists \gamma_i' \in \Gamma^*$ such that $\gamma_i = \gamma_i' e$.
The proof of Claim 3 is by induction on $i$.
We showed at the beginning that $(q_1, u_1, \gamma_1) = (q_0, w, fe)$. Therefore, $\gamma_1 = fe$.
Since $f \in \Gamma$ and $\Gamma \subseteq \Gamma^*$, $f \in \Gamma^*$.
Taking $\gamma_1' = f$, we have $\gamma_1 = \gamma_1' e$.
The statement is true for $i = 1$.
For the induction hypothesis, assume $\gamma_k = \gamma_k' e$ with $\gamma_k' \in \Gamma^*$ for some $1 \le k \le m-1$.
We showed at the beginning that $q_{start} \notin \{q_1, q_2, \ldots, q_n\}$, and $q_{empty} \notin \{q_1, q_2, \ldots, q_m\}$ by Claim 1.
Therefore, $q_1, q_2, \ldots, q_m \in Q$.
Consider the configuration move
$(q_k, u_k, \gamma_k) \vdash^1_{M_e} (q_{k+1}, u_{k+1}, \gamma_{k+1})$.
This move could not have come from T3 or T4 because $q_k \neq q_{empty}$ and $q_{k+1} \neq q_{empty}$.
The move must have come from T2, where $\delta_e = \delta$.
By Proposition 2.49, we have
$(q_k, u_k, \gamma_k) \vdash^1_{M_f} (q_{k+1}, u_{k+1}, \gamma_{k+1})$.
Therefore, $q_k \xrightarrow{a,\; b \to c}_{\delta} q_{k+1}$ where
$a \in \Sigma_\epsilon$, $b \in \Gamma_\epsilon$, $c \in \Gamma^*$, $u_k = au_{k+1}$, $\gamma_k = b\gamma_k''$, $\gamma_{k+1} = c\gamma_k''$, $\gamma_k'' \in \Gamma_e^*$.
Note that $b \in \Gamma_\epsilon \Rightarrow b \neq e$.
By the induction hypothesis, $\gamma_k = \gamma_k' e$ and $\gamma_k' \in \Gamma^*$.
Therefore, $\gamma_k' e = b\gamma_k''$.
If $\gamma_k'' = \epsilon$, then $\gamma_k' e = b$, which forces $\gamma_k' = \epsilon$ and $b = e$, a contradiction.
Therefore, $\gamma_k'' \neq \epsilon$.
The rightmost symbol of $\gamma_k''$ must be $e$ (because $\gamma_k' e = b\gamma_k''$).
$\exists \gamma_k''' \in \Gamma_e^*$ such that $\gamma_k'' = \gamma_k''' e$.
Therefore, $\gamma_k' e = b\gamma_k''' e$.
Therefore, $\gamma_k' = b\gamma_k'''$.
Since $\gamma_k' \in \Gamma^*$ by the induction hypothesis, $\gamma_k''' \in \Gamma^*$.
$\gamma_{k+1} = c\gamma_k'' = c\gamma_k''' e = \gamma_{k+1}' e$ where $\gamma_{k+1}' = c\gamma_k'''$.
Since $c \in \Gamma^*$ and $\gamma_k''' \in \Gamma^*$, $\gamma_{k+1}' \in \Gamma^*$.
The statement is also true for $i = k+1$.
This completes the proof of Claim 3.
Claim 4. 
$(q_i, u_i, \gamma_i') \vdash^1_{M_f} (q_{i+1}, u_{i+1}, \gamma_{i+1}')$ for $1 \le i \le m-1$.
By assumption, $(q_i, u_i, \gamma_i) \vdash^1_{M_e} (q_{i+1}, u_{i+1}, \gamma_{i+1})$.
By Claim 3, $(q_i, u_i, \gamma_i' e) \vdash^1_{M_e} (q_{i+1}, u_{i+1}, \gamma_{i+1}' e)$. By Claim 1, $q_i \neq q_{empty}$ and $q_{i+1} \neq q_{empty}$ for $1 \le i \le m-1$.
Also, we pointed out at the beginning that $q_i \neq q_{start}$ for $1 \le i \le n$.
Therefore, this computation must have come from T2, where $\delta_e = \delta$.
By Proposition 2.49, $(q_i, u_i, \gamma_i' e) \vdash^1_{M_f} (q_{i+1}, u_{i+1}, \gamma_{i+1}' e)$.
Equivalently, $q_i \xrightarrow{a,\; b \to c}_{\delta} q_{i+1}$ where
$a \in \Sigma_\epsilon$, $b \in \Gamma_\epsilon$, $c \in \Gamma^*$, $\gamma_i' e = b\eta$, $\gamma_{i+1}' e = c\eta$, $\eta \in \Gamma_e^*$, $u_i = au_{i+1}$.
Note that $b \in \Gamma_\epsilon \Rightarrow b \neq e$.
If $\eta = \epsilon$, then $\gamma_i' e = b$, which forces $\gamma_i' = \epsilon$ and $b = e$, a contradiction.
Therefore, $\eta \neq \epsilon$.
The rightmost symbol of $\eta$ must be $e$ (because $\gamma_i' e = b\eta$).
Let $\eta = \theta e$ where $\theta \in \Gamma_e^*$.
Therefore, $\gamma_i' e = b\theta e$ and $\gamma_{i+1}' e = c\theta e$.
Therefore, $\gamma_i' = b\theta$ and $\gamma_{i+1}' = c\theta$.
Since $\gamma_i' \in \Gamma^*$ by Claim 3, $\theta \in \Gamma^*$.
Since $c \in \Gamma^*$ and $\theta \in \Gamma^*$, $\gamma_{i+1}' \in \Gamma^*$.
Therefore, $(q_i, u_i, \gamma_i') \vdash^1_{M_f} (q_{i+1}, u_{i+1}, \gamma_{i+1}')$.
This completes the proof of Claim 4.
By Claim 4, we now have
$(q_1, u_1, \gamma_1') \vdash^1_{M_f} (q_2, u_2, \gamma_2') \vdash^1_{M_f} \cdots \vdash^1_{M_f} (q_{m-1}, u_{m-1}, \gamma_{m-1}') \vdash^1_{M_f} (q_m, u_m, \gamma_m')$.
As shown at the beginning, $(q_1, u_1, \gamma_1) = (q_0, w, fe)$.
Therefore, $\gamma_1 = fe$.
By Claim 3, $\gamma_1 = \gamma_1' e$.
Therefore, $\gamma_1' = f$.
Therefore, $(q_1, u_1, \gamma_1') = (q_0, w, f)$.
By the definition of $m$, $q_m \in F$.
By Claim 2, $u_m = \epsilon$.
Therefore, $(q_m, u_m, \gamma_m') = (q_m, \epsilon, \gamma_m')$.
Therefore, $(q_0, w, f) \vdash^*_{M_f} (q_m, \epsilon, \gamma_m')$ where $q_m \in F$.
Therefore, $M_f$ accepts $w$ and $w \in L(M_f)$.
Therefore, $L(M_e) \subseteq L(M_f)$.
This completes the proof of Lemma 2.53.
Combining Lemma 2.52 and Lemma 2.53, we have the following theorem.
Theorem 2.54. 
For any $PDA$, $M_e$, that accepts by empty stack, there is a $PDA$, $M_f$, that accepts by final state such that $L(M_e) = L(M_f)$.
Conversely, for any $PDA$, $M_f$, that accepts by final state, there is a $PDA$, $M_e$, that accepts by empty stack such that $L(M_e) = L(M_f)$.

2.4. Equivalence of CFG and PDA

In this section, we shall prove that context-free grammars and pushdown automata are equivalent in power, in that any context-free language is recognized by a pushdown automaton, and vice versa.
Definition 2.54. 
Let $G = (V, \Sigma, R, S)$ be a $CFG$.
Let $A \in V$ and $y \in (V \cup \Sigma)^*$.
$A$ is called the leftmost variable in $y$ iff $\exists x \in \Sigma^*$ and $\alpha \in (V \cup \Sigma)^*$ such that $y = xA\alpha$.
$x$ is called the head of $y$ (written $x = Head(y)$), $A\alpha$ is called the body of $y$ (written $A\alpha = Body(y)$), and $\alpha$ is called the tail of $y$ (written $\alpha = Tail(y)$).
It is clear from this definition that $y = Head(y)\,A\,Tail(y) = Head(y)\,Body(y)$, and if $y \in \Sigma^*$, then $Head(y) = y$ and $Body(y) = Tail(y) = \epsilon$.
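In code, the decomposition of Definition 2.54 is a single left-to-right scan. Here is a small Python sketch of our own; representing symbols as single characters and passing the variable set explicitly are illustrative assumptions, not conventions from the text:

```python
def decompose(y, variables):
    """Return (Head(y), A, Tail(y)) for a sentential form y, where A is the
    leftmost variable, per Definition 2.54.  If y has no variable (y in
    Sigma*), return (y, '', ''): Head(y) = y and Body(y) = Tail(y) = epsilon."""
    for i, s in enumerate(y):
        if s in variables:
            return y[:i], s, y[i + 1:]   # Head, leftmost variable, Tail
    return y, '', ''

head, A, tail = decompose('abSbA', {'S', 'A'})
# Body(y) = A + Tail(y), and y = Head(y) + Body(y) always holds.
```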
Definition 2.55. Let $G = (V, \Sigma, R, S)$ be a $CFG$.
For $x, y \in (V \cup \Sigma)^*$, $x$ is a prefix of $y$ (written $x \mathrel{PRE} y$) iff $\exists z \in (V \cup \Sigma)^*$ such that $xz = y$.
Proposition 2.56. $PRE$ is a reflexive and transitive relation on $(V \cup \Sigma)^*$.
Proof. 
$\forall x \in (V \cup \Sigma)^*$, $\epsilon \in (V \cup \Sigma)^*$ and $x\epsilon = x$.
Therefore, $x \mathrel{PRE} x$ and $PRE$ is reflexive.
$\forall x, y, z \in (V \cup \Sigma)^*$, if $x \mathrel{PRE} y$ and $y \mathrel{PRE} z$, then
$\exists x', y' \in (V \cup \Sigma)^*$ such that $xx' = y$ and $yy' = z$.
Therefore, $xx'y' = z$.
Since $x'y' \in (V \cup \Sigma)^*$, $x \mathrel{PRE} z$.
Therefore, $PRE$ is transitive.
This completes the proof of Proposition 2.56.
Proposition 2.57. 
Let $G = (V, \Sigma, R, S)$ be a $CFG$.
Let $w \in \Sigma^*$ and $\gamma_i \in (V \cup \Sigma)^*$ for all $i \in \{1, 2, \ldots, n\}$.
For $i \in \{1, 2, \ldots, n-1\}$, let $A_i$ be the leftmost variable in $\gamma_i$ and let $A_i \to \beta_i$ be the rule applied in $\gamma_i \Rightarrow_{lm} \gamma_{i+1}$.
Let $S = \gamma_1 \Rightarrow_{lm} \gamma_2 \Rightarrow_{lm} \gamma_3 \Rightarrow_{lm} \cdots \Rightarrow_{lm} \gamma_n = w$.
The following statements are true:
(a) $Head(\gamma_i) \mathrel{PRE} Head(\gamma_{i+1})$;
(b) $\forall\, 1 \le i < j \le n$, $Head(\gamma_i) \mathrel{PRE} Head(\gamma_j)$, and hence
$Head(\gamma_1) \mathrel{PRE} Head(\gamma_2) \mathrel{PRE} \cdots \mathrel{PRE} Head(\gamma_n) = w$;
(c) $Head(\gamma_{i+1}) = Head(\gamma_i)Head(\beta_i)$ and $Body(\gamma_{i+1}) = Body(\beta_i)Tail(\gamma_i)$;
(d) if $y_i \in \Sigma^*$ is such that $Head(\gamma_i)y_i = w$, then $Head(\beta_i) \mathrel{PRE} y_i$.
Proof. 
Since $\gamma_i \Rightarrow_{lm} \gamma_{i+1}$ via $A_i \to \beta_i$ and $\gamma_i = Head(\gamma_i)\,A_i\,Tail(\gamma_i)$, we have $\gamma_{i+1} = Head(\gamma_i)\,\beta_i\,Tail(\gamma_i)$.
Therefore, $\gamma_{i+1} = Head(\gamma_i)\,Head(\beta_i)\,B_i\,Tail(\beta_i)\,Tail(\gamma_i)$, where $B_i$ is the leftmost variable in $\beta_i$.
Since $Head(\gamma_i) \in \Sigma^*$ and $Head(\beta_i) \in \Sigma^*$, $B_i$ is also the leftmost variable in $\gamma_{i+1}$.
Therefore, $Head(\gamma_{i+1}) = Head(\gamma_i)Head(\beta_i)$.
Therefore, $Head(\gamma_i) \mathrel{PRE} Head(\gamma_{i+1})$, which proves (a).
(b) then follows because $PRE$ is transitive.
For (c), $Head(\gamma_{i+1}) = Head(\gamma_i)Head(\beta_i)$ is established in the proof of (a).
It is also established in the proof of (a) that $\gamma_{i+1} = Head(\gamma_i)\,Head(\beta_i)\,B_i\,Tail(\beta_i)\,Tail(\gamma_i)$.
Therefore, $Body(\gamma_{i+1}) = B_i\,Tail(\beta_i)\,Tail(\gamma_i) = Body(\beta_i)\,Tail(\gamma_i)$ (since $B_i$ is the leftmost variable in $\beta_i$).
For (d), from (b), $Head(\gamma_{i+1}) \mathrel{PRE} Head(\gamma_n) = w$.
Therefore, $\exists y_{i+1} \in \Sigma^*$ such that $Head(\gamma_{i+1})y_{i+1} = w$.
Therefore, $Head(\gamma_{i+1})y_{i+1} = Head(\gamma_i)y_i$.
By (c), $Head(\gamma_i)Head(\beta_i)y_{i+1} = Head(\gamma_i)y_i$.
Therefore, $Head(\beta_i)y_{i+1} = y_i$.
Therefore, $Head(\beta_i) \mathrel{PRE} y_i$.
This completes the proof of Proposition 2.57.
Lemma 2.58. For any $CFG$ $G$, $\exists$ a $PDA$ $M_e$ such that $L(G) = L(M_e)$.
Proof. 
Let $G = (V, \Sigma, R, S)$ be a $CFG$.
Construct $M_e = (\{q\}, \Sigma, V \cup \Sigma, \delta, q, S)$ where $\delta$ is the transition function defined as follows.
T1: $\delta(q, \epsilon, A) = \{(q, \beta) \mid A \to \beta \text{ is a rule in } R\}$ for every $A \in V$.
T2: $\delta(q, a, a) = \{(q, \epsilon)\}$ for every $a \in \Sigma$.
T3: For all other $(q, a, A) \in \{q\} \times \Sigma_\epsilon \times (V \cup \Sigma)_\epsilon$, $\delta(q, a, A) = \emptyset$.
Note that the start variable of $G$ is the start stack symbol of $M_e$.
It remains to show $L(G) = L(M_e)$.
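The two transition families T1 and T2 can be exercised directly. Below is a minimal Python sketch of our own (the dictionary encoding of rules and the pruning heuristic are assumptions, not part of the construction): the single-state PDA keeps a string over $V \cup \Sigma$ as its stack, T1 replaces a top variable by some rule body, T2 matches a top terminal against the next input symbol, and a bounded search tests membership.

```python
from collections import deque

def cfg_pda_accepts(rules, start, w, limit=50000):
    """Single-state PDA of the construction: stack alphabet V union Sigma,
    start stack symbol = start variable.  `rules` maps each variable to the
    list of its rule bodies (strings); every other symbol is a terminal."""
    seen, todo = set(), deque([(w, start)])
    while todo and len(seen) < limit:
        u, g = todo.popleft()              # (unread input, stack string)
        if (u, g) in seen:
            continue
        seen.add((u, g))
        if u == '' and g == '':
            return True                    # input consumed, stack empty
        if g == '':
            continue                       # stack emptied too early
        top, rest = g[0], g[1:]
        if top in rules:                   # T1: pop variable, push a body
            for body in rules[top]:
                new = body + rest
                # prune: never stack more terminals than input remains
                if sum(c not in rules for c in new) <= len(u):
                    todo.append((u, new))
        elif u and u[0] == top:            # T2: pop terminal matching input
            todo.append((u[1:], rest))
    return False

anbn = {'S': ['aSb', '']}                  # S -> aSb | epsilon
```

The pruning rule only discards branches that can no longer reach $(q, \epsilon, \epsilon)$, so it does not affect the accepted language; it merely keeps the nondeterministic search finite for this grammar.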
To prove $L(G) \subseteq L(M_e)$, suppose $w \in L(G)$.
Then $\exists \gamma_1, \gamma_2, \ldots, \gamma_n \in (V \cup \Sigma)^*$ such that
$S = \gamma_1 \Rightarrow_{lm} \gamma_2 \Rightarrow_{lm} \gamma_3 \Rightarrow_{lm} \cdots \Rightarrow_{lm} \gamma_n = w$.
$\forall i \in \{1, 2, \ldots, n\}$, $Head(\gamma_i) \mathrel{PRE} w$ by Proposition 2.57(b).
Therefore, $\exists y_i$ such that $Head(\gamma_i)y_i = w$.
Claim. 
$\forall i \in \{1, 2, \ldots, n\}$, $(q, w, S) \vdash^*_{M_e} (q, y_i, Body(\gamma_i))$ where $Head(\gamma_i)y_i = w$.
This Claim can be proved by induction on $i$.
For $i = 1$, $S = \gamma_1$ because $S = \gamma_1 \Rightarrow_{lm} \gamma_2 \Rightarrow_{lm} \cdots \Rightarrow_{lm} \gamma_n = w$.
$Head(\gamma_1) = Head(S) = \epsilon$ and $Body(\gamma_1) = Body(S) = S$.
$(q, w, S) \vdash^0_{M_e} (q, w, S) = (q_1, y_1, \alpha_1)$.
Therefore, $q = q_1$, $y_1 = w$, $\alpha_1 = S$.
Therefore, $Head(\gamma_1)y_1 = \epsilon w = w$ and $\alpha_1 = Body(\gamma_1)$.
Therefore, $(q, w, S) \vdash^*_{M_e} (q, y_1, Body(\gamma_1))$.
The statement is true for $i = 1$.
For the induction hypothesis, we have
$(q, w, S) \vdash^*_{M_e} (q, y_k, Body(\gamma_k))$ where $Head(\gamma_k)y_k = w$, for some $1 \le k \le n-1$.
Let $A_k$ be the leftmost variable in $\gamma_k$.
Since $\gamma_k \Rightarrow_{lm} \gamma_{k+1}$, there is a rule $A_k \to \beta_k$ with $\beta_k \in (V \cup \Sigma)^*$.
$(q, w, S) \vdash^*_{M_e} (q, y_k, Body(\gamma_k)) = (q, y_k, A_k\,Tail(\gamma_k))$
$\vdash^1_{M_e} (q, y_k, \beta_k\,Tail(\gamma_k))$ (by T1)
$= (q, y_k, Head(\beta_k)\,Body(\beta_k)\,Tail(\gamma_k))$. By Proposition 2.57(d), $Head(\beta_k) \mathrel{PRE} y_k$.
Therefore, $\exists y_{k+1} \in \Sigma^*$ such that $Head(\beta_k)y_{k+1} = y_k$.
Therefore, $(q, y_k, Head(\beta_k)\,Body(\beta_k)\,Tail(\gamma_k)) = (q, Head(\beta_k)y_{k+1}, Head(\beta_k)\,Body(\beta_k)\,Tail(\gamma_k))$
$\vdash^{|Head(\beta_k)|}_{M_e} (q, y_{k+1}, Body(\beta_k)\,Tail(\gamma_k))$ (by $|Head(\beta_k)|$ applications of T2)
$= (q, y_{k+1}, Body(\gamma_{k+1}))$ (by Proposition 2.57(c)).
Therefore, $(q, w, S) \vdash^*_{M_e} (q, y_{k+1}, Body(\gamma_{k+1}))$.
Since $Head(\beta_k)y_{k+1} = y_k$,
$Head(\gamma_k)Head(\beta_k)y_{k+1} = Head(\gamma_k)y_k$.
By Proposition 2.57(c), $Head(\gamma_{k+1}) = Head(\gamma_k)Head(\beta_k)$.
By the induction hypothesis, $Head(\gamma_k)y_k = w$.
Therefore, $Head(\gamma_{k+1})y_{k+1} = w$.
Therefore, the statement is true for $i = k+1$.
To complete the proof of $L(G) \subseteq L(M_e)$, set $i = n$ in the Claim:
$(q, w, S) \vdash^*_{M_e} (q, y_n, Body(\gamma_n))$ where $Head(\gamma_n)y_n = w$.
Since $\gamma_n = w \in \Sigma^*$, $Head(\gamma_n) = Head(w) = w$ and $Body(\gamma_n) = Body(w) = \epsilon$.
From $Head(\gamma_n) = w$ and $Head(\gamma_n)y_n = w$, we get $wy_n = w$, so $y_n = \epsilon$.
Therefore, $(q, w, S) \vdash^*_{M_e} (q, \epsilon, \epsilon)$.
Therefore, $w \in L(M_e)$.
Therefore, $L(G) \subseteq L(M_e)$.
(Note that along the way the machine faces many nondeterministic choices; for instance, T1 may replace the top variable by the body of any of its rules, and a bad choice can send the computation looping without ever stopping. Since the machine is nondeterministic, it does not have to take a bad option: it suffices that some sequence of choices leads to the accepting configuration $(q, \epsilon, \epsilon)$.)
To prove $L(M_e) \subseteq L(G)$, let $w \in L(M_e)$.
Then $(q, w, S) \vdash^*_{M_e} (q, \epsilon, \epsilon)$.
Claim. 
$\forall x \in \Sigma^*$ and $A \in V$, if $(q, x, A) \vdash^*_{M_e} (q, \epsilon, \epsilon)$, then $A \Rightarrow^* x$.
The proof of this Claim is by induction on the number of steps.
$\exists n \ge 1$ such that $(q, x, A) \vdash^n_{M_e} (q, \epsilon, \epsilon)$.
For $n = 1$, $(q, x, A) \vdash^1_{M_e} (q, \epsilon, \epsilon)$.
Since $A \in V$, the move must use T1, that is, $\delta(q, \epsilon, A) = \{(q, \beta) \mid A \to \beta \text{ is a rule in } R\}$.
Therefore, $(q, x, A) \vdash^1_{M_e} (q, x, \beta) = (q, \epsilon, \epsilon)$.
Therefore, $x = \beta = \epsilon$.
Therefore, $A \to \epsilon$ is a rule.
$A \Rightarrow \epsilon$ by Proposition 2.8(i).
Therefore, $A \Rightarrow^* x$.
The statement is true for $n = 1$.
For the induction hypothesis, assume the statement is true for all $n \le k$ with $k \ge 1$.
That is, if $(q, x, A) \vdash^n_{M_e} (q, \epsilon, \epsilon)$ with $n \le k$, then $A \Rightarrow^* x$.
For $n = k+1$, assume $(q, x, A) \vdash^{k+1}_{M_e} (q, \epsilon, \epsilon)$.
Since $A \in V$, the first move must be based on T1.
Therefore, $(q, x, A) \vdash^1_{M_e} (q, x, Y_1Y_2\cdots Y_m) \vdash^k_{M_e} (q, \epsilon, \epsilon)$ where
$A \to Y_1Y_2\cdots Y_m$ and $Y_i \in V \cup \Sigma$ for $i \in \{1, 2, \ldots, m\}$.
Since $(q, x, Y_1Y_2\cdots Y_m) \vdash^k_{M_e} (q, \epsilon, \epsilon)$, the machine must pop all the $Y_i$'s off the stack by the time it finishes reading the input $x$ and empties the stack.
Let $x_i$ be the portion of $x$ that the machine consumes while popping $Y_i$ off the stack and returning its stack head to the position right before popping $Y_{i+1}$, for $i = 1, 2, \ldots, m-1$.
Let $x_m$ be the last portion of $x$, consumed while popping $Y_m$ off the stack and eventually emptying it.
Note that if $Y_i$ is a terminal, then $x_i = Y_i$: the $PDA$ pops $Y_i$ using T2 while scanning the same symbol $x_i$ from the input, after which the stack head points at $Y_{i+1}$.
By these assumptions, we have $x = x_1x_2\cdots x_m$.
In addition, we have the following sequence of computations:
$(q, x_1x_2\cdots x_m, Y_1Y_2\cdots Y_m) \vdash^*_{M_e} (q, x_2x_3\cdots x_m, Y_2Y_3\cdots Y_m) \vdash^*_{M_e} \cdots \vdash^*_{M_e} (q, x_ix_{i+1}\cdots x_m, Y_iY_{i+1}\cdots Y_m) \vdash^*_{M_e} (q, x_{i+1}\cdots x_m, Y_{i+1}\cdots Y_m) \vdash^*_{M_e} \cdots \vdash^*_{M_e} (q, x_m, Y_m) \vdash^*_{M_e} (q, \epsilon, \epsilon)$.
Since the stack head does not go below $Y_{i+1}$ while the $PDA$ consumes $x_i$, we have the following equivalent computations:
$(q, x_1, Y_1) \vdash^*_{M_e} (q, \epsilon, \epsilon)$
$(q, x_2, Y_2) \vdash^*_{M_e} (q, \epsilon, \epsilon)$
$\vdots$
$(q, x_m, Y_m) \vdash^*_{M_e} (q, \epsilon, \epsilon)$
Since the numbers of steps of these computations sum to $k$, each computation takes at most $k$ steps.
Therefore, we can use the induction hypothesis to derive
$Y_1 \Rightarrow^* x_1$, $Y_2 \Rightarrow^* x_2$, $\ldots$, $Y_m \Rightarrow^* x_m$.
(When $Y_i$ is a terminal, $Y_i = x_i$, so $Y_i \Rightarrow^* x_i$ holds in zero steps.)
Since $A \to Y_1Y_2\cdots Y_m$, $A \Rightarrow Y_1Y_2\cdots Y_m$ by Proposition 2.8(i).
Since $Y_i \Rightarrow^* x_i$ for all $i \in \{1, 2, \ldots, m\}$, $Y_1Y_2\cdots Y_m \Rightarrow^* x_1x_2\cdots x_m$ by Proposition 2.16(d).
Therefore, $A \Rightarrow^* x_1x_2\cdots x_m = x$.
The statement is true for $n = k+1$.
To complete the proof of $L(M_e) \subseteq L(G)$, put $A = S$ and $x = w$ in the Claim:
$(q, w, S) \vdash^*_{M_e} (q, \epsilon, \epsilon) \Rightarrow S \Rightarrow^* w$.
Therefore, $w \in L(M_e) \Rightarrow (q, w, S) \vdash^*_{M_e} (q, \epsilon, \epsilon) \Rightarrow S \Rightarrow^* w \Rightarrow w \in L(G)$.
This completes the proof of $L(M_e) \subseteq L(G)$ and hence the proof of Lemma 2.58.
Lemma 2.59. 
For any $PDA$ $M_e$, $\exists$ a $CFG$ $G$ such that $L(G) = L(M_e)$.
Proof.
Let $M_e = (Q, \Sigma, \Gamma, \delta, q_0, e)$ be a $PDA$ that accepts by empty stack.
Construct a $CFG$ $G = (V, \Sigma, R, S)$ where $V$ and $R$ are defined as follows.
$V = \{S\} \cup \{[pXq] \mid p, q \in Q,\; X \in \Gamma\}$. Note that $V$ is finite because $Q$ and $\Gamma$ are finite.
Let (P) be the procedure for creating rules in $R$, defined as follows.
$\forall (q, a, X) \in Q \times \Sigma_\epsilon \times \Gamma$ with $\delta(q, a, X) \neq \emptyset$, let $(r_0, Y_1Y_2\cdots Y_m) \in \delta(q, a, X)$.
That is, $q \xrightarrow{a,\; X \to Y_1Y_2\cdots Y_m}_{\delta} r_0$.
For every $r_1, r_2, \ldots, r_m \in Q$, let
$[qXr_m] \to a[r_0Y_1r_1][r_1Y_2r_2]\cdots[r_{m-1}Y_mr_m]$ be a rule in $R$.
Note that the total number of rules thus created from each
$(r_0, Y_1Y_2\cdots Y_m) \in \delta(q, a, X)$ is finite because $Q$, $m$, and $\Sigma_\epsilon$ are finite.
Furthermore, each set $\delta(q, a, X)$ is finite, and the total number of such sets $\delta(q, a, X)$ is finite because the total number of $(q, a, X) \in Q \times \Sigma_\epsilon \times \Gamma$ is finite.
Therefore, the total number of rules thus created for any given $PDA$ $M_e$ is finite.
Let $R_1$ be the set of rules created by (P) and let
$R_2 = \{S \to [q_0ep] \mid p \in Q\}$.
$R = R_1 \cup R_2$.
The construction of $G$ is complete, and we now proceed to prove $L(G) = L(M_e)$.
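Procedure (P) and $R_2$ are mechanical enough to generate by program. The following Python sketch is our own illustration; in particular, encoding the variable $[qXp]$ as the tuple `('T', q, X, p)` is an assumption, not the text's notation:

```python
from itertools import product

def triple_rules(Q, delta, q0, e):
    """Generate the rules of G.  `delta` maps (q, a, X) -> set of (r0, Ys),
    with a = '' for epsilon and Ys the pushed string Y1...Ym.
    A variable [qXp] is encoded as the tuple ('T', q, X, p)."""
    R = [('S', [('T', q0, e, p)]) for p in Q]          # R2: S -> [q0 e p]
    for (q, a, X), moves in delta.items():             # procedure (P)
        for (r0, Ys) in moves:
            m = len(Ys)
            for rs in product(sorted(Q), repeat=m):    # every choice of r1..rm
                chain = (r0,) + rs
                body = ([a] if a else []) + [
                    ('T', chain[i], Ys[i], chain[i + 1]) for i in range(m)]
                R.append((('T', q, X, chain[m]), body))
    return R

# Hypothetical two-state PDA fragment:
Q = {'p', 'q'}
delta = {
    ('p', 'a', 'Z'): {('p', 'AZ')},   # push move: m = 2, yields |Q|^2 = 4 rules
    ('p', 'b', 'A'): {('q', '')},     # pop move:  m = 0, yields 1 rule
}
R = triple_rules(Q, delta, 'p', 'Z')  # 2 (from R2) + 4 + 1 = 7 rules
```

The counts in the comments match the finiteness argument above: each element of $\delta(q, a, X)$ contributes $|Q|^m$ rules.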
Claim 1. $S \Rightarrow^* w$ iff $[q_0ep] \Rightarrow^* w$ for some $p \in Q$.
<Proof of Claim 1>
Suppose $S \Rightarrow^* w$.
$\exists n \ge 1$ such that $S \Rightarrow^n w$.
$S \Rightarrow^1 \beta \Rightarrow^{n-1} w$ where $\beta \in (V \cup \Sigma)^*$.
By Proposition 2.8(i), $S \to \beta$ is a rule.
This rule must be from $R_2$.
Therefore, $S \to [q_0ep]$ for some $p \in Q$.
Therefore, $S \Rightarrow^1 [q_0ep] \Rightarrow^{n-1} w$.
Therefore, $[q_0ep] \Rightarrow^{n-1} w$.
Therefore, $[q_0ep] \Rightarrow^* w$ for some $p \in Q$.
Conversely, suppose $[q_0ep] \Rightarrow^* w$ for some $p \in Q$. By construction, $S \to [q_0ep]$ is a rule in $R_2$.
By Proposition 2.8(i), $S \Rightarrow^1 [q_0ep]$.
Therefore, $S \Rightarrow^1 [q_0ep] \Rightarrow^* w$.
Therefore, $S \Rightarrow^* w$.
This completes the proof of Claim 1.
Claim 2. $\forall p, q \in Q$, $X \in \Gamma$, $w \in \Sigma^*$: $[qXp] \Rightarrow^* w$ iff $(q, w, X) \vdash^*_{M_e} (p, \epsilon, \epsilon)$.
<Proof of Claim 2>
"If"
Assume $(q, w, X) \vdash^*_{M_e} (p, \epsilon, \epsilon)$.
$\exists n \ge 1$ such that $(q, w, X) \vdash^n_{M_e} (p, \epsilon, \epsilon)$.
The proof of $[qXp] \Rightarrow^* w$ is by induction on $n$.
For $n = 1$, $(q, w, X) \vdash^1_{M_e} (p, \epsilon, \epsilon)$. Therefore, $q \xrightarrow{a,\; X \to \epsilon}_{\delta} p$ where $a \in \Sigma_\epsilon$ and $w = a\epsilon = a$.
By (P), if $q \xrightarrow{a,\; X \to Y_1Y_2\cdots Y_m}_{\delta} r_0$, then there is a rule $[qXr_m] \to a[r_0Y_1r_1][r_1Y_2r_2]\cdots[r_{m-1}Y_mr_m]$ for every choice of $r_1, r_2, \ldots, r_m \in Q$.
In this case, $Y_1Y_2\cdots Y_m = \epsilon$, which means $m = 0$ and hence $r_m = r_0$.
Therefore, there is a rule $[qXr_0] \to a$.
Since $p = r_0$ and $w = a$, the rule becomes $[qXp] \to w$.
By Proposition 2.8(i), $[qXp] \Rightarrow w$.
Therefore, $[qXp] \Rightarrow^* w$.
The statement is true for $n = 1$.
Assume the statement is true for all $n \le k$ where $k \ge 1$.
That is, $(q, w, X) \vdash^n_{M_e} (p, \epsilon, \epsilon) \Rightarrow [qXp] \Rightarrow^* w$ for all $n \le k$.
For $n = k+1$, assume $(q, w, X) \vdash^{k+1}_{M_e} (p, \epsilon, \epsilon)$.
Then $\exists Y_1, Y_2, \ldots, Y_m \in \Gamma$, $a \in \Sigma_\epsilon$, $x \in \Sigma^*$ with $w = ax$, and $r_0 \in Q$ such that $q \xrightarrow{a,\; X \to Y_1Y_2\cdots Y_m}_{\delta} r_0$.
Therefore, $(q, w, X) \vdash^1_{M_e} (r_0, x, Y_1Y_2\cdots Y_m) \vdash^k_{M_e} (p, \epsilon, \epsilon)$.
Since $(r_0, x, Y_1Y_2\cdots Y_m) \vdash^k_{M_e} (p, \epsilon, \epsilon)$, using the same argument as in the proof of Lemma 2.58, we can deduce the following computations:
$(r_0, x_1, Y_1) \vdash^*_{M_e} (r_1, \epsilon, \epsilon)$
$(r_1, x_2, Y_2) \vdash^*_{M_e} (r_2, \epsilon, \epsilon)$
$\vdots$
$(r_{i-1}, x_i, Y_i) \vdash^*_{M_e} (r_i, \epsilon, \epsilon)$
$\vdots$
$(r_{m-1}, x_m, Y_m) \vdash^*_{M_e} (r_m, \epsilon, \epsilon)$, where
$r_1, r_2, \ldots, r_{m-1} \in Q$, $r_m = p$,
$x_i$ is the portion of $x$ that the machine consumes while popping $Y_i$ off the stack and returning its stack head to the position right before popping $Y_{i+1}$, for $i = 1, 2, \ldots, m-1$, and $x_m$ is the last portion of $x$, consumed while popping $Y_m$ off the stack and eventually emptying it.
Note that the machine goes from state $r_{i-1}$ to state $r_i$ after completing the above actions, and $x = x_1x_2\cdots x_m$.
Since each computation $(r_{i-1}, x_i, Y_i) \vdash^*_{M_e} (r_i, \epsilon, \epsilon)$ is part of the computation $(r_0, x, Y_1Y_2\cdots Y_m) \vdash^k_{M_e} (p, \epsilon, \epsilon)$, each one makes no more than $k$ moves.
By the induction hypothesis, $[r_{i-1}Y_ir_i] \Rightarrow^* x_i$ for $i = 1, 2, \ldots, m$.
As shown above, $q \xrightarrow{a,\; X \to Y_1Y_2\cdots Y_m}_{\delta} r_0$, and since $r_1, r_2, \ldots, r_{m-1}, p \in Q$, by (P)
there is a rule $[qXp] \to a[r_0Y_1r_1][r_1Y_2r_2]\cdots[r_{m-1}Y_mp]$.
By Proposition 2.8(i), $[qXp] \Rightarrow a[r_0Y_1r_1][r_1Y_2r_2]\cdots[r_{m-1}Y_mp]$.
Since $a \Rightarrow^0 a$ and $[r_{i-1}Y_ir_i] \Rightarrow^* x_i$ for $i = 1, 2, \ldots, m$,
$a[r_0Y_1r_1][r_1Y_2r_2]\cdots[r_{m-1}Y_mp] \Rightarrow^* ax_1x_2\cdots x_m$ by Proposition 2.16(d).
Therefore, $[qXp] \Rightarrow^* ax_1x_2\cdots x_m$.
Since $w = ax = ax_1x_2\cdots x_m$, $[qXp] \Rightarrow^* w$.
Therefore, $(q, w, X) \vdash^{k+1}_{M_e} (p, \epsilon, \epsilon) \Rightarrow [qXp] \Rightarrow^* w$, and the statement is true for $n = k+1$.
This completes the proof of the "If" part of Claim 2.
"Only if"
Assume $[qXp] \Rightarrow^* w$.
$\exists n \ge 1$ such that $[qXp] \Rightarrow^n w$.
The proof of $(q, w, X) \vdash^*_{M_e} (p, \epsilon, \epsilon)$ is by induction on $n$.
For $n = 1$, $[qXp] \Rightarrow^1 w$.
By Proposition 2.8(i), $[qXp] \to w$ is a rule in $R_1$ (it is not in $R_2$ because $[qXp] \neq S$).
Every rule in $R_1$ is of the form $[qXr_m] \to a[r_0Y_1r_1][r_1Y_2r_2]\cdots[r_{m-1}Y_mr_m]$ where
$q \xrightarrow{a,\; X \to Y_1Y_2\cdots Y_m}_{\delta} r_0$ and $r_1, r_2, \ldots, r_{m-1}, r_m \in Q$.
In this particular case, $w$ contains no variable.
Therefore, $[r_0Y_1r_1][r_1Y_2r_2]\cdots[r_{m-1}Y_mr_m] = \epsilon$, i.e., $m = 0$.
Therefore, $a = w$ and $r_m = r_0$.
Since $[qXp] = [qXr_m]$, $p = r_m = r_0$.
We must have $q \xrightarrow{a,\; X \to \epsilon}_{\delta} p$.
Therefore, $(q, w, X) \vdash^1_{M_e} (p, x, \epsilon)$ where $w = ax$.
As shown above, $a = w$.
Therefore, $x = \epsilon$.
Therefore, $(q, w, X) \vdash^1_{M_e} (p, \epsilon, \epsilon)$.
Therefore, $(q, w, X) \vdash^*_{M_e} (p, \epsilon, \epsilon)$.
The statement is true for $n = 1$.
For the induction hypothesis, assume it is true that
$[qXp] \Rightarrow^n w \Rightarrow (q, w, X) \vdash^*_{M_e} (p, \epsilon, \epsilon)$ for all $n \le k$ where $k \ge 1$.
For $n = k+1$, assume $[qXp] \Rightarrow^{k+1} w$.
Therefore, $[qXp] \Rightarrow^1 \beta \Rightarrow^k w$ where $\beta \in (V \cup \Sigma)^*$.
By Proposition 2.8(i), $[qXp] \to \beta$ is a rule in $R_1$.
This rule must be of the form $[qXp] \to a[r_0Y_1r_1][r_1Y_2r_2]\cdots[r_{m-1}Y_mp]$ where
$r_0, r_1, r_2, \ldots, r_{m-1} \in Q$, $a \in \Sigma_\epsilon$, $Y_1, Y_2, \ldots, Y_m \in \Gamma$, and $q \xrightarrow{a,\; X \to Y_1Y_2\cdots Y_m}_{\delta} r_0$.
$[qXp] \Rightarrow^1 a[r_0Y_1r_1][r_1Y_2r_2]\cdots[r_{m-1}Y_mp]$ by Proposition 2.8(i).
Therefore, $[qXp] \Rightarrow^1 a[r_0Y_1r_1][r_1Y_2r_2]\cdots[r_{m-1}Y_mp] \Rightarrow^k w$.
By Proposition 2.28(ii), $\exists w_1, w_2, \ldots, w_m \in \Sigma^*$ such that $[r_{i-1}Y_ir_i] \Rightarrow^* w_i$ in no more than $k$ steps for $i = 1, 2, \ldots, m$ (with $r_m = p$), and $w = aw_1w_2\cdots w_m$.
By the induction hypothesis, $(r_{i-1}, w_i, Y_i) \vdash^*_{M_e} (r_i, \epsilon, \epsilon)$ for $i = 1, 2, \ldots, m$.
By Proposition 2.50,
$(r_{i-1}, w_iw_{i+1}\cdots w_m, Y_iY_{i+1}\cdots Y_m) \vdash^*_{M_e} (r_i, w_{i+1}\cdots w_m, Y_{i+1}\cdots Y_m)$ for $i = 1, 2, \ldots, m$.
For $i = 1$: $(r_0, w_1w_2\cdots w_m, Y_1Y_2\cdots Y_m) \vdash^*_{M_e} (r_1, w_2\cdots w_m, Y_2\cdots Y_m)$.
For $i = 2$: $(r_1, w_2\cdots w_m, Y_2\cdots Y_m) \vdash^*_{M_e} (r_2, w_3\cdots w_m, Y_3\cdots Y_m)$.
$\vdots$
For $i = m$: $(r_{m-1}, w_m, Y_m) \vdash^*_{M_e} (r_m, \epsilon, \epsilon)$, where $r_m = p$.
Furthermore, as shown above, $q \xrightarrow{a,\; X \to Y_1Y_2\cdots Y_m}_{\delta} r_0$.
Therefore, $(q, aw_1w_2\cdots w_m, X) \vdash^1_{M_e} (r_0, w_1w_2\cdots w_m, Y_1Y_2\cdots Y_m)$.
Since $w = aw_1w_2\cdots w_m$, $(q, w, X) \vdash^1_{M_e} (r_0, w_1w_2\cdots w_m, Y_1Y_2\cdots Y_m)$.
Connecting all the computations, we have
$(q, w, X) \vdash^1_{M_e} (r_0, w_1w_2\cdots w_m, Y_1Y_2\cdots Y_m) \vdash^*_{M_e} (r_1, w_2\cdots w_m, Y_2\cdots Y_m)$
$\vdash^*_{M_e} (r_2, w_3\cdots w_m, Y_3\cdots Y_m) \vdash^*_{M_e} \cdots \vdash^*_{M_e} (r_{m-1}, w_m, Y_m) \vdash^*_{M_e} (p, \epsilon, \epsilon)$.
Therefore, $(q, w, X) \vdash^*_{M_e} (p, \epsilon, \epsilon)$.
Therefore, $[qXp] \Rightarrow^{k+1} w \Rightarrow (q, w, X) \vdash^*_{M_e} (p, \epsilon, \epsilon)$.
This completes the proof of Claim 2.
We now return to the proof of $L(G) = L(M_e)$.
$w \in L(G) \iff S \Rightarrow^* w$
$\iff [q_0ep] \Rightarrow^* w$ for some $p \in Q$ (Claim 1)
$\iff (q_0, w, e) \vdash^*_{M_e} (p, \epsilon, \epsilon)$ (Claim 2)
$\iff M_e$ accepts $w$
$\iff w \in L(M_e)$. Therefore, $L(G) = L(M_e)$.
This completes the proof of Lemma 2.59.
Combining Lemma 2.58 and Lemma 2.59, we have the following theorem.
Theorem 2.60. For any $CFG$ $G$, $\exists$ a $PDA$ $M_e$ such that $L(G) = L(M_e)$.
Conversely, for any $PDA$ $M_e$, $\exists$ a $CFG$ $G$ such that $L(G) = L(M_e)$.

2.5. The Pumping Lemma for Context-Free Languages

In this section, we shall develop a tool for showing that a language is not context-free, called "the Pumping Lemma for context-free languages." It is analogous to the pumping lemma studied in Chapter 1 for regular languages. The difference this time is that we pump two substrings rather than one, and the string we deal with is broken into five substrings, in contrast to three in the regular case.
Theorem 2.61. 
Let $G = (V, \Sigma, R, S)$ be a $CFG$ in Chomsky Normal Form.
Let $Pt(A, w, h)$ be a parse tree corresponding to this grammar in accordance with the meaning of Theorem 2.33, where $A \in V$ is the root, $w \in \Sigma^*$ is the yield, and $h$ is the height of the parse tree. Then $|w| \le 2^{h-1}$.
Proof. 
The proof of this theorem is by induction on $h$.
For $h = 1$, $Pt(A, w, 1)$ is a 1-level tree with $A$ at the zero level and $w$ at the first level.
The only forms of rules in Chomsky Normal Form are:
$A \to BC$ where $A \in V$ and $B, C \in V \setminus \{S\}$;
$A \to a$ where $a \in \Sigma$;
$S \to \epsilon$ where $S$ is the start variable.
Since $w \in \Sigma^*$, we have either $A \to a$ or $S \to \epsilon$.
Therefore, $w = a$ or $w = \epsilon$.
If $w = a$, then $|w| = 1 = 2^0 = 2^{h-1}$.
If $w = \epsilon$, then $|w| = 0 \le 2^0 = 2^{h-1}$.
In either case, the statement is true for $h = 1$.
For the induction hypothesis, assume the statement is true for all $h \le k$ where $k \ge 1$.
Consider a parse tree $Pt(A, w, k+1)$ that corresponds to $G$ according to the meaning of Theorem 2.33.
Since $k \ge 1$, the height of $Pt(A, w, k+1)$ is greater than or equal to $2$. Hence the children of $A$, which appear at the first level, cannot be a terminal $a$ or $\epsilon$.
They must be $B$ and $C$ with $B, C \in V \setminus \{S\}$.
Using a similar argument to the one used in proving Theorem 2.33, we can show the following:
(i) the combination of all branches of $B$ (respectively $C$) forms a subtree $Pt(B, w_1, h_1)$ (respectively $Pt(C, w_2, h_2)$);
(ii) $h_1 \le k$ and $h_2 \le k$;
(iii) $w = w_1w_2$.
By (ii) and the induction hypothesis, $|w_1| \le 2^{h_1-1}$ and $|w_2| \le 2^{h_2-1}$.
$|w| = |w_1| + |w_2| \le 2^{h_1-1} + 2^{h_2-1} \le 2^{k-1} + 2^{k-1} = 2 \cdot 2^{k-1} = 2^k = 2^{(k+1)-1}$.
This completes the induction proof of Theorem 2.61.
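The bound of Theorem 2.61 is easy to check mechanically on concrete CNF parse trees. A small Python sketch of our own (the nested-tuple tree encoding is an illustrative assumption): `('A', left, right)` encodes $A \to BC$, `('A', 'a')` encodes $A \to a$, and a 1-tuple encodes $S \to \epsilon$.

```python
def measure(t):
    """Return (|yield|, height) of a CNF parse tree given as nested tuples:
    ('A', left, right) for A -> BC, ('A', 'a') for A -> a, ('S',) for S -> eps."""
    if len(t) == 1:                     # S -> eps : yield eps, height 1
        return 0, 1
    if isinstance(t[1], str):           # A -> a : yield a, height 1
        return 1, 1
    n1, h1 = measure(t[1])              # A -> BC : combine both subtrees,
    n2, h2 = measure(t[2])              # exactly as in the induction step
    return n1 + n2, 1 + max(h1, h2)

# Left-leaning tree with yield 'aab':
t = ('S', ('A', 'a'), ('B', ('A', 'a'), ('C', 'b')))
n, h = measure(t)                       # n = 3, h = 3, and 3 <= 2**(3-1)
```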
Proposition 2.62. 
Let $Pt(A, z, h)$ be a parse tree for a $CFG$ $G = (V, \Sigma, R, S)$ and let $Pt(B, w, k)$ be the largest subtree of $Pt(A, z, h)$, where $w, z \in \Sigma^*$. Then $\exists x, y \in \Sigma^*$ such that $z = xwy$. Furthermore, the nodes on any path from $A$ to a symbol of $x$ (respectively $y$) cannot be nodes of $Pt(B, w, k)$.
Proof. By T13, every leaf of a subtree is also a leaf of the parent tree.
Therefore, $w$ is a substring of $z$.
Therefore, $z = xwy$ for some $x, y \in \Sigma^*$.
Let $(A, v_1, v_2, \ldots, v_n, l)$ be a path from $A$ to $l$, where $l$ is a symbol in $x$.
There exists a $v_i$ ($i \in \{1, 2, \ldots, n\}$) on this path such that $v_i$ and $B$ are at the same level.
If $v_i$ and $B$ are the same node, then $(v_i, \ldots, v_n, l)$ is a branch rooted at $B$, and by T11 it is a path inside $Pt(B, w, k)$.
This means that $l$ is a symbol in $w$, which contradicts the assumption that $l$ is a symbol in $x$.
Therefore, $v_i$ cannot be the same node as $B$.
Since $l$ is to the left of every symbol in $w$ and $v_i$ is an ancestor of $l$, by T12, $v_i$ is to the left of $B$.
Let $v_j$ be a node on the path $(A, v_1, v_2, \ldots, v_i, \ldots, v_n, l)$.
If $j < i$, then $v_j$ is above the level of $B$ and hence $v_j$ is not a node in $Pt(B, w, k)$.
If $j > i$, then $v_j$ is a descendant of $v_i$, and hence by T12, $v_j$ is to the left of all descendants of $B$ at the same level.
Therefore, $v_j$ cannot be a node in $Pt(B, w, k)$.
By a similar argument, we can also prove that if $v_j$ is a node on a path from $A$ to any $l'$ in $y$, then $v_j$ cannot be a node in $Pt(B, w, k)$.
This completes the proof of Proposition 2.62.
Theorem 2.63. Let $Pt(S, z, h)$ be a parse tree for a $CFG$ $G = (V, \Sigma, R, S)$ and let $Pt(A, w, k)$ be the largest subtree of $Pt(S, z, h)$ rooted at $A$, such that $z = xwy$ where $x, y, w, z \in \Sigma^*$.
If $Pt(A, w, k)$ is replaced by another parse tree $Pt(A, w', k')$ to form a new tree $Pt(S, z', h')$, then $z' = xw'y$.
Proof. 
By Proposition 2.62, we can write $z' = x'w'y'$ for some $x', y' \in \Sigma^*$, because $Pt(A, w', k')$ is a subtree of $Pt(S, z', h')$.
See Figure 2.13 below.
Figure 2.13. Caption.
Let l be a leaf in x.
By T8, there is a unique path from S to l in Pt(S, z, h).
Let's call this path (S, v_1, v_2, …, v_i, …, v_n, l), where v_i is at the same level as A.
By Proposition 2.62, (S, v_1, v_2, …, v_i, …, v_n, l) is not affected by the removal of Pt(A, w, k) and, in addition, v_i is to the left of A.
Therefore, (S, v_1, v_2, …, v_i, …, v_n, l) remains a path in the new Pt(S, z', h').
Therefore, l is a leaf in z'.
l is not in w' because w' consists of all the leaves created from the addition of Pt(A, w', k').
If l is in y', the ancestor of l at the level of A, namely v_i, must be to the right of A, which contradicts what we have shown above, namely that v_i is to the left of A.
Therefore, l cannot be in y'.
Therefore, l is in x'.
Therefore, x ⊆ x'.
Conversely, if l' is a leaf in x',
by T8, there is a unique path from S to l' in Pt(S, z', h').
Let's call this path (S, v_1', v_2', …, v_i', …, v_n', l'), where v_i' and A are at the same level.
By Proposition 2.62, v_1', v_2', …, v_i', …, v_n', l' are not in Pt(A, w', k') and v_i' is to the left of A.
These nodes must have come from Pt(S, z, h).
In addition, they are not in Pt(A, w, k) either, because if they were, they would have been eliminated by the replacement of Pt(A, w, k).
Therefore, l' is not in w.
If l' were in y, v_i' would be to the right of A, which contradicts what we have shown above, namely that v_i' is to the left of A.
Therefore, l' cannot be in y.
Therefore, l' is in x.
Therefore, x' ⊆ x.
Therefore, x = x'.
With a similar argument, we can also prove that y = y'.
This completes the proof of Theorem 2.63.
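Theorem 2.63 is easy to check mechanically on small trees. The sketch below is an illustration only (the tuple representation and the helper names `yield_of` and `replace_first` are ours, not from the text): it replaces the subtree rooted at A and confirms that only the middle portion w of the yield changes.

```python
# A parse-tree node is (label, children); a leaf has an empty child list
# and its label is the terminal symbol it carries.
def yield_of(node):
    label, children = node
    if not children:
        return label
    return "".join(yield_of(c) for c in children)

def replace_first(node, target, new_subtree):
    """Replace the outermost, leftmost subtree whose root is labeled
    `target` with `new_subtree`; return (new_tree, replaced?)."""
    label, children = node
    if label == target:
        return new_subtree, True
    out, replaced = [], False
    for c in children:
        if replaced:
            out.append(c)
        else:
            c2, replaced = replace_first(c, target, new_subtree)
            out.append(c2)
    return (label, out), replaced

# z = "abcd" with x = "a", w = "bc" (the subtree rooted at A), y = "d".
t = ("S", [("a", []), ("A", [("b", []), ("c", [])]), ("d", [])])
t2, _ = replace_first(t, "A", ("A", [("e", [])]))  # new subtree yields w' = "e"
assert yield_of(t) == "abcd" and yield_of(t2) == "aed"  # z' = x w' y
```

The flanks x = "a" and y = "d" are untouched by the replacement, exactly as the theorem asserts.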
Theorem 2.64A (The Pumping Lemma for CFLs). Let L be a CFL.
∃ p > 0 such that if z ∈ L and |z| ≥ p, then
z = uvwxy for some u, v, w, x, y ∈ Σ* with the following conditions satisfied:
(i) ∀ i ≥ 0, uv^i wx^i y ∈ L
(ii) vx ≠ ε
(iii) |vwx| ≤ p
Proof. 
By Theorem 2.37, there exists a CFG G = (V, Σ, R, S) in Chomsky Normal Form such that L = L(G). Let p = 2^m, where m = |V| = the number of variables in V.
If z ∈ L, then z ∈ L(G).
S ⇒* z. By Theorem 2.33, there is a parse tree for G with root S and yield z.
Let this parse tree be represented by Pt(S, z, h), where h is the height of the tree.
By Theorem 2.61, |z| ≤ 2^(h−1).
If |z| ≥ p = 2^m, then 2^m ≤ 2^(h−1).
m ≤ h − 1.
h ≥ m + 1.
By T9, ∃ a path from S to a, where a is a leaf in z, such that the length of this path is equal to h.
(Note that this is the longest path in the tree.)
Since h ≥ m + 1, there are at least m + 2 nodes on this path.
Let (V_1, V_2, …, V_m, V_{m+1}, a) be the lowest portion of this path, where
V_1, V_2, …, V_m, V_{m+1} ∈ V.
See Figure 2.14 below.
Figure 2.14.
Note that this is the longest path from V_1 to a leaf.
Since m = |V|, by the pigeonhole principle, ∃ i, j with 1 ≤ i < j ≤ m + 1 such that V_i = V_j.
Let Pt(V_j, w, h_j) be the largest subtree rooted at V_j and Pt(V_i, w', h_i) be the largest subtree rooted at V_i.
See Figure 2.15 below.
Figure 2.15.
As can be seen in Figure 2.15, Pt(V_j, w, h_j) is a subtree of Pt(V_i, w', h_i), which in turn is a subtree of the parent tree Pt(S, z, h).
By Proposition 2.62, we can write the yield of Pt(V_i, w', h_i) as vwx, where v, x ∈ Σ*, and the yield of Pt(S, z, h) as uvwxy, where u, y ∈ Σ*.
That is, z = uvwxy and w' = vwx.
Since V_i = V_j, we can replace Pt(V_j, w, h_j) by Pt(V_i, vwx, h_i) to form a new parse tree.
By Theorem 2.63, the yield of this new parse tree is uvvwxxy = uv^2wx^2y.
By repeated application of this replacement procedure, we can create new parse trees Pt(S, uv^i wx^i y, k_i) for i ≥ 2.
By Theorem 2.33, S ⇒* uv^i wx^i y for i ≥ 2.
If we replace Pt(V_i, vwx, h_i) by Pt(V_j, w, h_j), we obtain a new parse tree Pt(S, uwy, k_0).
Again by Theorem 2.33, S ⇒* uwy.
That is, S ⇒* uv^0 wx^0 y.
When i = 1, z = uvwxy and we know S ⇒* z.
Therefore, S ⇒* uv^i wx^i y for i ≥ 0.
Therefore, uv^i wx^i y ∈ L for i ≥ 0.
This proves Condition (i) is satisfied.
The only three forms of rules in a CFG in CNF are:
A → BC, where A ∈ V and B, C ∈ V \ {S};
A → a, where a ∈ Σ;
S → ε, where S is the start variable.
The children of V_i cannot be a terminal a or ε, because neither a terminal nor ε can have V_j as a descendant.
Let B, C ∈ V be the two children of V_i.
Let Pt(B, b, h_b) and Pt(C, c, h_c) be the largest subtrees with yields b, c ∈ Σ* and roots B, C ∈ V.
Using a similar argument to the one used in the proof of Theorem 2.33, we can show that bc = vwx.
By T7, V_j is a descendant of either B or C.
If V_j is a descendant of B, then Pt(V_j, w, h_j) is a subtree of Pt(B, b, h_b).
By Proposition 2.62, w is a substring of b and b = w_1 w w_2 for some w_1, w_2 ∈ Σ*.
Therefore, vwx = bc = w_1 w w_2 c.
v = w_1 and x = w_2 c.
Since C and all its descendants are not S, c cannot be ε.
x ≠ ε.
Therefore, vx ≠ ε.
If V_j is a descendant of C, a similar argument shows that v ≠ ε and hence vx ≠ ε.
In all cases, Condition (ii) is satisfied.
Since V_i is a descendant of V_1, Pt(V_i, vwx, h_i) is a subtree of Pt(V_1, z_1, h_1).
By Proposition 2.62, vwx is a substring of z_1.
|vwx| ≤ |z_1|.
By Theorem 2.61, |z_1| ≤ 2^(h_1 − 1).
Therefore, |vwx| ≤ 2^(h_1 − 1).
Since (V_1, V_2, …, V_m, V_{m+1}, a) is the longest path from V_1 to a leaf,
h_1 = the length of (V_1, V_2, …, V_m, V_{m+1}, a) = m + 1.
Therefore, |vwx| ≤ 2^((m+1)−1) = 2^m = p.
Therefore, Condition (iii) is satisfied.
This completes the proof of Theorem 2.64A.
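The conclusion of Theorem 2.64A can be illustrated on a concrete CFL. The sketch below is ours, not part of the proof: for the sample language {a^n b^n | n ≥ 0} it fixes a decomposition satisfying Conditions (ii) and (iii) and confirms that every pumped string uv^i wx^i y stays in the language.

```python
def pump(u, v, w, x, y, i):
    """Form u v^i w x^i y as in Theorem 2.64A."""
    return u + v * i + w + x * i + y

def in_anbn(s):
    """Membership in the sample CFL { a^n b^n | n >= 0 }."""
    n = len(s) // 2
    return s == "a" * n + "b" * n

# z = a^4 b^4 with the decomposition u = aaa, v = a, w = '', x = b, y = bbb:
# vx != '' (Condition (ii)) and |vwx| = 2 (Condition (iii)).
u, v, w, x, y = "aaa", "a", "", "b", "bbb"
assert all(in_anbn(pump(u, v, w, x, y, i)) for i in range(6))  # Condition (i)
```

Pumping v and x in lockstep keeps the a-count equal to the b-count, which is why this decomposition survives every i ≥ 0.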
Theorem 2.64B. Pumping Lemma (contrapositive form).
¬S ⇒ L is not context free, where S denotes the conclusion of Theorem 2.64A and
¬S is equivalent to:
∀ p ≥ 1, ∃ s ∈ L with |s| ≥ p such that whenever s = uvwxy, at least one of Conditions (i), (ii), or (iii) cannot be satisfied.
The contrapositive form of the Pumping Lemma is used to prove that a language is not context free. The general strategy is to find an s ∈ L with |s| ≥ p for any given p ≥ 1 such that whenever s is broken into s = uvwxy, at least one of Conditions (i), (ii), or (iii) must be false. This can usually be accomplished by showing one of the following:
(1) Condition (i) alone is false.
(2) Condition (iii) ⇒ ¬Condition (i).
(3) (Condition (ii) and Condition (iii)) ⇒ ¬Condition (i).
Example 2.65. 
Show that L = {a^n b^n c^n | n ≥ 0} is not a CFL.
∀ p ≥ 1, construct s = a^p b^p c^p.
s ∈ L and |s| ≥ p.
Assume s = uvwxy.
If Condition (iii) is true, |vwx| ≤ p.
There are 5 cases to consider:
(1) vwx = a^n where n ≤ p;
(2) vwx = a^n b^m where n ≤ p and m ≤ p;
(3) vwx = b^n where n ≤ p;
(4) vwx = b^n c^m where n ≤ p and m ≤ p;
(5) vwx = c^n where n ≤ p.
For case (1), vwx = a^n ⇒ v^2wx^2 = a^n'.
If Condition (ii) is true, vx ≠ ε.
Either v ≠ ε or x ≠ ε.
This means n' > n.
s = uvwxy = u a^n y contains the same number of a's, b's, and c's.
uv^2wx^2y = u a^n' y contains more a's than s and therefore has more a's than b's and c's in itself.
Therefore, uv^2wx^2y is not in L.
Therefore, (Condition (ii) and Condition (iii)) ⇒ ¬Condition (i).
Similar arguments can be made in cases (3) and (5) to arrive at the same conclusion as in case (1).
For case (2), s = uvwxy = u a^n b^m y contains the same number of a's, b's, and c's.
If Condition (ii) is true, vx ≠ ε.
Either v ≠ ε or x ≠ ε.
vwx = a^n b^m ⇒ v^2wx^2 will increase the number of a's, the number of b's, or both.
uv^2wx^2y will have more a's than c's or more b's than c's.
Either way, uv^2wx^2y is not in L.
Therefore, (Condition (ii) and Condition (iii)) ⇒ ¬Condition (i).
A similar argument can be made in case (4) to arrive at the same conclusion as in case (2).
Combining all 5 cases, we conclude (Condition (ii) and Condition (iii)) ⇒ ¬Condition (i).
By Theorem 2.64B, L is not context free.
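The case analysis above can be cross-checked by brute force for a small p. The sketch below is an illustration only (p = 3 assumed; the function names are ours): it enumerates every split s = uvwxy satisfying Conditions (ii) and (iii) and confirms that pumping with i = 0 or i = 2 always produces a string outside L.

```python
def in_L(s):
    """Membership in L = { a^n b^n c^n | n >= 0 }."""
    n = len(s) // 3
    return len(s) % 3 == 0 and s == "a" * n + "b" * n + "c" * n

def no_valid_decomposition(s, p):
    """True iff every split s = uvwxy obeying Conditions (ii) and (iii)
    has some pumped string u v^i w x^i y outside L (i in {0, 2} suffices here)."""
    n = len(s)
    for a in range(n + 1):
        for b in range(a, n + 1):
            for c in range(b, n + 1):
                for d in range(c, n + 1):
                    u, v, w, x, y = s[:a], s[a:b], s[b:c], s[c:d], s[d:]
                    if not v + x or len(v + w + x) > p:
                        continue  # Conditions (ii)/(iii) not met; skip this split
                    if all(in_L(u + v * i + w + x * i + y) for i in (0, 2)):
                        return False  # this split would survive pumping
    return True

p = 3
assert no_valid_decomposition("a" * p + "b" * p + "c" * p, p)
```

This is only a finite check for one p, not a proof, but it makes the five-case argument concrete.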
Example 2.66. 
Show that L = {w#w | w ∈ {0,1}*} is not a CFL.
∀ p ≥ 1, construct s = 0^p 1^p # 0^p 1^p.
s ∈ L and |s| = 4p + 1 > p.
Assume s = uvwxy.
If both Conditions (ii) and (iii) are true, we have the following cases to consider.
(1) vwx is to the left of #.
Condition (iii) gives |vwx| ≤ p, which makes it possible for vwx to be contained in the left block 0^p 1^p.
Since Condition (ii) is true, vx ≠ ε.
Pumping up to uv^2wx^2y will increase the number of symbols to the left of the # sign while not changing the symbols to the right.
This makes it impossible for uv^2wx^2y to remain in L.
Therefore, (Condition (ii) and Condition (iii)) ⇒ ¬Condition (i).
(2) vwx is to the right of #.
A similar argument leads to the same conclusion as in (1).
(3) vwx contains the # sign.
(i) w contains #.
Condition (iii) ⇒ |vwx| ≤ p ⇒ v contains only 1's if v ≠ ε, and x contains only 0's if x ≠ ε.
Condition (ii) ⇒ vx ≠ ε ⇒ at least one of v and x is not ε.
Pumping down gives uwy = 0^p 1^i # 0^j 1^p.
v ≠ ε ⇒ i < p ⇒ 0^p 1^i # 0^j 1^p ∉ L ⇒ uwy ∉ L.
x ≠ ε ⇒ j < p ⇒ 0^p 1^i # 0^j 1^p ∉ L ⇒ uwy ∉ L.
Therefore, (Condition (ii) and Condition (iii)) ⇒ ¬Condition (i).
(ii) w is to the left of #.
Since vwx contains the # sign and w is to the left of it, the # must lie in x; hence x ≠ ε.
Therefore, x contains #.
Pumping down will eliminate the # sign, making it impossible for uwy to remain in L.
Therefore, uwy is not in L and Condition (i) cannot be satisfied.
(iii) w is to the right of #.
A similar argument will lead to the same conclusion as in case 3(ii) above.
Combining all possible cases, (Condition (ii) and Condition (iii)) ⇒ ¬Condition (i).
By Theorem 2.64B, L is not context free.
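Case 3(i) above can be made concrete. The sketch below is an illustration only (p = 3 assumed; the names are ours): it takes a split in which w contains the # sign, with v to its left and x to its right, and confirms that both pumping down and pumping up leave the language.

```python
def in_L(s):
    """Membership in L = { w#w | w in {0,1}* }."""
    parts = s.split("#")
    return len(parts) == 2 and parts[0] == parts[1]

p = 3
s = "0" * p + "1" * p + "#" + "0" * p + "1" * p
assert in_L(s)

# A split matching case 3(i): v is a 1 left of #, w is '#', x is a 0 right of #.
u, v, w, x, y = "0" * p + "1" * (p - 1), "1", "#", "0", "0" * (p - 1) + "1" * p
assert u + v + w + x + y == s

# Pumping down (i = 0) yields 0^p 1^(p-1) # 0^(p-1) 1^p, which is not in L.
assert not in_L(u + w + y)
# Pumping up (i = 2) unbalances the two sides of # as well.
assert not in_L(u + v * 2 + w + x * 2 + y)
```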
Example 2.67. 
Show that the intersection of two CFLs may not be a CFL.
Let L_1 = {a^n b^n c^m | n, m ∈ N} and
L_2 = {a^n b^m c^m | n, m ∈ N}.
L_1 ∩ L_2 = {a^n b^n c^n | n ∈ N}, which is not context free as shown in Example 2.65.
L_1 can be generated by the following CFG rules:
S → TD
T → aTb | ε
D → Dc | ε
L_2 can be generated by the following CFG rules:
S → AB
A → Aa | ε
B → bBc | ε
Therefore, L_1 and L_2 are CFLs.
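The two grammars generate exactly the strings a^n b^n c^m and a^n b^m c^m; reading these closed forms off the rules gives a quick bounded-length sanity check of the claimed intersection. The sketch below is an illustration only (the bound K and the helper names are ours).

```python
# Closed forms read directly off the two grammars:
#   L1: S -> TD, T -> aTb | e, D -> Dc | e   generates a^n b^n c^m
#   L2: S -> AB, A -> Aa | e, B -> bBc | e   generates a^n b^m c^m
def L1(n, m): return "a" * n + "b" * n + "c" * m
def L2(n, m): return "a" * n + "b" * m + "c" * m

K = 5  # bound on n, m for the enumeration
set1 = {L1(n, m) for n in range(K) for m in range(K)}
set2 = {L2(n, m) for n in range(K) for m in range(K)}
# Up to the bound, the intersection is exactly { a^n b^n c^n }.
assert set1 & set2 == {"a" * n + "b" * n + "c" * n for n in range(K)}
```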
Example 2.68. 
Show that L = {ww | w ∈ {0,1}*} is not a CFL.
∀ p ≥ 1, construct s = 0^p 1^p 0^p 1^p.
s ∈ L and |s| = 4p > p.
Assume s = uvwxy for some u, v, w, x, y ∈ {0,1}*.
Claim 1. If i < p, the strings 0^i 1^p 0^p 1^p, 0^p 1^i 0^p 1^p, 0^p 1^p 0^i 1^p, and 0^p 1^p 0^p 1^i are not in L.
Assume for contradiction that 0^i 1^p 0^p 1^p ∈ L.
Then ∃ r ∈ {0,1}* such that 0^i 1^p 0^p 1^p = rr.
Therefore, |rr| = (i + p) + (p + p).
|r| = [(i + p) + (p + p)] / 2. Since |r| is the arithmetic mean of i + p and p + p, and i + p < p + p,
|r| > i + p and |r| < p + p.
Therefore, |r| ≥ i + p + 1.
The leftmost i + p + 1 symbols of 0^i 1^p 0^p 1^p form the substring 0^i 1^p 0.
The leftmost |r| symbols of 0^i 1^p 0^p 1^p form the first copy of r.
Since |r| ≥ i + p + 1, 0^i 1^p 0 is a prefix of r.
Therefore, 10 is a substring of 0^i 1^p 0 and hence of r.
Similarly, the rightmost 2p symbols of 0^i 1^p 0^p 1^p form the substring 0^p 1^p, and the rightmost |r| symbols form the second copy of r.
Since |r| < p + p, r is a suffix of 0^p 1^p.
Therefore, 10 is a substring of 0^p 1^p.
This is a contradiction because 10 cannot be a substring of 0^p 1^p.
Therefore, 0^i 1^p 0^p 1^p ∉ L.
Similar arguments can be made to show that 0^p 1^i 0^p 1^p, 0^p 1^p 0^i 1^p, and 0^p 1^p 0^p 1^i are not in L.
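Claim 1 can be sanity-checked numerically: for small p and every i < p, none of its four strings has the form rr. The brute-force test below is an illustration only (the helper name `is_square` is ours).

```python
def is_square(s):
    """True iff s = rr for some string r."""
    n = len(s)
    return n % 2 == 0 and s[:n // 2] == s[n // 2:]

for p in range(1, 7):
    for i in range(p):  # i < p, as in Claim 1
        for t in ("0" * i + "1" * p + "0" * p + "1" * p,
                  "0" * p + "1" * i + "0" * p + "1" * p,
                  "0" * p + "1" * p + "0" * i + "1" * p,
                  "0" * p + "1" * p + "0" * p + "1" * i):
            assert not is_square(t)  # none of Claim 1's strings lies in L = {ww}
```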
Claim 2.
If at least one of i, j is less than p, the strings 0^i 1^j 0^p 1^p, 0^p 1^i 0^j 1^p, and 0^p 1^p 0^i 1^j are not in L.
Assume for contradiction that 0^i 1^j 0^p 1^p ∈ L.
Then 0^i 1^j 0^p 1^p = rr for some r ∈ {0,1}*.
Therefore, |rr| = (i + j) + (p + p).
|r| = [(i + j) + (p + p)] / 2. Since |r| is the arithmetic mean of i + j and p + p, and i + j < p + p,
|r| > i + j and |r| < p + p.
Therefore, |r| ≥ i + j + 1.
(If j = 0, the first copy of r consists of 0's only while the second copy ends in 1 because p ≥ 1; this is already a contradiction, so we may assume j ≥ 1.)
The leftmost i + j + 1 symbols of 0^i 1^j 0^p 1^p form the substring 0^i 1^j 0.
The leftmost |r| symbols of 0^i 1^j 0^p 1^p form the first copy of r.
Since |r| ≥ i + j + 1, 0^i 1^j 0 is a prefix of r.
Therefore, 10 is a substring of 0^i 1^j 0 and hence of r.
Similarly, the rightmost 2p symbols of 0^i 1^j 0^p 1^p form the substring 0^p 1^p, and the rightmost |r| symbols form the second copy of r.
Since |r| < p + p, r is a suffix of 0^p 1^p.
Since 10 is a substring of r and r is a substring of 0^p 1^p, 10 is a substring of 0^p 1^p.
This is a contradiction because 10 cannot be a substring of 0^p 1^p.
Therefore, 0^i 1^j 0^p 1^p ∉ L.
Similar arguments can be made to show that 0^p 1^i 0^j 1^p and 0^p 1^p 0^i 1^j are not in L.
Returning to the proof that L is not a CFL, we assume both Condition (ii) and Condition (iii) are true.
Since |vwx| ≤ p, we have 7 cases to consider.
(1) vwx is a substring of the first block of 0^p.
(2) vwx is a substring of the first block of 1^p.
(3) vwx is a substring of the second block of 0^p.
(4) vwx is a substring of the second block of 1^p.
(5) vwx straddles the first block of 0^p and the first block of 1^p.
(6) vwx straddles the first block of 1^p and the second block of 0^p.
(7) vwx straddles the second block of 0^p and the second block of 1^p.
In case (1), v consists of all 0's if v ≠ ε and x consists of all 0's if x ≠ ε.
Pumping down would only affect the first block of 0^p and not the other 3 blocks.
Therefore, uwy = 0^i 1^p 0^p 1^p.
Since vx ≠ ε by Condition (ii), at least one of v and x is not ε.
Pumping down would reduce the number of 0's in the first block of 0^p.
Therefore, i < p.
By Claim 1, uv^0wx^0y = uwy = 0^i 1^p 0^p 1^p ∉ L.
Therefore, Condition (i) is not satisfied.
For cases (2), (3) and (4), similar arguments can be made to lead to the same conclusion as in (1).
For case (5), since |vwx| ≤ p, pumping down can only affect the first and second blocks of symbols.
We can write uwy = 0^i 1^j 0^p 1^p.
Furthermore, the first symbol of vwx is 0 and the last symbol of vwx is 1.
If v ≠ ε, the first symbol of v is 0.
If x ≠ ε, the last symbol of x is 1.
Since Condition (ii) is true, vx ≠ ε.
At least one of v and x is not ε.
Pumping down will either reduce the number of 0's in the first block of 0^p or the number of 1's in the first block of 1^p.
Therefore, either i < p or j < p.
By Claim 2, uwy = 0^i 1^j 0^p 1^p is not in L.
Therefore, Condition (i) is not satisfied.
For cases (6) and (7), similar arguments can be made to lead to the same conclusion as in (5).
Combining all 7 cases, we conclude that
(Condition (ii) and Condition (iii)) ⇒ ¬Condition (i).
Hence, by Theorem 2.64B, L is not context free.
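As in Example 2.65, the whole case analysis can be cross-checked by brute force for a small p. The sketch below is an illustration only (p = 2, so s = 00110011; the names are ours): it enumerates every split obeying Conditions (ii) and (iii) and confirms that pumping with i = 0 or i = 2 always leaves L.

```python
def in_L(s):
    """Membership in L = { ww | w in {0,1}* }."""
    n = len(s)
    return n % 2 == 0 and s[:n // 2] == s[n // 2:]

def no_valid_decomposition(s, p):
    """True iff every split s = uvwxy obeying Conditions (ii) and (iii)
    has some pumped string u v^i w x^i y outside L (i in {0, 2} suffices here)."""
    n = len(s)
    for a in range(n + 1):
        for b in range(a, n + 1):
            for c in range(b, n + 1):
                for d in range(c, n + 1):
                    u, v, w, x, y = s[:a], s[a:b], s[b:c], s[c:d], s[d:]
                    if not v + x or len(v + w + x) > p:
                        continue  # Conditions (ii)/(iii) not met; skip this split
                    if all(in_L(u + v * i + w + x * i + y) for i in (0, 2)):
                        return False  # this split would survive pumping
    return True

p = 2
assert no_valid_decomposition("0" * p + "1" * p + "0" * p + "1" * p, p)
```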
