Preprint
Article

This version is not peer-reviewed.

A Mathematical Approach to Context-Free Languages

Submitted: 30 May 2025

Posted: 03 June 2025


Abstract
Students become confused and lose interest in theoretical computer science when the subject is poorly taught, and it is often poorly taught because instructors lack a well-organized theory with which to explain the concepts and are reluctant to invest the time needed to write better lecture notes. This paper presents a rigorous mathematical approach to the theory of context-free languages that does not currently exist in the literature of theoretical computer science. Basic definitions are developed in mathematical terms and used as the foundation for constructing proofs of theorems. The paper provides a model for instructors writing lecture notes and for authors writing textbooks for educational purposes. It also corrects some critical errors and erroneous arguments found in many textbooks widely used in the teaching of theoretical computer science. Students can use this paper as supplemental reading.

2.1. Context-Free Grammars (CFG)

In Chapter 1, we used finite automata and regular expressions to describe regular languages. In this chapter, we introduce the Context-Free Grammar, a more powerful tool for describing languages.
A Context-Free Grammar is formally defined as follows.
Definition 2.1. 
A Context-Free Grammar, denoted by CFG, is a 4-tuple G = (V, Σ, R, S), where
(i)
V is a finite set of variables;
(ii)
Σ is a finite set of terminals such that V ∩ Σ = ∅;
(iii)
S ∈ V is the start variable; and
(iv)
R ⊆ V × (V ∪ Σ)* is a finite relation.
For any (A, u) ∈ R, we usually write A → u and call it a rule.
Accordingly, the relation R is also called the set of rules of the CFG.
A is sometimes called the head of the rule, whereas u is called the body of the rule.
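Definition 2.1 translates directly into a data structure. The sketch below is ours, not from the text (the class name `Grammar` and its field names are illustrative); it checks the disjointness and membership conditions of the definition, with rule bodies written as tuples over V ∪ Σ.

```python
# Definition 2.1: a CFG G = (V, Σ, R, S).  Rules are (head, body)
# pairs with bodies written as tuples over V ∪ Σ.
from dataclasses import dataclass

@dataclass(frozen=True)
class Grammar:
    variables: frozenset   # V, a finite set of variables
    terminals: frozenset   # Σ, a finite set of terminals
    rules: frozenset       # R ⊆ V × (V ∪ Σ)*, a finite relation
    start: str             # S, the start variable

    def __post_init__(self):
        assert not (self.variables & self.terminals)   # V ∩ Σ = ∅
        assert self.start in self.variables            # S ∈ V
        for head, body in self.rules:                  # R ⊆ V × (V ∪ Σ)*
            assert head in self.variables
            assert all(s in self.variables | self.terminals for s in body)

# The grammar of Example 2.9: S → 0S11 | 1
G = Grammar(variables=frozenset({"S"}),
            terminals=frozenset({"0", "1"}),
            rules=frozenset({("S", ("0", "S", "1", "1")), ("S", ("1",))}),
            start="S")
```

Constructing a `Grammar` that violates a condition of the definition (for example, a start symbol outside V) fails the corresponding assertion.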
Example 2.2. 
Let V = {⟨SENTENCE⟩, ⟨NOUN PHRASE⟩, ⟨VERB PHRASE⟩, ⟨PREP PHRASE⟩,
⟨CMPLX NOUN⟩, ⟨CMPLX VERB⟩, ⟨PREP⟩, ⟨ARTICLE⟩, ⟨NOUN⟩, ⟨VERB⟩},
Σ = {a, the, boy, girl, flower, touches, likes, sees, with}, and
S = ⟨SENTENCE⟩.
Let R consist of the following rules:
⟨SENTENCE⟩ → ⟨NOUN PHRASE⟩⟨VERB PHRASE⟩
⟨NOUN PHRASE⟩ → ⟨CMPLX NOUN⟩ | ⟨CMPLX NOUN⟩⟨PREP PHRASE⟩
⟨VERB PHRASE⟩ → ⟨CMPLX VERB⟩ | ⟨CMPLX VERB⟩⟨PREP PHRASE⟩
⟨PREP PHRASE⟩ → ⟨PREP⟩⟨CMPLX NOUN⟩
⟨CMPLX NOUN⟩ → ⟨ARTICLE⟩⟨NOUN⟩
⟨CMPLX VERB⟩ → ⟨VERB⟩ | ⟨VERB⟩⟨NOUN PHRASE⟩
⟨ARTICLE⟩ → a | the
⟨NOUN⟩ → boy | girl | flower
⟨VERB⟩ → touches | likes | sees
⟨PREP⟩ → with
Then G = (V, Σ, R, S) is a CFG.
The following are examples of strings in Σ* that can be derived by G.
(i) ⟨SENTENCE⟩
⇒ ⟨NOUN PHRASE⟩⟨VERB PHRASE⟩
⇒ ⟨CMPLX NOUN⟩⟨VERB PHRASE⟩
⇒ ⟨ARTICLE⟩⟨NOUN⟩⟨VERB PHRASE⟩
⇒ a ⟨NOUN⟩⟨VERB PHRASE⟩
⇒ a boy ⟨VERB PHRASE⟩
⇒ a boy ⟨CMPLX VERB⟩
⇒ a boy ⟨VERB⟩
⇒ a boy sees
(ii) ⟨SENTENCE⟩
⇒ ⟨NOUN PHRASE⟩⟨VERB PHRASE⟩
⇒ ⟨CMPLX NOUN⟩⟨VERB PHRASE⟩
⇒ ⟨ARTICLE⟩⟨NOUN⟩⟨VERB PHRASE⟩
⇒ the boy ⟨VERB PHRASE⟩
⇒ the boy ⟨CMPLX VERB⟩
⇒ the boy ⟨VERB⟩⟨NOUN PHRASE⟩
⇒ the boy sees ⟨CMPLX NOUN⟩
⇒ the boy sees ⟨ARTICLE⟩⟨NOUN⟩
⇒ the boy sees a flower
(iii) ⟨SENTENCE⟩
⇒ ⟨NOUN PHRASE⟩⟨VERB PHRASE⟩
⇒ ⟨CMPLX NOUN⟩⟨PREP PHRASE⟩⟨VERB PHRASE⟩
⇒ ⟨ARTICLE⟩⟨NOUN⟩⟨PREP⟩⟨CMPLX NOUN⟩⟨VERB PHRASE⟩
⇒ a girl with ⟨ARTICLE⟩⟨NOUN⟩⟨VERB PHRASE⟩
⇒ a girl with a flower ⟨CMPLX VERB⟩
⇒ a girl with a flower ⟨VERB⟩⟨NOUN PHRASE⟩
⇒ a girl with a flower likes ⟨CMPLX NOUN⟩
⇒ a girl with a flower likes ⟨ARTICLE⟩⟨NOUN⟩
⇒ a girl with a flower likes the boy
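Each line of these derivations replaces one occurrence of a variable by the body of one of its rules, which is easy to replay mechanically. A small sketch, not from the text (the helper `apply_rule` is ours; variables are written as the bracketed names):

```python
# Replaying derivation (i) of Example 2.2: each step replaces one
# occurrence of a variable by the body of one of its rules.
def apply_rule(form, head, body):
    """Replace the first occurrence of `head` in `form` by `body`."""
    i = form.index(head)
    return form[:i] + body + form[i + 1:]

form = ["<SENTENCE>"]
for head, body in [
    ("<SENTENCE>",    ["<NOUN PHRASE>", "<VERB PHRASE>"]),
    ("<NOUN PHRASE>", ["<CMPLX NOUN>"]),
    ("<CMPLX NOUN>",  ["<ARTICLE>", "<NOUN>"]),
    ("<ARTICLE>",     ["a"]),
    ("<NOUN>",        ["boy"]),
    ("<VERB PHRASE>", ["<CMPLX VERB>"]),
    ("<CMPLX VERB>",  ["<VERB>"]),
    ("<VERB>",        ["sees"]),
]:
    form = apply_rule(form, head, body)

print(" ".join(form))  # a boy sees
```

The rule applications above follow derivation (i); reordering the terminal-producing steps gives other derivations of the same string.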
Definition 2.3. 
Let G = (V, Σ, R, S) be a CFG.
For any u, v ∈ (V ∪ Σ)*, we say u yields v (or v is derivable from u) in one step (written as u ⇒^1 v, or simply u ⇒ v) if and only if
∃ A ∈ V, γ, α, β ∈ (V ∪ Σ)* and a rule A → γ such that u = αAβ and v = αγβ.
Note that the process of deriving v from u is basically the replacement of a variable in u by the body of one of that variable's rules to obtain v.
In addition, we define u ⇒^0 v iff u = v.
For any integer n ≥ 0, we say u yields v (or v is derivable from u) in n + 1 steps (written as u ⇒^{n+1} v) iff ∃ w ∈ (V ∪ Σ)* such that u ⇒^n w and w ⇒^1 v.
If there is more than one CFG under consideration (e.g., G and G′) and we need to distinguish derivations in G from derivations in G′, we can write
u ⇒^{n,G} v to mean v is derivable from u in n steps by use of the rules of G; and
u ⇒^{n,G′} v to mean v is derivable from u in n steps by use of the rules of G′.
Furthermore, if we need to specify the rule applied in each step, we can write
u ⇒^{n,G,(R_1,R_2,…,R_n)} v to mean v is derivable from u in n steps in G, with rule R_i applied in the i-th step; and
u ⇒^{n,G′,(R′_1,R′_2,…,R′_n)} v to mean v is derivable from u in n steps in G′, with rule R′_i applied in the i-th step.
Since there can be more than one way of deriving a string, it is sometimes useful to require the derivation to be leftmost. A leftmost derivation is a derivation in which, at every step, the leftmost variable is replaced by the body of one of its rules.
Formally, we define leftmost derivation as follows.
For any u, v ∈ (V ∪ Σ)*, u yields v leftmost in one step (written as u ⇒^1_{lm} v, or simply u ⇒_{lm} v) iff ∃ w ∈ Σ*, w′ ∈ (V ∪ Σ)*, A ∈ V, α ∈ (V ∪ Σ)* and a rule A → α such that u = wAw′ and v = wαw′.
For any integer n ≥ 0, u ⇒^n_{lm} v is defined similarly to u ⇒^n v.
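The one-step yields relation is mechanical enough to compute. A minimal sketch, not from the text (the function names `step` and `leftmost_step` are ours):

```python
# Definition 2.3 computed: u ⇒ v iff u = αAβ and v = αγβ for some
# rule A → γ.  Sentential forms are tuples of symbols over V ∪ Σ.
def step(u, rules):
    """All v such that u yields v in one step (u ⇒ v)."""
    out = []
    for i, sym in enumerate(u):
        for head, body in rules:
            if head == sym:
                out.append(u[:i] + body + u[i + 1:])
    return out

def leftmost_step(u, rules, variables):
    """All v such that u ⇒_lm v: only the leftmost variable is rewritten."""
    for i, sym in enumerate(u):
        if sym in variables:
            return [u[:i] + body + u[i + 1:]
                    for head, body in rules if head == sym]
    return []

# The grammar of Example 2.9: S → 0S11 | 1
rules = {("S", ("0", "S", "1", "1")), ("S", ("1",))}
print(sorted(step(("S",), rules)))   # [('0', 'S', '1', '1'), ('1',)]
```

On a form with a single variable the two functions agree; on a form with several variables, `leftmost_step` returns only the rewrites of the leftmost one.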
Definition 2.4. 
Let G = (V, Σ, R, S) be a CFG, and let ⇒* be a subset of (V ∪ Σ)* × (V ∪ Σ)*.
We define the relation ⇒* as follows:
∀ u, v ∈ (V ∪ Σ)*, u ⇒* v if and only if u ⇒^n v for some integer n ≥ 0.
n is called the length of the derivation of v from u.
Note that whenever there is an n such that u ⇒^n v, there is a minimum n′ such that u ⇒^{n′} v.
If there is more than one CFG under consideration,
u ⇒*_G v if and only if u ⇒^{n,G} v for some integer n ≥ 0.
⇒*_{lm} is defined similarly to ⇒*.
Proposition 2.5. 
⇒* (respectively ⇒*_{lm}) is reflexive and transitive.
Proof. 
Since u ⇒^0 u for all u ∈ (V ∪ Σ)*, u ⇒* u for all u ∈ (V ∪ Σ)*.
Therefore, ⇒* is reflexive.
For transitivity, assume u ⇒* v and v ⇒* w.
There exist integers m ≥ 0 and n ≥ 0 such that
u ⇒^m v and v ⇒^n w.
There are two cases to examine: n = 0 or n ≥ 1.
(i) n = 0
v ⇒^0 w.
By definition, v = w.
Since u ⇒^m v, u ⇒^m w.
Therefore, u ⇒* w.
(ii) n ≥ 1
v ⇒^n w.
v ⇒^{n−1} α_{n−1} ⇒^1 w for some α_{n−1} ∈ (V ∪ Σ)*.
By a backward induction argument, we have
v ⇒ α_1 ⇒ α_2 ⇒ … ⇒ α_{n−1} ⇒ w for some α_1, α_2, …, α_{n−1} ∈ (V ∪ Σ)*.
We now have
u ⇒^m v ⇒ α_1 ⇒ α_2 ⇒ … ⇒ α_{n−1} ⇒ w.
Since (u ⇒^m v ⇒ α_1) implies u ⇒^{m+1} α_1,
(u ⇒^{m+1} α_1 ⇒ α_2) implies u ⇒^{m+2} α_2,
…
By a forward induction argument, we have
u ⇒^{m+n−1} α_{n−1}.
Finally, (u ⇒^{m+n−1} α_{n−1} ⇒ w) implies u ⇒^{m+n} w.
Therefore, u ⇒* w.
Combining (i) and (ii), ⇒* is transitive.
By a similar argument, we can establish that ⇒*_{lm} is also reflexive and transitive.
Definition 2.6. 
Let G = ( V ,   Σ ,   R ,   S ) be a C F G .
The language of G is defined as
L(G) = { w ∈ Σ* | S ⇒* w }.
Note that if S ⇒^n w, then n ≥ 1, because S ⇒^0 w would imply S = w, which is a contradiction (S is a variable and w ∈ Σ*, while V ∩ Σ = ∅).
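Since L(G) is defined by reachability under ⇒, the short members of L(G) can be enumerated by a breadth-first search over sentential forms. A sketch, not from the text (the helper name `language_up_to` and the length cap on sentential forms, which keeps the search finite, are ours); it uses the grammar S → aSb | ϵ of Example 2.30:

```python
# L(G) = { w ∈ Σ* | S ⇒* w }: breadth-first search over sentential
# forms, collecting the derived strings that contain only terminals.
# Sentential forms longer than max_len are pruned to keep the search
# finite, so only sufficiently short members of L(G) are found.
from collections import deque

def language_up_to(rules, variables, start, max_len):
    seen = {(start,)}
    words = set()
    queue = deque(seen)
    while queue:
        u = queue.popleft()
        if all(s not in variables for s in u):   # u ∈ Σ*, so u ∈ L(G)
            words.add("".join(u))
            continue
        for i, sym in enumerate(u):              # expand each variable
            for head, body in rules:
                if head == sym:
                    v = u[:i] + body + u[i + 1:]
                    if len(v) <= max_len and v not in seen:
                        seen.add(v)
                        queue.append(v)
    return words

rules = {("S", ("a", "S", "b")), ("S", ())}      # S → aSb | ϵ
print(sorted(language_up_to(rules, {"S"}, "S", 6), key=len))
```

Note that the cap applies to intermediate sentential forms, so a word is found only if every form in some derivation of it fits under the cap.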
Definition 2.7. 
Let G = ( V ,   Σ ,   R ,   S ) be a C F G .
Let Q represent the rule A → α in R.
∀ u, v ∈ (V ∪ Σ)*, we say u yields v (or v is derivable from u) using the rule Q (written as
u ⇒_Q v) if and only if there exist w_1, w_2 ∈ (V ∪ Σ)* such that u = w_1 A w_2 and v = w_1 α w_2.
Proposition 2.8. 
Let G = ( V ,   Σ ,   R ,   S ) be a C F G .
For any A ∈ V, α ∈ (V ∪ Σ)* and x, y, z ∈ (V ∪ Σ)*:
(i)
A → α is a rule if and only if A ⇒ α.
(ii)
If there is no α ∈ (V ∪ Σ)* such that S → α is a rule, then L(G) = ∅.
(iii)
If there is no α ∈ (V ∪ Σ)* such that A → α is a rule, and x ⇒ y, then (A appears in x) implies (A appears in y).
(iv)
Let S ⇒ u_1 ⇒ u_2 ⇒ … ⇒ u_n ⇒ w, where u_i ∈ (V ∪ Σ)* for all i ∈ {1, 2, 3, …, n}, w ∈ Σ* and n ≥ 1.
If A ∈ V and A appears in u_i for some i ∈ {1, 2, 3, …, n}, then ∃ α ∈ (V ∪ Σ)* such that A → α is a rule.
Proof. 
(i)
If A → α is a rule, then since ϵ ∈ (V ∪ Σ)*, ϵAϵ ⇒ ϵαϵ.
Therefore, A ⇒ α.
Conversely, if A ⇒ α, then ∃ Q ∈ V and β, w_1, w_2 ∈ (V ∪ Σ)* such that A = w_1 Q w_2 and
α = w_1 β w_2 and Q → β is a rule.
Since A = w_1 Q w_2, both w_1 and w_2 must be ϵ and A = Q.
Therefore, α = β and the rule Q → β is A → α.
(ii)
Assume for contradiction that L(G) ≠ ∅.
Then ∃ w ∈ L(G).
∃ k ∈ N ∪ {0} and w_1, w_2, …, w_k ∈ (V ∪ Σ)* such that S ⇒ w_1 ⇒ w_2 ⇒ … ⇒ w_k ⇒ w, or S ⇒ w. (Note that k = 0 gives S ⇒ w.)
By (i), S → w_1 or S → w is a rule.
This contradicts the assumption that there is no α ∈ (V ∪ Σ)* such that S → α is a rule.
(iii)
Since x ⇒ y, ∃ B ∈ V and w_1, w_2, β ∈ (V ∪ Σ)* such that x = w_1 B w_2, y = w_1 β w_2 and B → β is a rule.
Since A → α is not a rule for any α ∈ (V ∪ Σ)*, A ≠ B.
Since A appears in x, A appears either in w_1 or in w_2.
In either case, A appears in y.
(iv)
Assume for contradiction that A appears in u_i for some i ∈ {1, 2, 3, …, n} and that A → α is not a rule for all α ∈ (V ∪ Σ)*.
By (iii), A appears in u_{i+1}.
By repeated application of (iii), we can conclude that A appears in u_{i+2}, u_{i+3}, …, u_n and w, which is a contradiction because w contains no variables.
Example 2.9. 
Let G = ({S}, {0, 1}, R, S) be a CFG.
Create the rules in R so that L(G) = { 0^n 1^{2n+1} | n ∈ N }.
The rule is S → 0S11 | 1, as can be seen from the following applications of the rule.
S ⇒ 0S11 (1st application of S → 0S11)
⇒ 00S1111 (2nd application of S → 0S11)
⇒ 000S111111 (3rd application of S → 0S11)
…
⇒ 0^n S 1^{2n} (n-th application of S → 0S11)
⇒ 0^n 1^{2n+1} (application of S → 1)
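This pattern of rule applications can be checked mechanically for small n. A sketch, not from the text (the helper `derive` is ours):

```python
# Example 2.9 checked: n applications of S → 0S11 followed by one
# application of S → 1 derive the string 0^n 1^(2n+1).
def derive(n):
    form = "S"
    for _ in range(n):
        form = form.replace("S", "0S11", 1)   # S → 0S11, applied n times
    return form.replace("S", "1", 1)          # S → 1

for n in range(6):
    assert derive(n) == "0" * n + "1" * (2 * n + 1)

print(derive(2))  # 0011111
```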
Example 2.10. 
Let G = ({S}, {0, 1}, R, S) be a CFG.
Create the rules in R so that L(G) = { 0^{2n} 1^{3n} | n ∈ N }.
The rule is S → 00S111 | ϵ, as can be seen from the following applications of the rule.
S ⇒ 00S111 (1st application of S → 00S111)
⇒ 0000S111111 (2nd application of S → 00S111)
⇒ 000000S111111111 (3rd application of S → 00S111)
…
⇒ 0^{2n} S 1^{3n} (n-th application of S → 00S111)
⇒ 0^{2n} ϵ 1^{3n} = 0^{2n} 1^{3n} (application of S → ϵ)
Example 2.11. 
Let G = ({S}, {0, 1}, R, S) be a CFG.
Create the rules in R so that L(G) = { 0^{2n+7} 1^{3n+9} | n ∈ N }.
The rule is S → 00S111 | 0^7 1^9, as can be seen from the following applications of the rule.
S ⇒ 00S111 (1st application of S → 00S111)
⇒ 0000S111111 (2nd application of S → 00S111)
⇒ 000000S111111111 (3rd application of S → 00S111)
…
⇒ 0^{2n} S 1^{3n} (n-th application of S → 00S111)
⇒ 0^{2n} 0^7 1^9 1^{3n} = 0^{2n+7} 1^{3n+9} (application of S → 0^7 1^9)
Definition 2.12. Let G = (V, Σ, R, S) be a CFG.
Let R_1, R_2, R_3, …, R_n and Q be rules in R, where n ≥ 1.
R_1, R_2, R_3, …, R_n and Q are equivalent if, ∀ u ∈ (V ∪ Σ)*, there exists v ∈ (V ∪ Σ)* such that u ⇒^{n,(R_1,R_2,…,R_n)} v if and only if u ⇒_Q v.
Proposition 2.13. 
(i)
A ⇒_{A→α} α if and only if A → α is a rule.
(ii)
If A does not appear in α, A does not appear in β, and A → γ is a rule, then
αAβ ⇒_{A→γ} x if and only if x = αγβ.
Proof. 
(i)
If A → α is a rule, then ϵAϵ ⇒_{A→α} ϵαϵ and therefore A ⇒_{A→α} α.
Conversely, if A ⇒_{A→α} α, then by definition A → α is a rule.
(ii)
If x = αγβ, then since A → γ is a rule, by definition αAβ ⇒_{A→γ} αγβ. Therefore, αAβ ⇒_{A→γ} x.
Conversely, if αAβ ⇒_{A→γ} x, then ∃ u_1, u_2 ∈ (V ∪ Σ)* such that αAβ = u_1 A u_2 and x = u_1 γ u_2.
Since A does not appear in α and A does not appear in β, there is only one appearance of A in αAβ.
Therefore, there is only one appearance of A in u_1 A u_2.
Therefore, αAβ = u_1 A u_2 implies α = u_1 and β = u_2.
Therefore, x = αγβ.
Proposition 2.14. 
The rules (A → α) and (B → β) are equivalent if and only if (A = B) and (α = β).
Proof. 
Suppose (A → α) and (B → β) are equivalent.
Since A → α is a rule, A ⇒_{A→α} α (Proposition 2.13).
Since (A → α) and (B → β) are equivalent, A ⇒_{B→β} α.
Then there exist w_1, w_2 ∈ (V ∪ Σ)* such that A = w_1 B w_2 and α = w_1 β w_2.
A = w_1 B w_2 implies A = B, since A and B are both variables (single symbols).
A = w_1 A w_2 implies w_1 = w_2 = ϵ.
Therefore, α = β.
Conversely, if (A = B) and (α = β), then (A → α) and (B → β) are the same rule, and hence they are equivalent.
Proposition 2.15. 
Let G = ( V ,   Σ ,   R ,   S ) be a C F G .
∀ A, B ∈ V and x, y, z ∈ (V ∪ Σ)*, the rules (A → xBz) and (B → y), applied in that order, are equivalent to A → xyz.
Proof. ∀ u ∈ (V ∪ Σ)*, let
R_1 be A → xBz,
R_2 be B → y, and
R_3 be A → xyz.
If u ⇒^{2,(R_1,R_2)} v, then ∃ w_1 ∈ (V ∪ Σ)* such that u ⇒_{R_1} w_1 ⇒_{R_2} v. Since u ⇒_{R_1} w_1, u = α_1 A α_2 and w_1 = α_1 xBz α_2 for some α_1, α_2 ∈ (V ∪ Σ)*.
Since R_2 = (B → y), α_1 xBz α_2 ⇒_{R_2} α_1 xyz α_2.
That is, w_1 ⇒_{R_2} α_1 xyz α_2.
Therefore, v = α_1 xyz α_2 is such that u ⇒^{2,(R_1,R_2)} v.
Since R_3 = (A → xyz), α_1 A α_2 ⇒_{R_3} α_1 xyz α_2.
That is, u ⇒_{R_3} α_1 xyz α_2.
Therefore, u ⇒_{R_3} v.
Conversely, if u ⇒_{R_3} v, then u = α_1 A α_2 and v = α_1 xyz α_2 for some α_1, α_2 ∈ (V ∪ Σ)*.
Let w_1 = α_1 xBz α_2.
Since R_1 = (A → xBz), α_1 A α_2 ⇒_{R_1} α_1 xBz α_2.
Since R_2 = (B → y), α_1 xBz α_2 ⇒_{R_2} α_1 xyz α_2.
Therefore, α_1 A α_2 ⇒_{R_1} α_1 xBz α_2 ⇒_{R_2} α_1 xyz α_2.
That is, u ⇒_{R_1} w_1 ⇒_{R_2} v.
That is, u ⇒^{2,(R_1,R_2)} v. Combining both directions, (R_1, R_2) and R_3 are equivalent.
Proposition 2.16. Let G = (V, Σ, R, S) be a CFG.
(a)
∀ A, B ∈ V and x, y, z ∈ (V ∪ Σ)*, if A → xBz and B → y are rules, then A ⇒* xyz.
(b)
∀ α, β, γ, β′ ∈ (V ∪ Σ)*, if β ⇒ β′ then αβγ ⇒ αβ′γ.
(c)
∀ α, β, γ, β′ ∈ (V ∪ Σ)*, if β ⇒* β′ then αβγ ⇒* αβ′γ.
(d)
Let α_1, α_2, …, α_n, β_1, β_2, …, β_n, γ_1, γ_2, …, γ_n ∈ (V ∪ Σ)*.
If β_i ⇒* γ_i for i ∈ {1, 2, 3, …, n}, then α_1 β_1 α_2 β_2 α_3 β_3 … α_n β_n ⇒* α_1 γ_1 α_2 γ_2 α_3 γ_3 … α_n γ_n.
In the special case of α_1 = α_2 = … = α_n = ϵ, β_1 β_2 β_3 … β_n ⇒* γ_1 γ_2 γ_3 … γ_n.
Proof. 
(a)
By Proposition 2.8 (i), (B → y) implies B ⇒ y.
By the definition of derivation, xBz ⇒ xyz.
Therefore, A ⇒ xBz ⇒ xyz.
Therefore, A ⇒* xyz.
(b)
Since β ⇒ β′, ∃ β_1, β_2 ∈ (V ∪ Σ)*, A ∈ V, η ∈ (V ∪ Σ)* and a rule A → η such that
β = β_1 A β_2 and β′ = β_1 η β_2.
Therefore, αβγ = α β_1 A β_2 γ and αβ′γ = α β_1 η β_2 γ.
Therefore, αβγ ⇒ αβ′γ.
(c)
Since β ⇒* β′, ∃ u_1, u_2, …, u_n ∈ (V ∪ Σ)*, where n ≥ 0, such that
β ⇒ u_1 ⇒ u_2 ⇒ u_3 ⇒ … ⇒ u_{n−1} ⇒ u_n ⇒ β′.
αβγ ⇒ α u_1 γ (β ⇒ u_1 and (b))
⇒ α u_2 γ (u_1 ⇒ u_2 and (b))
…
⇒ α u_n γ (u_{n−1} ⇒ u_n and (b))
⇒ αβ′γ (u_n ⇒ β′ and (b))
Therefore, αβγ ⇒* αβ′γ.
(d)
α_1 β_1 α_2 β_2 α_3 β_3 … α_n β_n ⇒* α_1 γ_1 α_2 β_2 α_3 β_3 … α_n β_n (β_1 ⇒* γ_1 and (c))
⇒* α_1 γ_1 α_2 γ_2 α_3 β_3 … α_n β_n (β_2 ⇒* γ_2 and (c))
⇒* α_1 γ_1 α_2 γ_2 α_3 γ_3 … α_n β_n (β_3 ⇒* γ_3 and (c))
…
⇒* α_1 γ_1 α_2 γ_2 α_3 γ_3 … α_n γ_n (β_n ⇒* γ_n and (c))
Therefore, α_1 β_1 α_2 β_2 α_3 β_3 … α_n β_n ⇒* α_1 γ_1 α_2 γ_2 α_3 γ_3 … α_n γ_n.
This completes the proof of Proposition 2.16.
By replacing ⇒ with ⇒_{lm} and ⇒* with ⇒*_{lm}, we have the following proposition.
Proposition 2.17. 
Let G = ( V ,   Σ ,   R ,   S ) be a C F G .
(a)
∀ A, B ∈ V and x, y, z ∈ (V ∪ Σ)*, if A ⇒_{lm} xBz and B ⇒_{lm} y then A ⇒*_{lm} xyz.
(b)
∀ α, β, γ, β′ ∈ (V ∪ Σ)*, if β ⇒_{lm} β′ then αβγ ⇒_{lm} αβ′γ.
(c)
∀ α, β, γ, β′ ∈ (V ∪ Σ)*, if β ⇒*_{lm} β′ then αβγ ⇒*_{lm} αβ′γ.
(d)
Let α_1, α_2, …, α_n, β_1, β_2, …, β_n, γ_1, γ_2, …, γ_n ∈ (V ∪ Σ)*.
If β_i ⇒*_{lm} γ_i for i ∈ {1, 2, 3, …, n}, then α_1 β_1 α_2 β_2 α_3 β_3 … α_n β_n ⇒*_{lm} α_1 γ_1 α_2 γ_2 α_3 γ_3 … α_n γ_n.
In the special case of α_1 = α_2 = … = α_n = ϵ, β_1 β_2 β_3 … β_n ⇒*_{lm} γ_1 γ_2 γ_3 … γ_n.
Proposition 2.18. 
Let G = ( V ,   Σ ,   R ,   S ) be a C F G .
Let A, A_1, A_2, …, A_n ∈ V and α_1, α_2, …, α_n, β_1, β_2, …, β_n ∈ (V ∪ Σ)*.
If A → α_1 A_1 β_1, A_1 → α_2 A_2 β_2, …, A_{n−1} → α_n A_n β_n are rules, then A ⇒* α_1 … α_{n−1} α_n A_n β_n β_{n−1} … β_1.
Proof. 
The proof is by induction on n.
(n = 1)
If A → α_1 A_1 β_1 is a rule, then by Proposition 2.8 (i), A ⇒ α_1 A_1 β_1.
Therefore, A ⇒* α_1 A_1 β_1.
(n = k + 1, k ≥ 1)
Assume A → α_1 A_1 β_1, A_1 → α_2 A_2 β_2, …, A_{k−1} → α_k A_k β_k, A_k → α_{k+1} A_{k+1} β_{k+1} are rules.
By the induction hypothesis, A ⇒* α_1 … α_{k−1} α_k A_k β_k β_{k−1} … β_1.
Since A_k → α_{k+1} A_{k+1} β_{k+1} is a rule, by the definition of derivation,
α_1 … α_{k−1} α_k A_k β_k β_{k−1} … β_1 ⇒ α_1 … α_{k−1} α_k α_{k+1} A_{k+1} β_{k+1} β_k β_{k−1} … β_1.
Therefore, A ⇒* α_1 … α_{k−1} α_k A_k β_k β_{k−1} … β_1 ⇒ α_1 … α_{k−1} α_k α_{k+1} A_{k+1} β_{k+1} β_k β_{k−1} … β_1.
Therefore, A ⇒* α_1 … α_{k−1} α_k α_{k+1} A_{k+1} β_{k+1} β_k β_{k−1} … β_1.
This completes the proof of Proposition 2.18.
Proposition 2.19. 
Let G = (V, Σ, R, S) be a CFG, B ∈ V, and R_1, R_2, R_3, …, R_n be rules in R, where n ≥ 0.
Let α_1, α_2, α′_1, α′_2 ∈ (V ∪ Σ)*.
If α_1 B α_2 ⇒^{n,(R_1,R_2,…,R_n)} α′_1 B α′_2, then α_1 x α_2 ⇒^{n,(R_1,R_2,…,R_n)} α′_1 x α′_2 ∀ x ∈ (V ∪ Σ)*, where the two Bs in the two strings are the same occurrence of B (note that there can be more than one B in the string α_1 B α_2) and B is not the head of any rule R_i (i = 1, 2, …, n).
(Note that when n = 0, the statement becomes
(α_1 B α_2 = α′_1 B α′_2) implies (α_1 x α_2 = α′_1 x α′_2 ∀ x ∈ (V ∪ Σ)*).)
Proof. 
For n = 0, α_1 B α_2 = α′_1 B α′_2.
Since the two Bs in the two strings are the same occurrence of B, replacing them with x must yield two equal strings.
Therefore, α_1 x α_2 = α′_1 x α′_2.
Therefore, the statement is true for n = 0.
For n = 1, suppose α_1 B α_2 ⇒_{R_1} α′_1 B α′_2.
Let A → α be the rule represented by R_1.
By the definition of yielding, ∃ u_1, u_2 ∈ (V ∪ Σ)* such that
α_1 B α_2 = u_1 A u_2 and α′_1 B α′_2 = u_1 α u_2.
The B that appears in α_1 B α_2 must also appear in u_1 A u_2.
Since R_1 does not rewrite this particular B, A and B cannot be the same occurrence in the string α_1 B α_2 = u_1 A u_2, and hence there are only two cases to examine: B appears in u_1, or B appears in u_2.
(i)
If B appears in u_1:
Let u′_1 be the string obtained by replacing this B in u_1 with x.
Since α_1 B α_2 = u_1 A u_2, replacing B with x on both sides yields two equal strings.
That is, α_1 x α_2 = u′_1 A u_2.
Since α′_1 B α′_2 = u_1 α u_2, replacing B with x on both sides yields two equal strings.
That is, α′_1 x α′_2 = u′_1 α u_2.
Moreover, u′_1 A u_2 ⇒_{R_1} u′_1 α u_2, since A → α is a rule.
Therefore, α_1 x α_2 ⇒_{R_1} α′_1 x α′_2.
Therefore, the statement is true for n = 1.
(ii)
If B appears in u_2, a similar argument shows that the statement is also true for n = 1.
With the results established for n = 0 and n = 1 and an induction argument, we can conclude that for n ≥ 0,
(α_1 B α_2 ⇒^{n,(R_1,R_2,…,R_n)} α′_1 B α′_2) implies (α_1 x α_2 ⇒^{n,(R_1,R_2,…,R_n)} α′_1 x α′_2 ∀ x ∈ (V ∪ Σ)*).
Proposition 2.20. 
If G = (V, Σ, R, S) is a CFG and there exist u_1, u_2, …, u_k ∈ (V ∪ Σ)*, w ∈ Σ* such that
S ⇒ u_1 ⇒ u_2 ⇒ u_3 ⇒ … ⇒ u_r ⇒ … ⇒ u_k ⇒ w, then
the number of variables in u_r ≤ the number of steps remaining from u_r to w.
Proof. Let n be the number of steps remaining from u_r to w.
n = k + 1 − r.
We will prove this proposition by induction on n.
(For n = 1)
r = k.
Therefore, u_r = u_k ⇒ w.
Since u_k ⇒ w, ∃ α, β, γ ∈ (V ∪ Σ)* and a rule A → γ such that
u_k = αAβ and w = αγβ.
Since w ∈ Σ*, α, β ∈ Σ*.
Therefore, u_k has only one variable.
Therefore, u_r has only one variable.
Therefore, the number of variables in u_r ≤ the number of steps remaining from u_r to w.
(For induction)
The number of steps remaining from u_{r−1} to w is n + 1 = k + 2 − r.
Since u_{r−1} ⇒ u_r, ∃ α, β, γ ∈ (V ∪ Σ)* and a rule A → γ such that
u_{r−1} = αAβ and u_r = αγβ.
Let m be the number of variables in u_{r−1}.
The number of variables in u_r = m − 1 + (the number of variables in γ) ≥ m − 1.
By the induction hypothesis, the number of variables in u_r ≤ n = k + 1 − r.
Therefore, m − 1 ≤ the number of variables in u_r ≤ n = k + 1 − r.
Therefore, m − 1 ≤ k + 1 − r.
Therefore, m ≤ k + 2 − r.
Therefore, the number of variables in u_{r−1} ≤ the number of steps remaining from u_{r−1} to w.
This completes the proof of Proposition 2.20.
Example 2.21. 
Let G = (V, Σ, R, S) be a CFG and suppose there exist u_1, u_2, …, u_k ∈ (V ∪ Σ)*, w ∈ Σ* such that
S ⇒ u_1 ⇒ u_2 ⇒ u_3 ⇒ … ⇒ u_r ⇒ … ⇒ u_k ⇒ w. Show that the statement
(the number of variables in u_r = the number of steps remaining from u_r to w) is not always true.
(Hint: Consider V = {S, A, B, C}, Σ = {a, b, c}, R = {S → AB, A → C, C → c, B → b} and
S ⇒_{S→AB} AB ⇒_{A→C} CB ⇒_{C→c} cB ⇒_{B→b} cb.)
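The hint's derivation can be tabulated to see where equality fails while Proposition 2.20's inequality still holds. A small sketch, not from the text; the variable/step counts below are computed from the hint's derivation:

```python
# The hint of Example 2.21 replayed: S ⇒ AB ⇒ CB ⇒ cB ⇒ cb.
# For each sentential form u_r we pair the number of variables in u_r
# with the number of steps remaining from u_r to the final string w.
variables = {"S", "A", "B", "C"}
derivation = ["S", "AB", "CB", "cB", "cb"]   # S ⇒ u_1 ⇒ u_2 ⇒ u_3 ⇒ w

counts = [(sum(ch in variables for ch in u), len(derivation) - 1 - r)
          for r, u in enumerate(derivation)]
print(counts)  # [(1, 4), (2, 3), (2, 2), (1, 1), (0, 0)]

# Proposition 2.20's inequality (# variables ≤ # steps remaining)
# holds everywhere, but at u_1 = AB it is strict: 2 variables, 3 steps.
for n_vars, steps_left in counts:
    assert n_vars <= steps_left
```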
Definition 2.22. 
∀ α, β ∈ (V ∪ Σ)*, α is a substring of β (written as α ⊑ β) if ∃ α′, α″ ∈ (V ∪ Σ)* such that
β = α′ α α″. α′ is called the left complement of α in β, written as LC(α). α″ is called the right complement of α in β, written as RC(α).
Proposition 2.23. 
For any strings α_1, α_2, u such that α_1, α_2 ⊑ u, if α_1 ⊑ α_2, then
(i)
LC(α_2) ⊑ LC(α_1), with LC(α_2) r = LC(α_1) for some string r; and
(ii)
RC(α_2) ⊑ RC(α_1), with l RC(α_2) = RC(α_1) for some string l.
Proof. 
α_1 ⊑ u means u = x_1 α_1 y_1 for some strings x_1, y_1.
α_2 ⊑ u means u = x_2 α_2 y_2 for some strings x_2, y_2.
x_1 = LC(α_1); y_1 = RC(α_1).
x_2 = LC(α_2); y_2 = RC(α_2).
α_1 ⊑ α_2 means α_2 = r α_1 l for some strings r and l.
Therefore, u = x_2 α_2 y_2 = x_2 r α_1 l y_2.
Since u = x_1 α_1 y_1, x_1 α_1 y_1 = x_2 r α_1 l y_2.
Therefore, x_1 = x_2 r and y_1 = l y_2.
Therefore, LC(α_1) = LC(α_2) r and RC(α_1) = l RC(α_2).
Therefore, LC(α_2) ⊑ LC(α_1) and RC(α_2) ⊑ RC(α_1).
This completes the proof of Proposition 2.23.
Definition 2.24. 
For any strings α_1, α_2, u such that α_1, α_2 ⊑ u, α_1 is said to be to the left of α_2 if there exist strings x, y, z such that u = x α_1 y α_2 z.
Proposition 2.25. 
Let G = (V, Σ, R, S) be a CFG.
Let u_0 ⇒ u_1 ⇒ u_2 ⇒ u_3 ⇒ … ⇒ u_n, where u_0, u_1, u_2, u_3, …, u_n ∈ (V ∪ Σ)*.
Let 0 ≤ i < j ≤ n.
If α_i ⊑ u_i, then ∃ α_{i+1}, α_{i+2}, …, α_j with α_{i+1} ⊑ u_{i+1}, α_{i+2} ⊑ u_{i+2}, …, α_j ⊑ u_j such that
α_i ⇒^{λ_1} α_{i+1} ⇒^{λ_2} α_{i+2} ⇒^{λ_3} α_{i+3} … ⇒^{λ_{j−i}} α_j, where λ_1, λ_2, …, λ_{j−i} ∈ {0, 1}.
Hence, α_i ⇒* α_j in no more than j − i steps.
α_j is called the (j − i)-step expansion of α_i within the derivation u_0 ⇒* u_n, and it is written as α_j = Expan(α_i, j − i).
Proof. Let k = j − i.
1 ≤ k ≤ n. This proposition can be proved by induction on k.
(k = 1):
j = i + 1.
Since u_i ⇒ u_{i+1}, ∃ α, β, γ ∈ (V ∪ Σ)* and a rule A → γ such that
u_i = αAβ and u_{i+1} = αγβ.
Since α_i ⊑ u_i, ∃ α′, β′ ∈ (V ∪ Σ)* such that u_i = α′ α_i β′.
(i)
If A ⊑ α_i:
∃ α″, β″ ∈ (V ∪ Σ)* such that α_i = α″ A β″.
u_i = α′ α_i β′ = α′ α″ A β″ β′.
Also, u_i = αAβ.
Therefore, αAβ = α′ α″ A β″ β′.
Therefore, α = α′ α″ and β = β″ β′.
Since u_{i+1} = αγβ, u_{i+1} = α′ α″ γ β″ β′.
Take α_{i+1} = α″ γ β″.
Since α_i = α″ A β″ and A → γ is a rule, α_i ⇒ α_{i+1}.
(ii)
If A is not a substring of α_i: since u_i = αAβ and α_i ⊑ u_i, either α_i ⊑ α or α_i ⊑ β.
Since u_{i+1} = αγβ, α ⊑ u_{i+1} and β ⊑ u_{i+1}.
Therefore, (α_i ⊑ α or α_i ⊑ β) implies α_i ⊑ u_{i+1}.
Take α_{i+1} = α_i.
Therefore, α_i ⇒^0 α_{i+1}.
Combining (i) and (ii), α_i ⇒^{λ_1} α_{i+1}, where λ_1 ∈ {0, 1}.
(Induction):
By the induction assumption,
α_i ⇒^{λ_1} α_{i+1} ⇒^{λ_2} α_{i+2} ⇒^{λ_3} α_{i+3} … ⇒^{λ_{j−i}} α_j, where λ_1, λ_2, …, λ_{j−i} ∈ {0, 1} and
α_{i+1} ⊑ u_{i+1}, α_{i+2} ⊑ u_{i+2}, …, α_j ⊑ u_j.
Since α_j ⊑ u_j and u_j ⇒ u_{j+1}, by applying the same argument as in the case (k = 1), we can find α_{j+1} ⊑ u_{j+1} such that α_j ⇒^{λ_{j−i+1}} α_{j+1}, where λ_{j−i+1} ∈ {0, 1}.
We now have α_i ⇒^{λ_1} α_{i+1} ⇒^{λ_2} α_{i+2} ⇒^{λ_3} α_{i+3} … ⇒^{λ_{j−i}} α_j ⇒^{λ_{j−i+1}} α_{j+1}, where
λ_1, λ_2, …, λ_{j−i}, λ_{j−i+1} ∈ {0, 1} and α_{i+1} ⊑ u_{i+1}, α_{i+2} ⊑ u_{i+2}, …, α_j ⊑ u_j, α_{j+1} ⊑ u_{j+1}.
This completes the proof of Proposition 2.25.
Proposition 2.26. 
Let G = (V, Σ, R, S) be a CFG.
Let u_0 ⇒ u_1 ⇒ u_2 ⇒ u_3 ⇒ … ⇒ u_n, where u_0, u_1, u_2, u_3, …, u_n ∈ (V ∪ Σ)*.
Let 0 ≤ i ≤ j ≤ n, α_i ⊑ u_i, α′_i ⊑ u_i, α_j = Expan(α_i, j − i) and α′_j = Expan(α′_i, j − i).
If α_i is to the left of α′_i within u_i, then α_j is to the left of α′_j within u_j.
Proof. 
Let k = j − i.
0 ≤ k ≤ n.
We can prove this proposition by induction on k.
(k = 0)
j = i, so α_i = α_j and α′_i = α′_j.
Therefore, α_i ⇒^0 α_j and α′_i ⇒^0 α′_j.
α_j = Expan(α_i, 0) and α′_j = Expan(α′_i, 0).
(α_i is to the left of α′_i) implies (α_j is to the left of α′_j).
The statement is true for k = 0.
(Induction)
Induction Hypothesis:
(α_i is to the left of α′_i) implies (α_{i+k} is to the left of α′_{i+k}).
Since α_{i+k} is to the left of α′_{i+k}, there exist x, y, z ∈ (V ∪ Σ)* such that
u_{i+k} = x α_{i+k} y α′_{i+k} z.
Since u_{i+k} ⇒ u_{i+k+1}, there exists a rule A → γ such that
u_{i+k} = αAβ and u_{i+k+1} = αγβ.
We now have five situations to examine: A ⊑ x, A ⊑ α_{i+k}, A ⊑ y, A ⊑ α′_{i+k}, A ⊑ z.
(i)
A ⊑ x
RC(A) = l RC(x) for some string l (Proposition 2.23).
β = l α_{i+k} y α′_{i+k} z (RC(A) = β, RC(x) = α_{i+k} y α′_{i+k} z).
u_{i+k+1} = αγβ = αγ l α_{i+k} y α′_{i+k} z.
Take α_{i+k+1} = α_{i+k} and α′_{i+k+1} = α′_{i+k}.
Therefore, α_{i+k} ⇒^0 α_{i+k+1} and α′_{i+k} ⇒^0 α′_{i+k+1}.
In addition, u_{i+k+1} = αγ l α_{i+k+1} y α′_{i+k+1} z.
Therefore, α_{i+k+1} is to the left of α′_{i+k+1}.
(ii)
A ⊑ α_{i+k}
∃ α′, β′ ∈ (V ∪ Σ)* such that
α_{i+k} = α′ A β′.
u_{i+k} = x α_{i+k} y α′_{i+k} z = x α′ A β′ y α′_{i+k} z. Since u_{i+k} = αAβ, α = x α′ and β = β′ y α′_{i+k} z.
Therefore, u_{i+k+1} = αγβ = x α′ γ β′ y α′_{i+k} z.
Take α_{i+k+1} = α′ γ β′ and α′_{i+k+1} = α′_{i+k}.
Now, u_{i+k+1} = x α_{i+k+1} y α′_{i+k+1} z.
So, α_{i+k+1} is to the left of α′_{i+k+1}.
In addition, α_{i+k} ⇒ α_{i+k+1}, because α_{i+k} = α′ A β′ and α_{i+k+1} = α′ γ β′.
Also, α′_{i+k} ⇒^0 α′_{i+k+1}, because α′_{i+k+1} = α′_{i+k}.
(iii)
A ⊑ y
With a similar argument as in (i), we can show that ∃ α_{i+k+1}, α′_{i+k+1} in u_{i+k+1} such that α_{i+k} ⇒^λ α_{i+k+1} and α′_{i+k} ⇒^{λ′} α′_{i+k+1}, where λ, λ′ ∈ {0, 1}, and
α_{i+k+1} is to the left of α′_{i+k+1}.
(iv)
A ⊑ α′_{i+k}
With a similar argument as in (ii), we can show that ∃ α_{i+k+1}, α′_{i+k+1} in u_{i+k+1} such that α_{i+k} ⇒^λ α_{i+k+1} and α′_{i+k} ⇒^{λ′} α′_{i+k+1}, where λ, λ′ ∈ {0, 1}, and
α_{i+k+1} is to the left of α′_{i+k+1}.
(v)
A ⊑ z
With a similar argument as in (i), we can show that ∃ α_{i+k+1}, α′_{i+k+1} in u_{i+k+1} such that α_{i+k} ⇒^λ α_{i+k+1} and α′_{i+k} ⇒^{λ′} α′_{i+k+1}, where λ, λ′ ∈ {0, 1}, and
α_{i+k+1} is to the left of α′_{i+k+1}.
Combining (i) to (v) and the induction hypothesis, we now have:
If α_i is to the left of α′_i within u_i, then α_{i+k+1} is to the left of α′_{i+k+1} within u_{i+k+1}.
This completes the proof of Proposition 2.26.
Proposition 2.27. 
Let G = (V, Σ, R, S) be a CFG.
Let u_0 ⇒ u_1 ⇒ … ⇒ u_i ⇒ u_{i+1} ⇒ … ⇒ u_n, where u_0, u_1, …, u_i, u_{i+1}, …, u_n ∈ (V ∪ Σ)* and 0 ≤ i < n.
Let α_i, β_i ∈ (V ∪ Σ)*.
If α_i β_i ⊑ u_i, then Expan(α_i β_i, 1) = Expan(α_i, 1) Expan(β_i, 1).
Proof. 
Since α_i β_i ⊑ u_i,
u_i = x α_i β_i y for some x, y ∈ (V ∪ Σ)*.
Since u_i ⇒ u_{i+1}, ∃ α, β, γ ∈ (V ∪ Σ)* and a rule A → γ such that
u_i = αAβ and u_{i+1} = αγβ.
Since A ⊑ u_i and u_i = x α_i β_i y, we have four cases to examine:
A ⊑ x, A ⊑ α_i, A ⊑ β_i, A ⊑ y.
(i)
A ⊑ x
x = x′ A y′ for some x′, y′ ∈ (V ∪ Σ)*.
u_i = x α_i β_i y = x′ A y′ α_i β_i y.
Since u_i is also equal to αAβ, αAβ = x′ A y′ α_i β_i y.
Therefore, α = x′ and β = y′ α_i β_i y.
Since u_{i+1} = αγβ, u_{i+1} = x′ γ y′ α_i β_i y.
Now we have α_i, β_i, α_i β_i ⊑ u_i and α_i, β_i, α_i β_i ⊑ u_{i+1}.
In addition, α_i ⇒^0 α_i, β_i ⇒^0 β_i, α_i β_i ⇒^0 α_i β_i.
Therefore, Expan(α_i β_i, 1) = α_i β_i, Expan(α_i, 1) = α_i and Expan(β_i, 1) = β_i.
Therefore, Expan(α_i β_i, 1) = Expan(α_i, 1) Expan(β_i, 1).
(ii)
A ⊑ α_i
α_i = x′ A y′ for some x′, y′ ∈ (V ∪ Σ)*.
Since u_i = x α_i β_i y, u_i = x x′ A y′ β_i y.
Since u_i = αAβ, αAβ = x x′ A y′ β_i y.
Therefore, α = x x′ and β = y′ β_i y.
Since u_{i+1} = αγβ, u_{i+1} = x x′ γ y′ β_i y.
Let α_{i+1} = x′ γ y′. Then u_{i+1} = x α_{i+1} β_i y.
Since α_i = x′ A y′ and A → γ is a rule, α_i ⇒ α_{i+1}.
Since α_i ⊑ u_i and α_{i+1} ⊑ u_{i+1}, Expan(α_i, 1) = α_{i+1}.
Since β_i ⊑ u_i, β_i ⊑ u_{i+1} and β_i ⇒^0 β_i, Expan(β_i, 1) = β_i.
Since α_i = x′ A y′, α_i β_i = x′ A y′ β_i.
x′ A y′ β_i ⇒ x′ γ y′ β_i, because A → γ is a rule.
Therefore, α_i β_i ⇒ x′ γ y′ β_i.
Therefore, α_i β_i ⇒ α_{i+1} β_i (α_{i+1} = x′ γ y′).
Since α_i β_i ⊑ u_i and α_{i+1} β_i ⊑ u_{i+1}, Expan(α_i β_i, 1) = α_{i+1} β_i.
Therefore, Expan(α_i β_i, 1) = Expan(α_i, 1) Expan(β_i, 1).
(iii)
A ⊑ β_i
With a similar argument as in (ii), we can show that Expan(α_i β_i, 1) = Expan(α_i, 1) Expan(β_i, 1).
(iv)
A ⊑ y
With a similar argument as in (i), we can show that
Expan(α_i β_i, 1) = Expan(α_i, 1) Expan(β_i, 1).
This completes the proof of Proposition 2.27.
Proposition 2.28. 
Let G = (V, Σ, R, S) be a CFG.
Let u_0 ⇒ u_1 ⇒ … ⇒ u_i ⇒ … ⇒ u_n, where u_0, u_1, …, u_i, …, u_n ∈ (V ∪ Σ)* and 0 ≤ i ≤ n.
(i)
For 0 ≤ k ≤ n − i, Expan(α_i^1 α_i^2 … α_i^m, k) = Expan(α_i^1, k) Expan(α_i^2, k) … Expan(α_i^m, k), where
α_i^1 α_i^2 … α_i^m ⊑ u_i.
(ii)
If u_0 = X_1 X_2 … X_m, where X_1, X_2, …, X_m ∈ V ∪ Σ, and u_n = w ∈ Σ*, then ∃ w_1, w_2, …, w_m ∈ Σ* such that X_i ⇒* w_i in no more than n steps and
w = w_1 w_2 … w_m.
Proof. 
Claim.
∀ α_i, β_i ∈ (V ∪ Σ)* such that α_i β_i ⊑ u_i and 0 ≤ k ≤ n − i,
Expan(α_i β_i, k) = Expan(α_i, k) Expan(β_i, k).
This Claim can be proved by induction on k.
(k = 0)
Expan(α_i β_i, 0) = α_i β_i.
Expan(α_i, 0) = α_i and Expan(β_i, 0) = β_i.
Therefore, Expan(α_i β_i, 0) = Expan(α_i, 0) Expan(β_i, 0).
The statement is true for k = 0.
(Induction)
Induction Hypothesis:
Expan(α_i β_i, k) = Expan(α_i, k) Expan(β_i, k), where
Expan(α_i β_i, k), Expan(α_i, k), Expan(β_i, k) ⊑ u_{i+k}.
Expan(α_i β_i, k + 1) = Expan(Expan(α_i β_i, k), 1)
= Expan(Expan(α_i, k) Expan(β_i, k), 1) (Induction Hypothesis)
= Expan(Expan(α_i, k), 1) Expan(Expan(β_i, k), 1) (Proposition 2.27)
= Expan(α_i, k + 1) Expan(β_i, k + 1). This completes the proof of the Claim.
The proof of (i) is by induction on m.
(m = 1)
LHS = Expan(α_i^1, k).
RHS = Expan(α_i^1, k).
Therefore, the statement is true for m = 1.
(Induction)
Induction Hypothesis: Expan(α_i^1 α_i^2 … α_i^m, k) = Expan(α_i^1, k) Expan(α_i^2, k) … Expan(α_i^m, k).
Expan(α_i^1 α_i^2 … α_i^m α_i^{m+1}, k) = Expan(α_i^1 α_i^2 … α_i^m, k) Expan(α_i^{m+1}, k) (Claim)
= Expan(α_i^1, k) Expan(α_i^2, k) … Expan(α_i^m, k) Expan(α_i^{m+1}, k) (Induction Hypothesis)
This completes the proof of (i).
(ii)
Set i = 0 and k = n in the result of (i).
Let α_0^1 = X_1, α_0^2 = X_2, …, α_0^m = X_m.
u_0 = X_1 X_2 … X_m = α_0^1 α_0^2 … α_0^m. Therefore, α_0^1 α_0^2 … α_0^m ⊑ u_0.
By (i), Expan(α_0^1 α_0^2 … α_0^m, n) = Expan(α_0^1, n) Expan(α_0^2, n) … Expan(α_0^m, n). Therefore, Expan(X_1 X_2 … X_m, n) = Expan(X_1, n) Expan(X_2, n) … Expan(X_m, n).
Expan(X_1 X_2 … X_m, n) = Expan(u_0, n) = u_n = w.
Therefore, Expan(X_1, n) Expan(X_2, n) … Expan(X_m, n) = w.
Therefore, Expan(X_i, n) = w_i for some w_i ∈ Σ*, i ∈ {1, 2, …, m}.
Therefore, w = w_1 w_2 … w_m and
X_i ⇒* w_i in no more than n steps.
Proposition 2.29. 
Let G = (V, Σ, R, S) be a CFG, α, β ∈ (V ∪ Σ)*, X ∈ V and w ∈ Σ*.
If αXβ ⇒* w, then X ⇒* w′ for some w′ ∈ Σ*.
Proof. 
∃ n ≥ 1 such that αXβ ⇒^n w.
Expan(αXβ, n) = w.
Expan(α, n) Expan(X, n) Expan(β, n) = w (Proposition 2.28)
Expan(X, n) = w′ for some w′ ∈ Σ*.
X ⇒* w′ for some w′ ∈ Σ* (Proposition 2.25)
This completes the proof of Proposition 2.29.
Example 2.30. Prove that the non-regular set A = {a^n b^n | n ≥ 0} is a CFL.
Proof. 
Let G = (V, Σ, R, S) be a CFG such that
V = {S}, Σ = {a, b}, R = {S → aSb, S → ϵ}.
In short form, S → aSb | ϵ.
Claim 1. 
If S ⇒^{n+1} α, n ≥ 0, where α ∈ (V ∪ Σ)*, then
∃ γ ∈ (V ∪ Σ)* such that S ⇒^n γ and γ ⇒^1 α and γ = a^n S b^n.
Claim 1 can be proved by induction on n.
For n = 0, if S ⇒^1 α, by definition, ∃ γ ∈ (V ∪ Σ)* such that S ⇒^0 γ and γ ⇒^1 α.
Therefore, S = γ.
Therefore, γ = a^0 S b^0.
Therefore, the statement is true for n = 0.
Assume the statement is true for n = k, for k ≥ 0.
That is, S ⇒^{k+1} α ⟹ ∃ γ ∈ (V ∪ Σ)* such that S ⇒^k γ & γ ⇒^1 α & γ = a^k S b^k, for k ≥ 0 and α ∈ (V ∪ Σ)*.
For n = k + 1, assume S ⇒^{k+2} α.
By definition, ∃ γ′ ∈ (V ∪ Σ)* such that
S ⇒^{k+1} γ′ and γ′ ⇒^1 α.
By the induction assumption, ∃ γ ∈ (V ∪ Σ)* such that S ⇒^k γ and γ ⇒^1 γ′ and γ = a^k S b^k.
There are only two rules in R, namely S → aSb and S → ϵ.
If we use S → ϵ on γ ⇒^1 γ′, then a^k S b^k ⇒_{S→ϵ} γ′.
By Proposition 2.13 (ii), γ′ = a^k ϵ b^k = a^k b^k.
This contradicts the conclusion γ′ ⇒^1 α derived above, because a^k b^k does not contain a variable.
Therefore, we must use the rule S → aSb.
Therefore, γ ⇒_{S→aSb} γ′.
Therefore, a^k S b^k ⇒_{S→aSb} γ′.
Again by Proposition 2.13 (ii), γ′ = a^k (aSb) b^k = a^{k+1} S b^{k+1}.
This completes the proof of Claim 1.
Claim 2. 
S ⇒^{n+1} a^n b^n ∀ n ≥ 0.
For n = 0, S ⇒_{S→ϵ} ϵ by Proposition 2.13 (i).
Therefore, S ⇒^1 a^0 b^0 and hence the statement is true for n = 0.
For n ≥ 1, by Proposition 2.13 (i) & (ii),
S ⇒_{S→aSb} aSb ⇒_{S→aSb} a^2 S b^2 ⇒_{S→aSb} a^3 S b^3 ⋯ ⇒_{S→aSb} a^n S b^n.
Therefore, S ⇒^n a^n S b^n.
In addition, a^n S b^n ⇒_{S→ϵ} a^n b^n by Proposition 2.13 (ii).
Therefore, S ⇒^{n+1} a^n b^n.
This completes the proof of Claim 2.
It remains to show that L(G) = A.
u ∈ A ⟹ u = a^n b^n for some n ≥ 0
⟹ S ⇒^{n+1} u (by Claim 2)
⟹ u ∈ L(G).
Conversely, if u ∈ L(G), then u ∈ Σ* and
S ⇒^{n+1} u for some n ≥ 0.
∃ γ ∈ (V ∪ Σ)* such that S ⇒^n γ and γ ⇒^1 u and γ = a^n S b^n, by Claim 1.
Since there are only two rules in R, either γ ⇒_{S→aSb} u or γ ⇒_{S→ϵ} u.
γ ⇒_{S→aSb} u
⟹ a^n S b^n ⇒_{S→aSb} u
⟹ a^{n+1} S b^{n+1} = u (Proposition 2.13 (ii)),
a contradiction to u ∈ Σ*.
Therefore, we must use γ ⇒_{S→ϵ} u.
Therefore, a^n S b^n ⇒_{S→ϵ} u.
a^n ϵ b^n = u by Proposition 2.13 (ii).
Therefore, u = a^n b^n and hence u ∈ A.
Combining both directions, L(G) = A.
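The grammar of Example 2.30 is small enough to check mechanically. The sketch below (illustrative only, not part of the paper's formal development; the dictionary encoding and function name are my own) enumerates every sentential form derivable from S, pruning forms whose terminals already exceed a length bound, and collects the terminal strings that remain. For this grammar each form contains at most one variable, so the enumeration terminates.

```python
def language_up_to(rules, start, max_len):
    """All terminal strings of length <= max_len derivable from start.
    Variables are the uppercase symbols; rules maps a variable to the
    list of its rule bodies (the empty string stands for epsilon)."""
    forms = {start}              # sentential forms discovered so far
    terminal_strings = set()
    changed = True
    while changed:
        changed = False
        for form in list(forms):
            # leftmost variable, if any
            idx = next((i for i, c in enumerate(form) if c.isupper()), None)
            if idx is None:      # no variable left: a terminal string
                terminal_strings.add(form)
                continue
            for body in rules[form[idx]]:
                new = form[:idx] + body + form[idx + 1:]
                # prune forms that already carry too many terminals
                if sum(1 for c in new if not c.isupper()) <= max_len and new not in forms:
                    forms.add(new)
                    changed = True
    return terminal_strings
```

Running it with `rules = {"S": ["aSb", ""]}` and bound 6 yields exactly {ϵ, ab, aabb, aaabbb}, in agreement with L(G) = A.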
Before proceeding to the proof of some important theorems in CFG, we need to review some tree terminology from graph theory. Readers are assumed to have some background in the subject matter, and the following facts are stated without proof.
T1. A tree is a directed acyclic graph (DAG).
T2. Trees are collections of nodes and edges.
T3. If ( A ,   B ) is the directed edge from node A to node B , A is called the parent and B is called the child.
T4. A node has at most one parent, drawn above the node and zero or more children, drawn below.
T5. There is one node that has no parent. This node is called the root and appears at the top of the tree. Nodes that have no children are called leaves. Nodes that are not leaves are called interior nodes.
T6. A simple directed path from v_0 to v_n is represented by (v_0, v_1, v_2, …, v_n), where the (v_i, v_{i+1}) with i ∈ {0, 1, 2, …, n−1} are directed edges joining the nodes v_0, v_1, v_2, …, v_n of the tree and v_i ≠ v_j for i ≠ j. The length of the simple directed path is equal to the number of directed edges connecting the nodes v_0, v_1, v_2, …, v_n and is equal to n in this case.
T7. For any two nodes A and B, if there is a simple directed path from A to B, B is a descendant of A and A is an ancestor of B. Since every simple directed path from A to B must pass through a child of A, there is a simple directed path from one of A's children to B.
T8. There is a unique simple directed path from the root to any other node.
T9. Let d(r, l) = the length of the path from the root r to a leaf l. The height of the tree is defined as h = Max{d(r, l) | r = root; l = a leaf}. Therefore, the height of a tree is the length of the longest path from the root to a leaf.
T10. The length of the path from the root to a node v is called the level of v .
T11. The simple directed path from an interior node to a leaf is called a branch. The combination of all branches is the largest subtree with the interior node as the root. The length of any branch is no longer than the height of the subtree which in turn is no longer than the height of the parent tree.
T12. The children of a node are ordered from left to right. If node A is to the left of node B, then all the descendants of A are to the left of all the descendants of B at the same level.
T13. A subtree is a tree of which the vertices and edges are also the vertices and edges of the parent tree. If a subtree has a leaf, the leaf is also a leaf of the parent tree.
Definition 2.31. 
For any context-free grammar, G = ( V ,   Σ ,   R ,   S ) , a parse tree for G is a tree that satisfies the following conditions:
(i)
Each interior node is labeled as a variable in V .
(ii)
Each leaf is labeled either as a variable in V, a terminal in Σ, or ϵ.
(iii)
If an interior node labeled A (a variable) has children X_1, X_2, X_3, …, X_n, where X_i ∈ V ∪ Σ for i ∈ {1, 2, …, n}, then
A → X_1 X_2 X_3 ⋯ X_n is a rule in R.
(iv)
If an interior node labeled A (a variable) has ϵ as a child, then ϵ is the only child of A and A → ϵ is a rule in R.
Note that any subtree of a parse tree is also a parse tree.
Definition 2.32. The yield of a parse tree is the concatenation of all the leaves of the tree from left to right.
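Definition 2.32 can be read off directly from a recursive traversal. In the sketch below (an illustrative encoding, not the paper's notation), a parse tree is a (label, children) pair, a leaf is a bare string, and the empty string stands for ϵ:

```python
def tree_yield(tree):
    """Yield of a parse tree (Definition 2.32): the concatenation of
    the leaf labels from left to right. A tree is a (label, children)
    pair; a leaf is a bare string, and '' stands for epsilon."""
    if isinstance(tree, str):
        return tree
    _label, children = tree
    return "".join(tree_yield(child) for child in children)

# a parse tree for aabb in the grammar S -> aSb | epsilon of Example 2.30
t = ("S", ["a", ("S", ["a", ("S", [""]), "b"]), "b"])
```

Here `tree_yield(t)` returns the string aabb, matching the yield one reads off the leaves of the tree.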
Theorem 2.33. Let G = (V, Σ, R, S) be a CFG. The following statements are equivalent.
(i)
∃ a parse tree with root A ∈ V and yield w ∈ Σ*.
(ii)
A ⇒*_{lm} w, w ∈ Σ*.
(iii)
A ⇒* w, w ∈ Σ*.
Proof. “(i) ⟹ (ii)”
This can be proved by an induction on the height of the tree in statement (i).
Let h (≥ 1) be the height of the parse tree in statement (i).
(h = 1)
The parse tree looks like the following figure.
Figure 2.1. Caption.
By definition of parse tree, A → X_1 X_2 X_3 ⋯ X_n is a rule in R.
By Proposition 2.8(i), A ⇒ X_1 X_2 X_3 ⋯ X_n.
Therefore, A ⇒* X_1 X_2 X_3 ⋯ X_n.
The yield of this tree is X_1 X_2 X_3 ⋯ X_n, which is equal to w by statement (i).
Therefore, A ⇒* w.
Since A is the only variable in the string A, it is therefore also the leftmost variable in the string A.
Therefore, A ⇒*_{lm} w.
Hence, the statement “(i) ⟹ (ii)” is true for h = 1.
“Induction”
Let k be an integer such that k 1 .
Induction Hypothesis:
The statement “(i) ⟹ (ii)” is true for any parse tree with height h if h ≤ k.
Consider now a parse tree Pt(A, w, k+1) that has root A, yield w and a height of k + 1.
This parse tree looks like the following figure.
Figure 2.2. Caption.
∀ i ∈ {1, 2, …, n}, X_i ∈ V ∪ Σ.
There are two cases to examine.
(a)
X_i ∈ Σ
⟹ X_i = w_i for some w_i ∈ Σ.
⟹ X_i ⇒^0 w_i.
⟹ X_i ⇒* w_i.
⟹ X_i ⇒*_{lm} w_i (trivially, since the derivation has zero steps)
Furthermore, since X_i ∈ Σ, X_i = w_i is a leaf.
Therefore, w_i ⊑ w.
(b)
X_i ∈ V
By T11 and T13, the combination of all branches of X_i forms a subtree of Pt(A, w, k+1), and every leaf of the subtree is also a leaf of the parent tree.
Let w_i be the yield of X_i.
By definition of yield, every symbol in w_i is a leaf and therefore a symbol in w.
Therefore, w_i ⊑ w.
Since w ∈ Σ*, w_i ∈ Σ*.
Claim: w = w_1 w_2 ⋯ w_n.
By T12, w_i is to the left of w_j for i < j, since X_i is to the left of X_j.
Therefore, w = x_0 w_1 x_1 w_2 ⋯ w_n x_n where x_0, x_1, …, x_n ∈ Σ*.
Let l be a symbol in w.
l is a leaf in Pt(A, w, k+1) because w is the yield.
By T8, there is a simple directed path from A to l.
By T7, there is a simple directed path from X_i to l for some i ∈ {1, 2, …, n}.
Since l has no children, l must be a leaf descendant of X_i.
Therefore, l is a symbol in w_i because w_i is the yield of the subtree with root X_i.
Therefore, l is a symbol in w ⟹ l is a symbol in w_i for some i ∈ {1, 2, …, n}.
Therefore, w ⊑ w_1 w_2 ⋯ w_n.
Therefore, x_0 w_1 x_1 w_2 ⋯ w_n x_n ⊑ w_1 w_2 ⋯ w_n.
This means that x_0 = x_1 = ⋯ = x_n = ϵ.
Therefore, w = w_1 w_2 ⋯ w_n.
Now, back to the subtree with root X_i and yield w_i.
The height of this subtree = the length of the longest branch in the subtree
= the length of a simple directed path in the parent tree from X_i to a leaf l
= (the length of a simple directed path in the parent tree from A to a leaf l) − 1 (by T7 and the fact that X_i is a child of A)
≤ the height of the parent tree − 1 = k + 1 − 1 = k.
By the induction hypothesis, X_i ⇒*_{lm} w_i.
Combining (a) & (b), we now have X_i ⇒*_{lm} w_i for all i ∈ {1, 2, …, n} and w = w_1 w_2 ⋯ w_n.
For the parent tree Pt(A, w, k+1),
A ⇒ X_1 X_2 X_3 ⋯ X_n (Proposition 2.8(i))
A ⇒_{lm} X_1 X_2 X_3 ⋯ X_n (A is the only variable in the head)
Since X_i ⇒*_{lm} w_i and by Proposition 2.17,
X_1 X_2 X_3 ⋯ X_n ⇒*_{lm} w_1 w_2 ⋯ w_n.
Therefore, A ⇒_{lm} X_1 X_2 X_3 ⋯ X_n ⇒*_{lm} w_1 w_2 ⋯ w_n.
⟹ A ⇒*_{lm} w_1 w_2 ⋯ w_n.
Since w = w_1 w_2 ⋯ w_n, A ⇒*_{lm} w.
The statement “(i) ⟹ (ii)” is true for h = k + 1.
This completes the proof of “(i) ⟹ (ii)”.
“(ii) ⟹ (iii)”
The proof of this statement is trivial because every leftmost derivation is a derivation.
“(iii) ⟹ (i)”
Since A ⇒* w, ∃ n ≥ 1 such that A ⇒^n w. (Note that n ≠ 0 because A ∈ V and w ∈ Σ*.)
The proof of this statement, “(iii) ⟹ (i)”, is by induction on n.
(n = 1)
∃ w_1, w_2, …, w_m ∈ Σ such that w = w_1 w_2 ⋯ w_m & A ⇒ w_1 w_2 ⋯ w_m.
By Proposition 2.8(i), A → w_1 w_2 ⋯ w_m is a rule in R.
The following is a parse tree with root A and yield w .
Figure 2.3. Caption.
Therefore, the statement is true for n = 1 .
(Induction)
Induction Hypothesis:
Let k be an integer such that k 1 .
For any n ≤ k, if A ⇒^n w, then ∃ a parse tree with root A and yield w.
Now, consider n = k + 1.
If A ⇒^{k+1} w,
∃ u_1, u_2, …, u_k ∈ (V ∪ Σ)* such that
A ⇒ u_1 ⇒ u_2 ⇒ ⋯ ⇒ u_k ⇒ w.
∃ X_1, X_2, …, X_m ∈ V ∪ Σ such that u_1 = X_1 X_2 ⋯ X_m.
Therefore, X_1 X_2 ⋯ X_m ⇒ u_2 ⇒ ⋯ ⇒ u_k ⇒ w.
By Proposition 2.28(ii),
X_i ⇒^{n_i} w_i with n_i ≤ k and w_1 w_2 ⋯ w_m = w.
By the induction hypothesis, ∃ a parse tree with root X_i and yield w_i, which looks like the following figure.
Figure 2.4. Caption.
We can now construct a parse tree Pt(A, w, k+1) as follows.
(1)
Start with a one-level parse tree that has root A and yield X_1 X_2 ⋯ X_m, as shown in the following figure.
Figure 2.5. Caption.
(2)
For each i ∈ {1, 2, …, m}, if X_i ∈ Σ, then X_i = w_i for some w_i ∈ Σ.
If X_i ∈ V, attach the parse tree shown in Figure 2.4 to the parse tree shown in Figure 2.5. The resulting tree would look like the following figure.
P t ( A , w , k + 1 )
Figure 2.6. Caption.
Clearly, this tree (Pt(A, w, k+1)) with root A is a parse tree, since the one-level tree and all the subtrees with root X_i and yield w_i are parse trees.
In addition, since w_1 w_2 ⋯ w_m = w, the yield of this parse tree is w.
Therefore, the statement “(iii) ⟹ (i)” is true for n = k + 1.
This completes the proof of “(iii) ⟹ (i)” and also the proof of Theorem 2.33.

2.2. Chomsky Normal Form (CNF)

Definition 2.34. 
Let G = ( V ,   Σ ,   R ,   S ) be a C F G .
G is in Chomsky normal form if every rule of G is of one of the following forms:
A → BC where A ∈ V and B, C ∈ V \ {S}
A → a where a ∈ Σ
S → ϵ where S = the start variable
Lemma 2.35. 
For every CFG G = (V, Σ, R, S), there is a CFG G′ with no ϵ-rule (A → ϵ where A ≠ S) or unit rule (A → B where A, B ∈ V) such that L(G) = L(G′).
Proof. 
We can inductively construct a new set of rules R′ using the following procedure:
(i)
Copy all the rules in R to R ' .
(ii)
If B ≠ S, A → αBβ and B → ϵ are in R′, create A → αβ in R′.
(iii)
If A → B and B → γ are in R′, create A → γ in R′.
We can further assume that R′ is the smallest of all the sets that can be thus created, because we can always rename the smallest one to R′, knowing that the minimum exists.
Let G ' = ( V , Σ , R ' , S ) .
It is clear from the construction that R ⊆ R′.
Therefore, every derivation in G is a derivation in G′ and hence L(G) ⊆ L(G′).
On the other hand, every new rule created in G′ is equivalent to the two rules it is created from, by Proposition 2.15; therefore, every derivation in G′ can be simulated by either the same rules or equivalent rules in G.
Hence, L(G′) ⊆ L(G).
It remains to show that all the ϵ-rules and unit rules in G′ are redundant for the production of any x ∈ L(G′).
Since L(G′) = {x ∈ Σ* | S ⇒*_{G′} x}, knowing that minimum derivations exist, we can assume every derivation of x ∈ L(G′) is one of minimum length.
Claim 1. 
Any derivation S ⇒*_{G′} x does not use an ϵ-rule.
Proof of Claim 1. 
Assume for contradiction that B → ϵ, where B ≠ S, is used at some point of the derivation.
S ⇒*_{G′} x can be rewritten as
S ⇒*_{G′} γBδ ⇒^1_{G′} γδ ⇒*_{G′} x where γ, δ ∈ (V ∪ Σ)*.
This B must have been generated at an earlier point of the derivation, in the form of
ηAθ ⇒^1_{G′} ηαBβθ where η, α, β, θ ∈ (V ∪ Σ)*.
Therefore, S ⇒*_{G′} x can be further rewritten as
S ⇒^m_{G′} ηAθ ⇒^1_{G′} ηαBβθ ⇒^n_{G′} γBδ ⇒^1_{G′} γδ ⇒^k_{G′} x where k, m, n ≥ 0.
(Note that ηαBβθ ⇒^n_{G′} γBδ is a derivation in which the rule in each step does not originate from this particular B.)
Since A → αBβ and B → ϵ are in R′, by construction (ii), A → αβ is in R′.
Therefore, ηAθ ⇒^1_{G′} ηαβθ is a valid production in G′.
Furthermore, since ηαBβθ ⇒^n_{G′} γBδ, by Proposition 2.19, we can substitute ϵ for B to obtain the following valid production in G′:
ηαβθ ⇒^n_{G′} γδ.
If we apply these two new productions at the corresponding points of the original derivation of x, we have the following valid derivation:
S ⇒^m_{G′} ηAθ ⇒^1_{G′} ηαβθ ⇒^n_{G′} γδ ⇒^k_{G′} x.
We note that this new derivation of x has a length of k + m + n + 1, which is shorter than the original one of k + m + n + 2.
This contradicts the assumption that the original derivation is of minimum length.
Claim 2. Any derivation S ⇒*_{G′} x does not use a unit rule.
Proof of Claim 2. Assume for contradiction that a unit rule A → B is used at some point of the derivation S ⇒*_{G′} x.
We can rewrite this derivation as
S ⇒*_{G′} αAβ ⇒^1_{G′} αBβ ⇒*_{G′} x.
This B must eventually be eliminated before reaching the final product x ∈ Σ*, and the production needed for eliminating B is:
ηBθ ⇒^1_{G′} ηγθ where B → γ is a rule in G′.
We can now rewrite S ⇒*_{G′} x as
S ⇒^m_{G′} αAβ ⇒^1_{G′} αBβ ⇒^n_{G′} ηBθ ⇒^1_{G′} ηγθ ⇒^k_{G′} x.
Since A → B and B → γ are rules in R′, A → γ is a rule in R′ by construction (iii).
αAβ ⇒^1_{G′} αγβ is a valid production in G′.
Furthermore, since αBβ ⇒^n_{G′} ηBθ, by Proposition 2.19, we can substitute γ for B to obtain the following valid production:
αγβ ⇒^n_{G′} ηγθ.
By applying these two new productions at the corresponding points of the derivation of x, we have the following derivation:
S ⇒^m_{G′} αAβ ⇒^1_{G′} αγβ ⇒^n_{G′} ηγθ ⇒^k_{G′} x.
This new derivation has a length of k + m + n + 1, which is shorter than the original one of k + m + n + 2.
This contradicts the assumption that the original given derivation of x is of minimum length.
Combining Claim 1 and Claim 2, we can conclude Lemma 2.35.
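The ϵ-rule part of the construction in Lemma 2.35 can be sketched as a fixed-point computation: keep adding rule bodies with occurrences of nullable variables deleted (construction (ii)) until nothing new appears, then drop the ϵ-rules themselves. The encoding (bodies as tuples, ϵ as the empty tuple) and the function name are illustrative, not the paper's notation:

```python
from itertools import combinations

def eliminate_epsilon_rules(rules, start):
    """For every body containing occurrences of a nullable variable B
    (one with B -> epsilon, B != start), add every variant of the body
    with some of those occurrences deleted; finally drop the epsilon
    rules themselves (cf. construction (ii) of Lemma 2.35)."""
    rules = {head: set(bodies) for head, bodies in rules.items()}
    changed = True
    while changed:
        changed = False
        nullable = {h for h, bs in rules.items() if () in bs and h != start}
        for head in rules:
            for body in list(rules[head]):
                spots = [i for i, s in enumerate(body) if s in nullable]
                for r in range(1, len(spots) + 1):
                    for drop in combinations(spots, r):
                        new = tuple(s for i, s in enumerate(body)
                                    if i not in drop)
                        if new not in rules[head]:
                            rules[head].add(new)
                            changed = True
    for head in rules:          # no epsilon rule except for the start variable
        if head != start:
            rules[head].discard(())
    return {head: sorted(bodies) for head, bodies in rules.items()}
```

Applied to the grammar of Example 2.38 (after adding S_0 → S), this reproduces the rule set obtained there in Steps 2 and 3, including the redundant S → S that the example removes afterwards.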
We now examine a method for converting a CFG into one in Chomsky normal form.
Definition 2.36 (The Method (M)).
From every CFG, G = (V, Σ, R, S), that doesn't have ϵ-rules (except possibly S → ϵ) or unit rules, we can construct a CFG, G′ = (V′, Σ, R′, S), using a method called Method (M) as described in the following steps:
Step 1
For every a ∈ Σ, create a variable U_a and a rule U_a → a. Note that U_a is a newly and uniquely created variable such that U_a ∉ V and U_a ≠ U_b for any a, b ∈ Σ such that a ≠ b.
Step 2
∀ r ∈ R, r can be expressed as A → u_1 u_2 ⋯ u_k where A ∈ V, u_1, u_2, …, u_k ∈ V ∪ Σ & k ≥ 0. Create a set of rules (called P(r)) and a set of variables (called V(r)) according to the following steps:
(i)
For k = 0
r becomes A → ϵ.
Since R doesn't have any ϵ-rule except S → ϵ, A must be equal to S and r becomes S → ϵ.
Copy S → ϵ into P(r).
In this case, P(r) = {S → ϵ} = {r} and V(r) = ∅.
(ii)
For k = 1
r becomes A → u_1.
Since R doesn't have any unit rule, u_1 ∈ Σ.
Copy r into P(r).
In this case, P(r) = {A → u_1} = {r} and V(r) = ∅.
(iii)
For k = 2
r becomes A → u_1 u_2.
If u_1, u_2 ∈ V, copy r into P(r). In this case, P(r) = {A → u_1 u_2} = {r} and V(r) = ∅.
If u_1 ∈ Σ & u_2 ∈ V, create A → U_{u_1} u_2 and add this rule and U_{u_1} → u_1 to P(r). Add U_{u_1} to V(r).
(Note that U_{u_1} → u_1 was created in Step 1 above.)
In this case, P(r) = {A → U_{u_1} u_2, U_{u_1} → u_1} and V(r) = {U_{u_1}}.
If u_1 ∈ V & u_2 ∈ Σ, create A → u_1 U_{u_2} and add this rule and U_{u_2} → u_2 to P(r). Add U_{u_2} to V(r).
(Note that U_{u_2} → u_2 was created in Step 1 above.)
In this case, P(r) = {A → u_1 U_{u_2}, U_{u_2} → u_2} and V(r) = {U_{u_2}}.
If both u_1, u_2 ∈ Σ, create A → U_{u_1} U_{u_2} and add it along with U_{u_1} → u_1, U_{u_2} → u_2 to P(r). Add U_{u_1}, U_{u_2} to V(r).
(Note that U_{u_1} → u_1 and U_{u_2} → u_2 were created in Step 1 above.)
In this case, P(r) = {A → U_{u_1} U_{u_2}, U_{u_1} → u_1, U_{u_2} → u_2} and V(r) = {U_{u_1}, U_{u_2}}.
(iv)
For k ≥ 3
Figure 2.7. Caption.
As depicted by the above figure, create the following rules and add them to P(r).
A → U_1 A_1
A_1 → U_2 A_2
A_2 → U_3 A_3
⋮
A_i → U_{i+1} A_{i+1}
⋮
A_{k−2} → U_{k−1} U_k, where A_1, A_2, …, A_{k−2} are variables newly and uniquely created for each r and, therefore, are not in V.
For any i ∈ {1, 2, 3, …, k}, if u_i ∈ V, set U_i = u_i; if u_i ∈ Σ, set U_i = U_{u_i} and add U_{u_i} → u_i to P(r). Add U_{u_i} to V(r).
(Note that U_{u_i} → u_i for each u_i ∈ Σ was created in Step 1 above.)
In this case, P(r) includes all the rules:
A → U_1 A_1
A_1 → U_2 A_2
A_2 → U_3 A_3
⋮
A_i → U_{i+1} A_{i+1}
⋮
A_{k−2} → U_{k−1} U_k, together with the rules U_{u_i} → u_i for any u_i ∈ Σ, whereas
V(r) = {U_{u_i} | u_i ∈ Σ} ∪ {A_i | i = 1, 2, …, k−2}.
Step 3
Set
V′ = V ∪ ⋃_{r∈R} V(r)
and
R′ = ⋃_{r∈R} P(r).
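Step 2(iv) of Method (M), which breaks a body of length k ≥ 3 into k − 1 binary rules through the fresh variables A_1, …, A_{k−2}, can be sketched as follows (the `fresh` name supplier is a hypothetical helper, and the tuple encoding of rules is my own):

```python
def split_long_rule(head, body, fresh):
    """Step 2(iv) of Method (M): turn A -> U1 U2 ... Uk (k >= 3) into
    A -> U1 A1, A1 -> U2 A2, ..., A_{k-2} -> U_{k-1} U_k,
    where fresh() supplies a new variable name on each call."""
    rules = []
    left = head
    for i in range(len(body) - 2):
        new_var = fresh()
        rules.append((left, (body[i], new_var)))
        left = new_var
    # the last created variable takes the final two symbols as its body
    rules.append((left, (body[-2], body[-1])))
    return rules
```

A body of length k produces exactly k − 1 rules, each with a two-symbol body, which is what Chomsky normal form requires once the terminal symbols have been replaced by their U_{u_i} variables.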
We note the following properties of the rules created by Method (M):
N1. All the rules in R′ are in Chomsky normal form.
N2. For any r′ ∈ R′, there exists r ∈ R such that r′ ∈ P(r). Furthermore, P(r_1) ≠ P(r_2) for any r_1, r_2 ∈ R such that r_1 ≠ r_2.
N3. For any r ∈ R, r is equivalent to the rules in P(r) by Proposition 2.15.
N4. V and ⋃_{r∈R} V(r) are disjoint. That is, V ∩ ⋃_{r∈R} V(r) = ∅.
N5. For any r′ ∈ P(r), either Head(r′) = Head(r) or Head(r′) ∉ V. Or equivalently, Head(r′) ∈ V ⟹ Head(r′) = Head(r).
N6. ∀ r′ ∈ P(r), if |Body(r′)| = 2, then r′ is unique to P(r). That is,
r′ ∉ P(r_1) for any r_1 ∈ R such that r ≠ r_1.
N7. If k = 0, i.e., |Body(r)| = 0, P(r) = {S → ϵ} = {r} and V(r) = ∅.
N8. If k = 1, i.e., |Body(r)| = 1, P(r) = {A → u_1} = {r} and V(r) = ∅, where u_1 ∈ Σ.
We now have the following theorem.
Theorem 2.37. Every context-free language is generated by a CFG in Chomsky normal form (CNF).
Proof. Since every context-free language is generated by a CFG, we need to show that every CFG can be converted to an equivalent CFG in Chomsky normal form.
Also, because of Lemma 2.35, we can start with a CFG that has no ϵ-rule (A → ϵ where A ≠ S) or unit rule (A → B where A, B ∈ V).
Let G = (V, Σ, R, S) be the CFG that has no ϵ-rule or unit rule except S → ϵ.
Let G′ = (V′, Σ, R′, S) be a CFG constructed from G by use of Method (M).
In the following, we shall show L(G) = L(G′) by showing x ∈ L(G) ⟺ x ∈ L(G′) ∀ x ∈ Σ*.
“⟹” (If x ∈ L(G))
S ⇒*_G x.
∃ r_1, r_2, …, r_i, …, r_n, r_{n+1} ∈ R and u_1, u_2, …, u_i, …, u_n ∈ (V ∪ Σ)* such that
S ⇒_{r_1, G} u_1 ⇒_{r_2, G} u_2 ⋯ u_{i−1} ⇒_{r_i, G} u_i ⋯ ⇒_{r_n, G} u_n ⇒_{r_{n+1}, G} x.
By N3, for any i ∈ {1, 2, …, n+1}, r_i is equivalent to a sequence of rules from P(r_i), which is a subset of R′.
Therefore, S ⇒*_{P(r_1), G′} u_1 ⇒*_{P(r_2), G′} u_2 ⋯ u_{i−1} ⇒*_{P(r_i), G′} u_i ⋯ ⇒*_{P(r_n), G′} u_n ⇒*_{P(r_{n+1}), G′} x.
Note that u_1, u_2, …, u_i, …, u_n ∈ (V′ ∪ Σ)* because V ⊆ V′.
Therefore, S ⇒*_{G′} x.
Therefore, x ∈ L(G′).
“⟸” (If x ∈ L(G′))
S ⇒*_{G′} x.
By Theorem 2.33, ∃ a parse tree (in G′) with root S ∈ V′ and yield x ∈ Σ*.
Let's call this parse tree (T′).
By definition of parse tree, S and its children must be the head and body of a rule in R′.
Let's call this rule r′; hence Head(r′) = S.
By N1, r′ must be in one of the following forms:
  • S → ϵ
  • A → a where a ∈ Σ, A ∈ V′
  • A → U_1 U_2 where A ∈ V′, U_1, U_2 ∈ V′ \ {S}
If r′ is S → ϵ, ϵ is the only child of S.
Since ϵ has no children and x is a descendant of S, this is possible only if ϵ = x.
Furthermore, by the construction of (M), S → ϵ in R′ is created from S → ϵ in R.
Therefore, S → ϵ is also a rule in R.
Therefore, S ⇒^1_G ϵ. (Proposition 2.8(i))
Therefore, S ⇒^1_G x.
Therefore, S ⇒*_G x.
Therefore, x ∈ L(G).
If r′ is A → a, since S = Head(r′), S = A.
Therefore, r′ is S → a and S has only one child, which is a.
Since x is a descendant of S and a has no children, a = x.
By the construction of (M), A → a in R′ is created from A → a in R.
Therefore, A → a is also a rule in R.
Therefore, S → x is a rule in R.
Therefore, S ⇒^1_G x. (Proposition 2.8(i))
Therefore, S ⇒*_G x.
Therefore, x ∈ L(G).
If r′ is A → U_1 U_2 where A ∈ V′, U_1, U_2 ∈ V′ \ {S}:
Since Head(r′) = S and S ∈ V, Head(r′) ∈ V.
Since Head(r′) = A, A = S.
Therefore, r′ becomes S → U_1 U_2 where U_1, U_2 ∈ V′ \ {S}.
By N2, ∃ r ∈ R such that r′ ∈ P(r).
Let r be A′ → u_1 u_2 ⋯ u_k where A′ ∈ V, u_1, u_2, …, u_k ∈ V ∪ Σ.
By N5, Head(r′) ∈ V ⟹ Head(r′) = Head(r).
Therefore, S = A′.
Therefore, r becomes S → u_1 u_2 ⋯ u_k.
We now analyze the different situations for different values of k .
If k = 0, r becomes S → ϵ.
By the construction of (M), P(r) = {S → ϵ}.
Since r′ ∈ P(r), r′ is S → ϵ.
This contradicts the underlying assumption that r′ is A → U_1 U_2 where A ∈ V′, U_1, U_2 ∈ V′ \ {S}.
Therefore, k cannot be 0.
If k = 1, r becomes S → u_1.
Since R doesn't have any unit rule, u_1 ∈ Σ.
By the construction of (M), P(r) = {S → u_1}.
Therefore, r′ is S → u_1 where u_1 ∈ Σ.
This contradicts the underlying assumption that r′ is A → U_1 U_2 where A ∈ V′, U_1, U_2 ∈ V′ \ {S}.
Therefore, k cannot be 1.
Therefore, we can exclude the cases of k ∈ {0, 1} under the assumption that r′ is A → U_1 U_2 where A ∈ V′, U_1, U_2 ∈ V′ \ {S}.
If k = 2, r becomes S → u_1 u_2.
By the construction of (M), P(r) is one of the following:
(i)
P(r) = {S → u_1 u_2} if u_1, u_2 ∈ V
(ii)
P(r) = {S → U_{u_1} u_2, U_{u_1} → u_1} if u_1 ∈ Σ & u_2 ∈ V
(iii)
P(r) = {S → u_1 U_{u_2}, U_{u_2} → u_2} if u_1 ∈ V & u_2 ∈ Σ
(iv)
P(r) = {S → U_{u_1} U_{u_2}, U_{u_1} → u_1, U_{u_2} → u_2} if u_1, u_2 ∈ Σ
For (i), r′ is S → u_1 u_2.
In this case, r and r′ are the same, and the sub parse tree in (T′) with root S and children u_1, u_2, as shown on the right of the following figure, can be replaced by a parse tree in G with the same root and children, as shown on the left.
Figure 2.8. Caption.
For (ii), r′ is either S → U_{u_1} u_2 or U_{u_1} → u_1.
However, since Head(r′) = S, which is in V, and U_{u_1} ∉ V, r′ cannot be U_{u_1} → u_1.
⟹ r′ must be S → U_{u_1} u_2.
By N3, S → u_1 u_2 is equivalent to S → U_{u_1} u_2 and U_{u_1} → u_1.
We have the following equivalent parse trees with the same root and yield.
Figure 2.9. Caption.
The one on the left is a parse tree in G whose root and its children are the head and body of a rule in R whereas the one on the right is a sub parse tree of ( T ' ) .
Therefore we can replace a sub parse tree of ( T ' ) with an equivalent parse tree in G whose root and yield are the head and body of a rule in R .
For (iii), by a similar argument, we have the following equivalent parse trees with the same root and yield.
Figure 2.10. Caption.
The one on the left is a parse tree in G whose root and its children are the head and body of a rule in R whereas the one on the right is a sub parse tree of ( T ' ) .
Therefore we can replace a sub parse tree of ( T ' ) with an equivalent parse tree in G whose root and yield are the head and body of a rule in R .
For (iv), by a similar argument, we have the following equivalent parse trees with the same root and yield.
Figure 2.11. Caption.
The one on the left is a parse tree in G whose root and its children are the head and body of a rule in R whereas the one on the right is a sub parse tree of ( T ' ) .
Therefore we can replace a sub parse tree of ( T ' ) with an equivalent parse tree in G whose root and yield are the head and body of a rule in R .
If k ≥ 3, r is S → u_1 u_2 ⋯ u_k.
P(r) consists of the following rules:
S → U_1 A_1
A_1 → U_2 A_2
A_2 → U_3 A_3
⋮
A_i → U_{i+1} A_{i+1}
⋮
A_{k−2} → U_{k−1} U_k
U_{u_i} → u_i if u_i ∈ Σ, ∀ i ∈ {1, 2, 3, …, k}, where U_i = u_i if u_i ∈ V and U_i = U_{u_i} if u_i ∈ Σ.
Since Head(r′) = S, r′ is S → U_1 A_1.
Since (T′) is a parse tree of G′, by the definition of parse tree,
U_1, A_1 are children of S.
U_2, A_2 are children of A_1.
U_3, A_3 are children of A_2.
⋮
U_{k−1}, U_k are children of A_{k−2}.
By N3, r is equivalent to the sequence of rules contained in P ( r ) .
Therefore, we have the following equivalent parse trees with the same root and yield.
Figure 2.12. Caption.
The one on the left is a parse tree in G whose root and its children are the head and body of a rule in R whereas the one on the right is a sub parse tree of ( T ' ) .
Therefore we can replace a sub parse tree of ( T ' ) with an equivalent parse tree in G whose root and yield are the head and body of a rule in R .
Combining all cases, we conclude that there is a sub parse tree in (T′) with root S that can be replaced by an equivalent parse tree in G whose root and yield are the head and body of a rule in R.
We can write this rule in R as S → u_1 u_2 ⋯ u_k where k ≥ 0 and u_i ∈ V ∪ Σ for i ∈ {1, 2, …, k}.
(a) If all u_i's are terminals
In this case, u_1 u_2 ⋯ u_k = x, the yield of the parent tree (T′).
The reason is that a leaf of a subtree is also a leaf of the parent tree.
Therefore, u_i ⊑ x ∀ i ∈ {1, 2, …, k}.
On the other hand, if l is a leaf in x, there is a simple directed path from S to l. This simple directed path must pass through one of the nodes u_1, u_2, …, u_k, because u_1 u_2 ⋯ u_k is the yield of a sub parse tree in (T′), which is obtained by branching out from S in all possible directions.
Therefore, l must be one of the nodes u_1, u_2, …, u_k.
After replacement, we now have a new tree which is a parse tree in G , and furthermore, the root and yield of this tree are respectively S and x .
By Theorem 2.33, S ⇒*_G x.
Therefore, x L ( G ) .
(b) If some u_i's are variables
For each u_i that is a variable, we can repeat the above replacement process to replace the sub parse tree (with root u_i) in (T′) with a parse tree in G whose root (u_i) and the root's children are the head and body of a rule in R.
Since every time we do a replacement, we get down to a lower level of ( T ' ) and since the height of ( T ' ) and the number of subtrees of ( T ' ) are both finite, this process of replacement must come to a stop after a finite number of operations. When this happens, we have a new tree in which every interior node and its children are the head and body of a rule in R . This means that the new tree thus created is a parse tree in G .
Furthermore, this replacement process only affects the nodes which are variables. Therefore, the yield of ( T ' ) , namely x , is untouched and remains at the bottom after the replacement is complete.
This means that x is also the yield of the newly created tree.
We now have a new tree with root S and yield x and the tree is also a parse tree in G .
By Theorem 2.33, S ⇒*_G x.
Therefore, x L ( G ) .
Combining (a) and (b), we complete the proof of Theorem 2.37.
On the basis of Theorem 2.37 and the results proved in Lemma 2.35, we can now develop a set of operational rules for the conversion of a CFG to one in CNF.
Let G = (V, Σ, R, S) be the CFG to be converted.
Let G′ = (V′, Σ, R′, S_0) be the CFG to be created in CNF.
CR1.
Create S_0 → S and add it to R′.
(Note that this creation will ensure that the start variable does not occur on the right-hand side of a rule.)
CR2.
(Elimination of ϵ-rules)
If a rule B → ϵ is in R, do the following:
(i)
For every rule in R of the form A → u_1 B u_2 B u_3 B u_4 ⋯ u_{n−1} B u_n B u_{n+1}:
(1)
For each single occurrence of B on the RHS, create a rule with that occurrence deleted and add it to R′.
For example, A → u_1 u_2 B u_3 B u_4 ⋯ u_{n−1} B u_n B u_{n+1},
A → u_1 B u_2 u_3 B u_4 ⋯ u_{n−1} B u_n B u_{n+1},
A → u_1 B u_2 B u_3 B u_4 ⋯ u_{n−1} B u_n u_{n+1}.
(2)
For each group occurrence of 2 B's on the RHS, create a rule with that group occurrence deleted and add it to R′.
For example, A → u_1 u_2 u_3 B u_4 ⋯ u_{n−1} B u_n B u_{n+1},
A → u_1 u_2 B u_3 u_4 ⋯ u_{n−1} B u_n B u_{n+1},
A → u_1 B u_2 B u_3 B u_4 ⋯ u_{n−1} u_n u_{n+1}.
⋮
(n) For each group occurrence of n B's on the RHS, create a rule with that group occurrence deleted and add it to R′.
For example, A → u_1 u_2 u_3 u_4 ⋯ u_{n−1} u_n u_{n+1}.
(ii)
Repeat (i) until all rules of the form B → ϵ are eliminated.
CR3. (Elimination of unit rules)
If rules A → B and B → u are in R, do the following:
(i)
Create A → u and add it to R′.
(ii)
Copy B → u to R′.
(iii)
Do not copy A → B to R′.
(iv)
Repeat (i) and (ii) until all unit rules of the form A → B are eliminated.
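CR3 can be sketched as a small fixed-point loop: while some rule A → B with B a variable remains, delete it and copy B's bodies over to A. The tuple encoding of rule bodies and the function name are illustrative only:

```python
def eliminate_unit_rules(rules, variables):
    """CR3 as a fixed point: while some unit rule A -> B remains, delete
    it and add every body of B to A (skipping the trivial A -> A)."""
    rules = {head: set(bodies) for head, bodies in rules.items()}
    changed = True
    while changed:
        changed = False
        for head in list(rules):
            for body in list(rules[head]):
                if len(body) == 1 and body[0] in variables:  # unit rule
                    rules[head].discard(body)
                    for b in list(rules.get(body[0], ())):
                        if b != (head,) and b not in rules[head]:
                            rules[head].add(b)
                    changed = True
    return {head: sorted(bodies) for head, bodies in rules.items()}
```

Chains such as A → B, B → C, C → c are resolved over several passes of the loop, ending with A → c, which mirrors the repeated application of CR3 in Steps 5 through 7 of Example 2.38.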
CR4. (Conversion of remaining rules)
Every remaining rule in R has the form A → u_1 u_2 ⋯ u_k, where each u_i ∈ V ∪ Σ for i ∈ {1, 2, …, k}.
Create in R′ the following sequence of rules and add the corresponding created variables to V′:
A → U_1 A_1
A_1 → U_2 A_2
A_2 → U_3 A_3
⋮
A_{k−2} → U_{k−1} U_k
where U_i = u_i if u_i ∈ V; if u_i ∈ Σ, add U_i → u_i to R′.
Example 2.38. Let G = ( V , Σ , R , S ) be the C F G consisting of the following rules:
S → ASA | aB
A → B | S
B → b | ϵ
Convert G to G′ = (V′, Σ, R′, S_0) in CNF.
Step 1. (Applying CR1.)
S_0 → S
S → ASA | aB
A → B | S
B → b | ϵ
Step 2. (Removing B → ϵ using CR2)
S_0 → S
S → ASA | aB | a
A → B | S | ϵ
B → b
Step 3. (Removing A → ϵ using CR2)
S_0 → S
S → ASA | aB | a | SA | AS | S
A → B | S
B → b
Step 4. (Removing S → S because of redundancy)
S_0 → S
S → ASA | aB | a | SA | AS
A → B | S
B → b
Step 5. (Removing S_0 → S using CR3)
S_0 → ASA | aB | a | SA | AS
S → ASA | aB | a | SA | AS
A → B | S
B → b
Step 6. (Removing A → B using CR3)
S_0 → ASA | aB | a | SA | AS
S → ASA | aB | a | SA | AS
A → b | S
B → b
Step 7. (Removing A → S using CR3)
S_0 → ASA | aB | a | SA | AS
S → ASA | aB | a | SA | AS
A → b | ASA | aB | a | SA | AS
B → b
Step 8. (Conversion of remaining rules into CNF)
Since S_0 → ASA becomes S_0 → AA_1, A_1 → SA, and S_0 → aB becomes S_0 → UB, U → a;
S → ASA becomes S → AA_1, A_1 → SA, and S → aB becomes S → UB, U → a; and
A → ASA becomes A → AA_1, A_1 → SA, and A → aB becomes A → UB, U → a,
the rules in R′ now become
S_0 → AA_1 | UB | a | SA | AS
S → AA_1 | UB | a | SA | AS
A → b | AA_1 | UB | a | SA | AS
B → b
A_1 → SA
U → a
Example 2.39. 
Convert S → aSb | ϵ to CNF, where S ∈ V and a, b ∈ Σ, and show that there is more than one way of deriving the string a^2 b^2 using the rules in CNF.
Conversion of rules.
S → aSb | ϵ
S → aSb | ab
S → ASB | AB; A → a; B → b
S → AC | AB; C → SB; A → a; B → b
Derivation of a^2 b^2
There is more than one way of deriving the string a^2 b^2. Below are a few examples.
(i)
S S A C A C C S B A S B A a a S B B b a S b S A B a A B b A a a a B b B b a a b b .
(ii)
S S A C A C C S B A S B S A B A A B B A a a A B B A a a a B B B b a a b B B b a a b b .
(iii)
S S A C A C C S B A S B S A B A A B B B b A A B b B b A A b b A a A a b b A a a a b b .
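Membership of a²b² in the converted grammar can also be checked mechanically. Below is a minimal CYK membership test (an algorithm not introduced in this section, used here only to verify the example) for the CNF rules S → AC | AB, C → SB, A → a, B → b:

```python
from itertools import product

# CNF grammar of Example 2.39: S -> AC | AB, C -> SB, A -> a, B -> b
RULES = {("A", "C"): {"S"}, ("A", "B"): {"S"}, ("S", "B"): {"C"}}
TERM = {"a": {"A"}, "b": {"B"}}

def cyk(word, start="S"):
    """CYK membership test for a grammar in Chomsky Normal Form."""
    n = len(word)
    if n == 0:
        return False
    # table[i][j] = set of variables deriving word[i : i + j + 1]
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, ch in enumerate(word):
        table[i][0] = set(TERM.get(ch, set()))
    for length in range(2, n + 1):          # substring length
        for i in range(n - length + 1):     # start position
            for split in range(1, length):  # split point
                for x, y in product(table[i][split - 1],
                                    table[i + split][length - split - 1]):
                    table[i][length - 1] |= RULES.get((x, y), set())
    return start in table[0][n - 1]

print(cyk("aabb"))   # True: a^2 b^2 is derivable
print(cyk("aab"))    # False
```

Note that ϵ is correctly rejected: the conversion above loses the rule S → ϵ, so the CNF grammar generates {aⁿbⁿ | n ≥ 1}.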

2.3. Pushdown Automata ( P D A )

Pushdown automata are another kind of nondeterministic computation model. They are similar to nondeterministic finite automata except that they have an extra component called a stack, which provides additional memory beyond what is available in a finite automaton.
Pushdown automata are equivalent in power to context-free grammars, which will be proved later. In addition to reading symbols from the input alphabet Σ, a PDA also reads and writes symbols on the stack. Writing and reading on the stack must be done at the top. Either the input symbol or the stack symbol can be ϵ, thereby allowing the machine to move without actually reading or writing. Upon reading a symbol from the input, the PDA makes one of the following moves on the stack before entering the next state:
(i) Replace
Replace the symbol at the top of the stack with another symbol. This move is referred to as the “Replace” move.
(ii) Push
Add a symbol to the top of the stack. This move is referred to as the “Push” move.
(iii) Pop
Erase or remove a symbol from the top of the stack. This move is referred to as the “Pop” move.
(iv) Untouched
Do nothing to change the stack. This move is referred to as the “Untouched” move.
A P D A is formally defined as follows.
Definition 2.40. 
A PDA is a 7-tuple M = (Q, Σ, Γ, δ, q₀, ⊥, F), where Q, Σ, Γ and F are finite sets such that
(a) Q is the set of states;
(b) Σ is the input alphabet;
(c) Γ is the stack alphabet;
(d) δ : Q × Σ_ϵ × Γ_ϵ → P(Q × Γ_ϵ) is the transition function;
(e) q₀ ∈ Q is the start state;
(f) ⊥ ∈ Γ is the initial stack symbol, signaling an empty stack; and
(g) F ⊆ Q is the set of accept states.
M computes as follows.
Let w = w₁w₂⋯w_m where w_i ∈ Σ_ϵ for 1 ≤ i ≤ m.
M accepts w iff there exist r₀, r₁, …, r_m ∈ Q and s₀, s₁, …, s_m ∈ Γ* such that the following conditions are satisfied:
(i) r₀ = q₀ and s₀ = ⊥;
(ii) (r_{i+1}, b_i) ∈ δ(r_i, w_{i+1}, a_i) for 0 ≤ i ≤ m−1, where a_i, b_i ∈ Γ_ϵ and s_i = a_i t_i, s_{i+1} = b_i t_i for some t_i ∈ Γ*;
(iii) r_m ∈ F.
When m = 0, w = ϵ and only conditions (i) and (iii) apply, which become r₀ = q₀, s₀ = ⊥ and r₀ ∈ F.
Therefore, we define a PDA to accept ϵ whenever the start state is also an accept state and the stack is signaled to be empty.
If we write r_i →[w_{i+1}, a_i→b_i]_δ r_{i+1} for (r_{i+1}, b_i) ∈ δ(r_i, w_{i+1}, a_i), conditions (i), (ii) and (iii) can be written as follows:
q₀ = r₀ →[w₁, a₀→b₀]_δ r₁ →[w₂, a₁→b₁]_δ r₂ ⋯ r_i →[w_{i+1}, a_i→b_i]_δ r_{i+1} ⋯ r_{m−1} →[w_m, a_{m−1}→b_{m−1}]_δ r_m, with r_m ∈ F.
When there is only one transition function under consideration, the δ in the computation is usually omitted and the following shorthand is used instead:
q₀ = r₀ →[w₁, a₀→b₀] r₁ →[w₂, a₁→b₁] r₂ ⋯ r_i →[w_{i+1}, a_i→b_i] r_{i+1} ⋯ r_{m−1} →[w_m, a_{m−1}→b_{m−1}] r_m, with r_m ∈ F.
For simplicity, we sometimes use the notation q₀ →[w, *]_δ r_m to represent a computation of w from q₀ to r_m without showing the intermediate states.
We can now use the transition function to describe the four basic moves of the PDA mentioned above:
(i) Replace
r →[a, b→c] r′ signifies replacing b by c at the top of the stack upon reading symbol a from the input.
(ii) Push
r →[a, ϵ→c] r′ signifies adding the symbol c to the top of the stack upon reading symbol a from the input.
(iii) Pop
r →[a, b→ϵ] r′ signifies removing the symbol b from the top of the stack upon reading symbol a from the input.
(iv) Untouched
r →[a, ϵ→ϵ] r′ signifies that nothing is done to change the stack upon reading symbol a from the input.
We further note that when a = ϵ, r →[ϵ, ϵ→ϵ] r′ signifies a change of state from r to r′ with no input read and no change made to the stack.
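The four moves can be illustrated on a Python list used as a stack (top of stack at index 0). The helper `apply_move` is a hypothetical illustration, not part of the formal definition; ϵ is modeled as the empty string:

```python
EPS = ""  # ϵ is modeled as the empty string

def apply_move(stack, pop_sym, push_sym):
    """Return the stack after one PDA move (a, pop_sym -> push_sym),
    or None if the move is illegal (top of stack does not match)."""
    if pop_sym != EPS:
        if not stack or stack[0] != pop_sym:
            return None
        stack = stack[1:]               # Pop
    if push_sym != EPS:
        stack = [push_sym] + stack      # Push
    return stack

s = ["$"]
print(apply_move(s, EPS, "0"))   # Push:      ['0', '$']
print(apply_move(s, "$", "X"))   # Replace:   ['X']
print(apply_move(s, "$", EPS))   # Pop:       []
print(apply_move(s, EPS, EPS))   # Untouched: ['$']
```

A Replace is thus a Pop followed by a Push in one step, and Untouched is the move with both symbols equal to ϵ.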
Example 2.41. 
Let M = (Q, Σ, Γ, δ, q₁, ⊥, F) be a PDA where
Q = {q₁, q₂, q₃, q₄}, Σ = {0, 1}, Γ = {0, ⊥, $}, F = {q₁, q₄}, with the following state diagram:
Preprints 161810 i001
M recognizes the language {0ⁿ1ⁿ | n ≥ 0}.
If the stack is signaled to be empty at the beginning, M accepts the empty string (ϵ = 0⁰1⁰), because q₁ is both a start state and an accept state. Furthermore, if the input string is not empty, the PDA reads nothing from the input at the start state; it only pushes $ onto the stack.
M accepts the string 0³1³ with the following computation:
q₁ →[ϵ, ϵ→$] q₂ →[0, ϵ→0] q₂ →[0, ϵ→0] q₂ →[0, ϵ→0] q₂ →[1, 0→ϵ] q₃ →[1, 0→ϵ] q₃ →[1, 0→ϵ] q₃ →[ϵ, $→ϵ] q₄, q₄ ∈ F.
Note that the above illustration is not a proof that M recognizes the language {0ⁿ1ⁿ | n ≥ 0}. For such a proof, one must argue that every string of the form 0ⁿ1ⁿ is accepted by M and that every string accepted by M is of the form 0ⁿ1ⁿ.
Note also that the steps q₁ →[ϵ, ϵ→$] q₂ and q₃ →[ϵ, $→ϵ] q₄ can be replaced by q₁ →[ϵ, ϵ→ϵ] q₂ and q₃ →[ϵ, ϵ→ϵ] q₄ to transition to another state without making a change to the stack.
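Since the state diagram is only available as an image, the transition relation below is a sketch reconstructed from the described behavior of M (push $, push the 0s, pop a 0 per 1, pop $); a small breadth-first search over configurations then simulates the nondeterministic computation:

```python
from collections import deque

EPS = ""  # ϵ

# Sketch of delta for Example 2.41 (reconstructed, not copied from the
# diagram): (state, input sym or ϵ, popped sym or ϵ) -> {(next, pushed)}
DELTA = {
    ("q1", EPS, EPS): {("q2", "$")},
    ("q2", "0", EPS): {("q2", "0")},
    ("q2", "1", "0"): {("q3", EPS)},
    ("q3", "1", "0"): {("q3", EPS)},
    ("q3", EPS, "$"): {("q4", EPS)},
}
ACCEPT = {"q1", "q4"}

def accepts(w, start="q1"):
    """BFS over configurations (state, unread input, stack as a string,
    top at the left); accept when the input is exhausted in an accept state."""
    seen = set()
    queue = deque([(start, w, "")])
    while queue:
        config = queue.popleft()
        if config in seen:
            continue
        seen.add(config)
        state, rest, stack = config
        if rest == "" and state in ACCEPT:
            return True
        for (q, a, b), moves in DELTA.items():
            if q != state:
                continue
            if a != EPS and not rest.startswith(a):
                continue
            if b != EPS and not stack.startswith(b):
                continue
            for nxt, c in moves:
                queue.append((nxt, rest[len(a):], c + stack[len(b):]))
    return False

print(accepts("000111"))  # True
print(accepts(""))        # True  (q1 is both a start and an accept state)
print(accepts("001"))     # False
```

The search terminates here because every cycle in DELTA consumes input; a general simulator would need a bound on ϵ-push loops.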
Example 2.42. 
Let M = (Q, Σ, Γ, δ, q₁, ⊥, F) be a PDA where
Q = {q₁, q₂, q₃, q₄, q₅, q₆, q₇}, Σ = {a, b, c}, Γ = {a, ⊥, $}, F = {q₄, q₇}, with the following state diagram:
Preprints 161810 i002
M recognizes the language {aⁱbʲcᵏ | i, j, k ≥ 0 and i = j or i = k}.
M accepts the empty string (ϵ = a⁰b⁰c⁰) with the following computation:
q₁ →[ϵ, ϵ→$] q₂ →[ϵ, ϵ→ϵ] q₃ →[ϵ, $→ϵ] q₄, q₄ ∈ F.
M accepts the string a²b²c³ with the following computation:
q₁ →[ϵ, ϵ→$] q₂ →[a, ϵ→a] q₂ →[a, ϵ→a] q₂ →[ϵ, ϵ→ϵ] q₃ →[b, a→ϵ] q₃ →[b, a→ϵ] q₃ →[ϵ, $→ϵ] q₄ →[c, ϵ→ϵ] q₄ →[c, ϵ→ϵ] q₄ →[c, ϵ→ϵ] q₄, q₄ ∈ F.
M accepts the string a²b³c² with the following computation:
q₁ →[ϵ, ϵ→$] q₂ →[a, ϵ→a] q₂ →[a, ϵ→a] q₂ →[ϵ, ϵ→ϵ] q₅ →[b, ϵ→ϵ] q₅ →[b, ϵ→ϵ] q₅ →[b, ϵ→ϵ] q₅ →[ϵ, ϵ→ϵ] q₆ →[c, a→ϵ] q₆ →[c, a→ϵ] q₆ →[ϵ, $→ϵ] q₇, q₇ ∈ F.
Note also that the steps q₁ →[ϵ, ϵ→$] q₂, q₃ →[ϵ, $→ϵ] q₄, and q₆ →[ϵ, $→ϵ] q₇ can be replaced by q₁ →[ϵ, ϵ→ϵ] q₂, q₃ →[ϵ, ϵ→ϵ] q₄, and q₆ →[ϵ, ϵ→ϵ] q₇ to transition to another state without making a change to the stack.
Example 2.43. 
Let M = (Q, Σ, Γ, δ, q₁, ⊥, F) be a PDA where
Q = {q₁, q₂, q₃, q₄}, Σ = {0, 1}, Γ = {0, 1, ⊥, $}, F = {q₁, q₄}, with the following state diagram:
Preprints 161810 i003
M recognizes the language {wwᴿ | w ∈ {0, 1}*}.
If the stack is signaled to be empty at the beginning, M accepts the empty string (ϵ = ϵϵᴿ), because q₁ is both a start state and an accept state.
M accepts the string 001100 with the following computation:
q₁ →[ϵ, ϵ→$] q₂ →[0, ϵ→0] q₂ →[0, ϵ→0] q₂ →[1, ϵ→1] q₂ →[ϵ, ϵ→ϵ] q₃ →[1, 1→ϵ] q₃ →[0, 0→ϵ] q₃ →[0, 0→ϵ] q₃ →[ϵ, $→ϵ] q₄, q₄ ∈ F.
Note also that the steps q₁ →[ϵ, ϵ→$] q₂ and q₃ →[ϵ, $→ϵ] q₄ can be replaced by q₁ →[ϵ, ϵ→ϵ] q₂ and q₃ →[ϵ, ϵ→ϵ] q₄ to transition to another state without making a change to the stack.
Instead of writing symbols one at a time to the stack, we can actually design P D A s which can write a string of symbols to the stack in one step. These P D A s are called extended P D A s. It turns out that the two kinds of P D A s are equivalent in power in that given one, we can construct the other such that the two recognize the same language. The equivalence of these two kinds of P D A s will be proved later.
Definition 2.44. 
An extended PDA is a 7-tuple M_E = (Q, Σ, Γ, δ̂, q₀, ⊥, F), where Q, Σ, Γ and F are finite sets such that
(a) Q is the set of states;
(b) Σ is the input alphabet;
(c) Γ is the stack alphabet;
(d) δ̂ : Q × Σ_ϵ × Γ_ϵ → P(Q × Γ*) is the transition function;
(e) q₀ ∈ Q is the start state;
(f) ⊥ ∈ Γ is the initial stack symbol, signaling an empty stack; and
(g) F ⊆ Q is the set of accept states.
M_E computes as follows.
Let w = w₁w₂⋯w_m where w_i ∈ Σ_ϵ for 1 ≤ i ≤ m.
M_E accepts w iff there exist r₀, r₁, …, r_m ∈ Q and s₀, s₁, …, s_m ∈ Γ* such that the following conditions are satisfied:
(i) r₀ = q₀ and s₀ = ⊥;
(ii) (r_{i+1}, b_i) ∈ δ̂(r_i, w_{i+1}, a_i) for 0 ≤ i ≤ m−1, where a_i ∈ Γ_ϵ, b_i ∈ Γ*, and s_i = a_i t_i, s_{i+1} = b_i t_i for some t_i ∈ Γ*;
(iii) r_m ∈ F.
When m = 0, w = ϵ and only conditions (i) and (iii) apply, which become r₀ = q₀, s₀ = ⊥ and r₀ ∈ F.
Therefore, we define the extended PDA to accept ϵ whenever the start state is also an accept state and the stack is signaled to be empty.
If we write r_i →[w_{i+1}, a_i→b_i]_{δ̂} r_{i+1} for (r_{i+1}, b_i) ∈ δ̂(r_i, w_{i+1}, a_i), conditions (i), (ii) and (iii) can be written as follows:
q₀ = r₀ →[w₁, a₀→b₀]_{δ̂} r₁ →[w₂, a₁→b₁]_{δ̂} r₂ ⋯ r_i →[w_{i+1}, a_i→b_i]_{δ̂} r_{i+1} ⋯ r_{m−1} →[w_m, a_{m−1}→b_{m−1}]_{δ̂} r_m, with r_m ∈ F.
When there is only one transition function under consideration, the δ̂ in the computation is usually omitted and the following shorthand is used instead:
q₀ = r₀ →[w₁, a₀→b₀] r₁ →[w₂, a₁→b₁] r₂ ⋯ r_i →[w_{i+1}, a_i→b_i] r_{i+1} ⋯ r_{m−1} →[w_m, a_{m−1}→b_{m−1}] r_m, with r_m ∈ F.
For simplicity, we sometimes use the notation q₀ →[w, *]_{δ̂} r_m to represent a computation of w from q₀ to r_m without showing the intermediate states.
Theorem 2.45. 
For any extended PDA M_E, there is a PDA M such that L(M_E) = L(M), and vice versa.
Proof. 
Construction of M from M E .
Let M E = ( Q E , Σ , Γ , δ ^ , q 0 , , F ) be an extended P D A .
Construct P D A , M = ( Q , Σ , Γ , δ , q 0 , , F ) where Q and δ are to be defined as follows.
For every ( q , a , s ) Q E × Σ ϵ × Γ ϵ , we define δ ( q , a , s ) as follows.
If δ ^ q , a , s = , δ q , a , s = .
If δ ^ q , a , s , at least one ( r , u ) δ ^ q , a , s .
Let δ 1 q , a , s = { ( r , ϵ ) | ( r , ϵ ) δ ^ q , a , s } .
For every (r, u) ∈ δ̂(q, a, s) with (r, u) ∈ Q_E × Γ* and u ≠ ϵ, there exist u₁, u₂, …, u_l ∈ Γ with l ≥ 1 such that u = u₁u₂⋯u_l.
(Note that none of u 1 , u 2 u l is ϵ .)
Create new states q 1 , q 2 , q l 1 that satisfy the following conditions:
q →[a, s→u_l]_δ q₁ (by making (q₁, u_l) ∈ δ(q, a, s))
q₁ →[ϵ, ϵ→u_{l−1}]_δ q₂ (by making δ(q₁, ϵ, ϵ) = {(q₂, u_{l−1})})
q₂ →[ϵ, ϵ→u_{l−2}]_δ q₃ (by making δ(q₂, ϵ, ϵ) = {(q₃, u_{l−2})})
⋮
q_{l−1} →[ϵ, ϵ→u₁]_δ r (by making δ(q_{l−1}, ϵ, ϵ) = {(r, u₁)})
Note that the states q 1 , q 2 , q l 1 thus created are not in Q E and that
δ q i , a , s = for any other combinations of a , s ϵ , ϵ and i { 1,2 , l 1 } .
Note also that there can be more than one set of states q 1 , q 2 , q l 1 and stack symbols u 1 , u 2 u l to be created from each combination of q , a , s because there can be more than one ( r , u ) δ ^ q , a , s based on which the states and the stack symbols are created.
Let δ 2 q , a , s = ( r , u ) δ ^ q , a , s { q 1 , u l } where u = u 1 u 2 u l , l 1 , u i Γ and
q 1 is created from (ii) above.
Let δ q , a , s = δ 1 q , a , s δ 2 q , a , s .
For each q , a , s , r , u Q E × Σ ϵ × Γ ϵ × Q E × Γ * , where ( r , u ) δ ^ q , a , s and u ϵ , define
Q q , a , s , r , u = { q i | 1 i l 1 ; l   &   q i   a r e   c r e a t e d   f r o m   i i ; ( r , u ) δ ^ q , a , s ; u ϵ } (Note that q i Q q , a , s , r , u q i Q E .)
Let P q , a , s = r , u δ ^ q , a , s ; u ϵ Q q , a , s , r , u Set Q = Q E ( q , a , s ) Q E × Σ ϵ × Γ ϵ P q , a , s .
So, M = Q E ( q , a , s ) Q E × Σ ϵ × Γ ϵ P q , a , s , Σ , Γ , δ , q 0 , , F .
The construction is now complete and it remains to show that L M E = L ( M ) .
Suppose w L ( M ) .
w 1 , w 2 w n Σ ϵ such that w = w 1 , w 2 w n where n 1 .
r 0 , r 1 r n Q , a i , b i Γ ϵ such that
q 0 = r 0 w 1 , a 0 b 0 , δ r 1 w 2 , a 1 b 1 , δ r 2 r i w i + 1 , a i b i , δ r i + 1 r n 1 w n , a n 1 b n 1 , δ r n , r n F .
Claim: 
0 i n 1 ,
if r i Q E , then
j and u Γ * such that i < j n , r i w i + 1 , a i u , δ ^ r j and w k = ϵ for i + 2 k j .
Proof of Claim. 
From r i w i + 1 , a i b i , δ r i + 1 in the given computation, it follows that ( r i + 1 , b i , ) δ r i , w i + 1 , a i .
By assumption, r i Q E .
By construction (ii), δ r i , w i + 1 , a i = δ 1 r i , w i + 1 , a i δ 2 r i , w i + 1 , a i .
Either ( r i + 1 , b i , ) δ 1 r i , w i + 1 , a i or ( r i + 1 , b i , ) δ 2 r i , w i + 1 , a i .
If ( r i + 1 , b i , ) δ 1 r i , w i + 1 , a i Since δ 1 r i , w i + 1 , a i = { ( r , ϵ ) | ( r , ϵ ) δ ^ r i , w i + 1 , a i } , b i , = ϵ and
( r i + 1 , ϵ ) δ ^ r i , w i + 1 , a i .
Therefore, r i w i + 1 , a i ϵ , δ ^ r i + 1 .
Since i < i + 1 n and ϵ Γ * , Claim is proved by taking j = i + 1 and u = ϵ .
If ( r i + 1 , b i , ) δ 2 r i , w i + 1 , a i Since δ 2 r i , w i + 1 , a i = ( r , u ) δ ^ r i , w i + 1 , a i { q 1 , u l } , r i + 1 , b i , = ( q 1 , u l ) for some ( r , u ) δ ^ r i , w i + 1 , a i where u = u 1 u 2 u l , l 1 , u i Γ and q 1 is created from construction (ii) above.
Therefore r i + 1 = q 1 and b i , = u l .
r i w i + 1 , a i b i , δ r i + 1 now becomes r i w i + 1 , a i u l , δ r i + 1 .
Furthermore, from r i + 1 w i + 2 , a i + 1 b i + 1 , δ r i + 2 in the given computation, we have
r i + 2 , b i + 1 δ r i + 1 , w i + 2 , a i + 1 = δ q 1 , w i + 2 , a i + 1 .
Since δ q 1 , a , s = for all ( a , s ) ( ϵ , ϵ ), we must have w i + 2 = a i + 1 = ϵ .
Therefore, r i + 2 , b i + 1 δ q 1 , ϵ , ϵ = { q 2 , u l 1 } .
Therefore, r i + 2 = q 2 and b i + 1 = u l 1 .
r i + 1 w i + 2 , a i + 1 b i + 1 , δ r i + 2 becomes r i + 1 ϵ , ϵ u l 1 , δ r i + 2 .
By repeating the above argument, we can obtain the following computation:
r i w i + 1 , a i u l , δ r i + 1 ϵ , ϵ u l 1 , δ r i + 2 ϵ , ϵ u l 2 , δ r i + 3 r i + l 1 ϵ , ϵ u 1 , δ r i + l .
where r i + 1 = q 1 , r i + 2 = q 2 ,   r i + l 1 = q l 1 , r i + l = r and w i + 2 = w i + 3 = w i + l = ϵ .
Let j = i + l .
r_j = r_{i+l} = r, and r ∈ Q_E ⟹ r_j ∈ Q_E.
Also, w i + 2 = w i + 3 = w j = ϵ .
r , u = ( r j , u ) .
Since ( r , u ) δ ^ r i , w i + 1 , a i , ( r j , u ) δ ^ r i , w i + 1 , a i .
Therefore, r i w i + 1 , a i u , δ ^ r j .
l 1 i + l > i j > i .
Assume for contradiction that j > n .
i < n j 1 .
i < n i + l 1 .
Therefore r n r i + 1 , r i + 2 , r i + 3 r i + l 1 = { q 1 , q 2 , q 3 , q l 1 } .
This implies r n Q E , which is a contradiction because r n F and F Q E .
Therefore, i < j n .
Claim is also true under condition (b).
Combining (a) and (b), we conclude the proof of Claim.
Since r 0 = q 0 Q E , we can apply Claim on r 0 to obtain j 0 such that
0 < j 0 n ; w 2 = w 3 = = w j 0 = ϵ ; r 0 w 1 , a 0 u 0 , δ ^ r j 0 with r j 0 also in Q E and u 0 Γ * .
Since r j 0 Q E , we can again apply Claim on r j 0 to get r j 1 such that
0 < j 0 < j 1 n ; w j 0 + 2 = w j 0 + 3 = w j 1 = ϵ ; r j 0 w j 0 + 1 , a j 0 u j 0 , δ ^ r j 1 with r j 1 also in Q E & u j 0 Γ * .
By repeating this process a number of times, we will obtain 0 < j 0 < j 1 < < j m 1 < j m n such that
r 0 w 1 , a 0 u 0 , δ ^ r j 0 w j 0 + 1 , a j 0 u j 0 , δ ^ r j 1 r j m 1 w j m 1 + 1 , a j m 1 u j m 1 , δ ^ r j m , where u 0 , u j 0 u j m 1 Γ * .
Since n is finite, this process of creation must stop at some point and at this point, j m = n .
Therefore, M E accepts w 1 w j 0 + 1 w j 1 + 1 w j m 1 + 1 .
By Claim, we have
w 2 = w 3 = = w j 0 = ϵ
w j 0 + 2 = w j 0 + 3 = w j 1 = ϵ
             
w j m 1 + 2 = w j m 1 + 3 = w j m = ϵ where j m = n .
Therefore, w 1 w j 0 + 1 w j 1 + 1 w j m 1 + 1 = w 1 w 2 w n = w .
Therefore, M E accepts w 1 , w 2 w n = w .
Therefore, w L ( M E ) and hence L ( M ) L ( M E ) .
Conversely, assume w L ( M E ) .
r 0 , r 1 r n Q E ;   a i Γ ϵ , b i Γ * for 0 i n 1 ; w 1 , w 2 w n Σ ϵ such that
w = w 1 w 2 w n and
q 0 = r 0 w 1 , a 0 b 0 , δ ^ r 1 w 2 , a 1 b 1 , δ ^ r 2 r i w i + 1 , a i b i , δ ^ r i + 1 r n 1 w n , a n 1 b n 1 , δ ^ r n , r n F .
Since Q E Q , r 0 , r 1 r n Q .
For all 0 i n 1 , ( r i + 1 , b i ) δ ^ r i , w i + 1 , a i δ ^ r i , w i + 1 , a i .
[If b i = ϵ ]
( r i + 1 , ϵ ) δ ^ r i , w i + 1 , a i By construction (ii), δ 1 r i , w i + 1 , a i = { ( r , ϵ ) | ( r , ϵ ) δ ^ r i , w i + 1 , a i } .
Therefore, ( r i + 1 , ϵ ) δ 1 r i , w i + 1 , a i .
Also by construction (ii), δ r i , w i + 1 , a i = δ 1 r i , w i + 1 , a i δ 2 r i , w i + 1 , a i .
Therefore, ( r i + 1 , ϵ ) δ r i , w i + 1 , a i .
Therefore, r i w i + 1 , a i ϵ , δ r i + 1 .
Therefore, r i w i + 1 , * , δ r i + 1 .
[If b i ϵ ]
b i ( 1 ) , b i ( 2 ) b i ( l ) Γ , l 1 such that b i = b i ( 1 ) b i ( 2 ) b i ( l ) .
By construction (ii), q 1 , q 2 , q l 1 Q such that
r i w i + 1 , a i b i ( l ) , δ q 1
q 1 ϵ , ϵ b i ( l 1 ) , δ q 2
q 2 ϵ , ϵ b i ( l 2 ) , δ q 3
             
q l 1 ϵ , ϵ b i ( 1 ) , δ r i + 1 .
Therefore, r i w i + 1 , * , δ r i + 1 .
Combining both cases of [ b i = ϵ ] and [ b i ϵ ], we have
r i w i + 1 , * , δ r i + 1 for all 0 i n 1 .
Therefore,
q 0 = r 0 w 1 , * , δ r 1 w 2 , * , δ r 2 r i w i + 1 , * δ r i + 1 r n 1 w n , * δ r n , r n F .
Therefore, M accepts w 1 w 2 w n = w .
w L ( M ) .
L ( M E ) L ( M ) .
This completes the proof of L M E = L ( M ) for the construction of M from M E .
Construction of M E from M .
Let M = ( Q , Σ , Γ , δ , q 0 , , F ) be a P D A .
Construct M E = ( Q , Σ , Γ , δ ^ , q 0 , , F ) where
δ ^ : Q × Σ ϵ × Γ ϵ ( Q × Γ * ) such that
q , a , s Q × Σ ϵ × Γ ϵ , δ ^ q , a , s = δ ( q , a , s ) .
(Note that this is possible because Γ ϵ Γ * .)
It remains to show that L M E = L ( M ) .
Let w = w 1 w 2 w n where w i Σ ϵ for 1 i n & n 1 .
Suppose w L ( M ) .
r 0 , r 1 r n Q , a i Γ ϵ , b i Γ ϵ for 0 i n 1 such that
q 0 = r 0 w 1 , a 0 b 0 , δ r 1 w 2 , a 1 b 1 , δ r 2 r i w i + 1 , a i b i , δ r i + 1 r n 1 w n , a n 1 b n 1 , δ r n , r n F .
For 0 i n 1 ,
since r i w i + 1 , a i b i , δ r i + 1 , ( r i + 1 , b i ) δ r i , w i + 1 , a i .
since δ ^ r i , w i + 1 , a i = δ ( r i , w i + 1 , a i ) , ( r i + 1 , b i ) δ ^ r i , w i + 1 , a i .
Therefore, r i w i + 1 , a i b i , δ ^ r i + 1 for 0 i n 1 .
Therefore,
q 0 = r 0 w 1 , a 0 b 0 , δ ^ r 1 w 2 , a 1 b 1 , δ ^ r 2 r i w i + 1 , a i b i , δ ^ r i + 1 r n 1 w n , a n 1 b n 1 , δ ^ r n , r n F .
M E accepts w .
w L ( M E ) .
L ( M ) L ( M E ) .
Conversely, suppose w L ( M E ) .
q 0 = r 0 w 1 , a 0 b 0 , δ ^ r 1 w 2 , a 1 b 1 , δ ^ r 2 r i w i + 1 , a i b i , δ ^ r i + 1 r n 1 w n , a n 1 b n 1 , δ ^ r n , r n F ,
where a i Γ ϵ , b i Γ * for 0 i n 1 .
For 0 i n 1 , r i w i + 1 , a i b i , δ ^ r i + 1 ( r i + 1 , b i ) δ ^ r i , w i + 1 , a i .
Since δ ^ r i , w i + 1 , a i = δ ( r i , w i + 1 , a i ) , ( r i + 1 , b i ) δ r i , w i + 1 , a i .
Therefore, r i w i + 1 , a i b i , δ r i + 1 for 0 i n 1 .
Therefore,
q 0 = r 0 w 1 , a 0 b 0 , δ r 1 w 2 , a 1 b 1 , δ r 2 r i w i + 1 , a i b i , δ r i + 1 r n 1 w n , a n 1 b n 1 , δ r n , r n F .
Therefore M accepts w and hence w L ( M ) .
Therefore, L ( M E ) L ( M ) .
This completes the proof of L M E = L ( M ) for the construction of M E from M .
Combining (A) and (B), we conclude the proof of Theorem 2.45.
Now that we have proved the equivalence of P D A and extended P D A , we shall no longer distinguish between M and M E or between δ and δ ^ . From here on, we shall be using extended P D A exclusively because it is a much more convenient tool for solving problems. We shall be using M and δ for all P D A s with the understanding that the P D A s that we are dealing with can write a string to the stack in one single step.
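The core of part (A) of the construction — splitting one extended transition that pushes u = u₁u₂⋯u_l into a chain of single-symbol pushes through newly created states — can be sketched as follows. The state names p1, p2, … stand in for the created states q₁, …, q_{l−1} and are illustrative only:

```python
EPS = ""  # ϵ

def split_push(q, a, s, r, u, fresh):
    """Turn one extended-PDA transition (r, u) ∈ δ̂(q, a, s) with |u| ≥ 2
    into a chain of ordinary transitions pushing one symbol per step.

    u[0] must end up on top of the stack, so the chain pushes u[-1]
    first and u[0] last.  `fresh` yields state names not in Q_E.
    Returns ((state, input sym, popped sym), (next state, pushed sym)) pairs.
    """
    chain, prev, trigger = [], q, (a, s)
    for sym in u[:0:-1]:                   # u[-1], ..., u[1]
        nxt = next(fresh)
        chain.append(((prev, *trigger), (nxt, sym)))
        prev, trigger = nxt, (EPS, EPS)    # later steps read nothing
    chain.append(((prev, *trigger), (r, u[0])))
    return chain

# (r, u1u2u3) ∈ δ̂(q, a, s) becomes q →[a, s→u3] p1 →[ϵ, ϵ→u2] p2 →[ϵ, ϵ→u1] r
print(split_push("q", "a", "s", "r", "XYZ", iter(["p1", "p2"])))
# [(('q', 'a', 's'), ('p1', 'Z')), (('p1', '', ''), ('p2', 'Y')), (('p2', '', ''), ('r', 'X'))]
```

As in the proof, the intermediate states read no input and pop nothing, which is what forces w_{i+2} = ⋯ = w_j = ϵ in the Claim.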
Definition 2.46 (Configurations of a P D A ). A configuration of a P D A ,   M = ( Q , Σ , Γ , δ , q 0 , , F ) is an element of Q × Σ * × Γ * describing the current state, the portion of the input still unread and the current stack contents at some point of a computation. For example, the configuration
( p , b a a a b b a , A B A C ) describes the situation as shown in the following diagram.
Preprints 161810 i004
Note that the portion of the input to the left of the input head, namely a b a b , has been read and cannot affect the computation hereon.
The start configuration on input w is defined as (q₀, w, ⊥). That is, the PDA always starts in its start state q₀, with the input head pointing to the leftmost input symbol and the stack containing only the initial stack symbol ⊥.
The next-configuration relation (denoted by ⊢_{1,M}, or simply ⊢_M) describes how the PDA moves from one configuration to another in one step. It is formally defined as follows.
Definition 2.47. Let M = (Q, Σ, Γ, δ, q₀, ⊥, F) be a PDA.
For all p, q ∈ Q, a ∈ Σ_ϵ, A ∈ Γ_ϵ, y ∈ Σ*, β ∈ Γ*, and γ ∈ Γ*,
p →[a, A→γ]_δ q ⇔_def (p, ay, Aβ) ⊢_{1,M} (q, y, γβ).
For any configurations C, D of M:
C ⊢_{0,M} D ⇔_def C = D.
C ⊢_{n+1,M} D ⇔_def there exists a configuration E such that C ⊢_{n,M} E and E ⊢_{1,M} D.
C ⊢_{*,M} D ⇔_def there exists n ≥ 0 such that C ⊢_{n,M} D.
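The relations ⊢_{1,M} and ⊢_{*,M} can be implemented directly on configurations. The sketch below writes stacks as strings with the top at the left and assumes the transition function is given as a Python dict (a representation of my choosing, not prescribed by the text); string pushes are allowed, so it covers extended PDAs as well:

```python
EPS = ""  # ϵ

def step(delta, config):
    """One application of ⊢_{1,M}: yield every configuration reachable
    from (state, unread input, stack) in a single move."""
    state, inp, stack = config
    for (q, a, b), moves in delta.items():
        if q != state or (a != EPS and not inp.startswith(a)):
            continue
        if b != EPS and not stack.startswith(b):
            continue
        for nxt, c in moves:
            yield (nxt, inp[len(a):], c + stack[len(b):])

def reachable(delta, config, limit=10_000):
    """⊢_{*,M}: all configurations reachable in any number of steps,
    bounded by `limit` since the configuration space can be infinite."""
    seen, frontier = set(), [config]
    while frontier and len(seen) < limit:
        c = frontier.pop()
        if c in seen:
            continue
        seen.add(c)
        frontier.extend(step(delta, c))
    return seen

# Hypothetical one-state machine pushing A on each a:
DELTA = {("p", "a", EPS): {("p", "A")}}
print(sorted(reachable(DELTA, ("p", "aa", ""))))
# [('p', '', 'AA'), ('p', 'a', 'A'), ('p', 'aa', '')]
```

Note that the start configuration itself is in the result, reflecting the reflexive case C ⊢_{0,M} C.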
Proposition 2.48. 
Let M = (Q, Σ, Γ, δ, q₀, ⊥, F) be a PDA.
For any w = w₁w₂⋯w_m where w₁, w₂, …, w_m ∈ Σ_ϵ, M accepts w iff (q₀, w, ⊥) ⊢_{*,M} (q, ϵ, γ) for some q ∈ F and γ ∈ Γ*.
Proof. 
By Definition 2.44, M accepts w = w 1 w 2 w m iff r 0 ,   r 1 r m Q , a i Γ ϵ , b i Γ * for 0 i m 1 such that
q 0 = r 0 w 1 , a 0 b 0 r 1 w 2 , a 1 b 1 r 2 r i w i + 1 , a i b i r i + 1 r m 1 w m , a m 1 b m 1 r m , r m F .
By Definition 2.47, each one-step transitional movement is equivalent to one step of configuration movement.
For all 0 i m 1 , configurations C i and C i + 1 such that
r i w i + 1 , a i b i r i + 1 C i 1 , M C i + 1 The above transitional computation is equivalent to
C 0 1 , M C 1 1 , M C 2 C i 1 , M C i + 1 C m 1 1 , M C m , r m F .
That is, C 0 n , M C m , r m F .
That is, C 0 * , M C m , r m F .
Since C 0 = ( q 0 , w , ) and C m = ( r m , ϵ , γ ) where γ Γ * is the final stack content, we have
( q 0 , w , ) * , M ( r m , ϵ , γ ) , r m F .
Therefore, q 0 , w , * , M ( q , ϵ , γ ) for some q F and γ Γ * .
This completes the proof of Proposition 2.48.
Proposition 2.49. 
Let M = (Q, Σ, Γ, δ, q₀, ⊥, F) and M′ = (Q′, Σ, Γ′, δ′, q₀′, ⊥′, F′) be two PDAs such that
Q ⊆ Q′, Γ ⊆ Γ′, and
δ′(q, a, A) = δ(q, a, A) for all (q, a, A) ∈ Q × Σ_ϵ × Γ_ϵ.
For all p, q ∈ Q, u, v ∈ Σ*, and α, β ∈ Γ*, the following statements hold:
(a) (p, u, α) ⊢_{1,M} (q, v, β) ⟺ (p, u, α) ⊢_{1,M′} (q, v, β)
(b) (p, u, α) ⊢_{n,M} (q, v, β) ⟺ (p, u, α) ⊢_{n,M′} (q, v, β) for any n ≥ 0
Proof.
(a)
p , u , α 1 , M q , v , β p a , A γ , δ q where u = a v , a Σ ϵ , α = A η , β = γ η , A Γ ϵ , γ , η Γ * .
Since Q Q ' , p , q Q p , q Q ' .
Since Γ Γ ' , A Γ ϵ A Γ ϵ ' .
Since Γ Γ ' , Γ * Γ ' * and hence γ Γ * γ Γ ' * .
Since δ ' p , a , A = δ p , a , A for all p , a , A Q × Σ ϵ × Γ ϵ , we have
p a , A γ , δ q p a , A γ , δ ' q
p , a v , A η , 1 , M ' q , v , γ η
p , u , α 1 , M ' q , v , β Conversely,
p , u , α 1 , M ' q , v , β where p , q Q , u , v Σ * and α , β Γ *
p a , A γ , δ ' q where p , q Q , u , v Σ * , α , β Γ * , a Σ ϵ , A Γ ϵ ' , γ Γ ' * , α = A η , β = γ η .
Since α Γ * & α = A η , A Γ ϵ and η Γ * .
Since β Γ * & β = γ η , γ Γ * and η Γ * .
Therefore p , a , A Q × Σ ϵ × Γ ϵ and hence δ ' p , a , A = δ p , a , A .
Therefore,
p a , A γ , δ ' q p a , A γ , δ q p , u , α 1 , M q , v , β .
(b)
This part can be proved by using the result of (a) along with an induction argument on the number of steps.
This completes the proof of Proposition 2.49.
Proposition 2.50. 
Let M = (Q, Σ, Γ, δ, q₀, ⊥, F) be a PDA. For all p, q ∈ Q, x, y, w ∈ Σ*, α, β, γ ∈ Γ*, and every integer n ≥ 1,
(p, x, α) ⊢_{n,M} (q, y, β) ⟹ (p, xw, αγ) ⊢_{n,M} (q, yw, βγ).
Proof. 
The proof is by induction on n .
For n = 1 , assume p , x , α 1 , M ( q , y , β ) .
a Σ ϵ , A Γ ϵ and η , θ Γ * such that
x = a y , α = A η , β = θ η and p a , A θ , δ q .
Since p a , A θ , δ q , and x w = a y w , α γ = A η γ , and β γ = θ η γ ,
p , x w , α γ 1 , M ( q , y w , β γ ) .
Therefore, the statement is true for n = 1 .
For induction hypothesis,
p , x , α k , M ( q , y , β ) p , x w , α γ k , M ( q , y w , β γ ) for any integer k 1 .
For n = k + 1 , assume p , x , α k + 1 , M ( q , y , β ) .
p ' Q ,   x ' Σ * , α ' Γ * such that
p , x , α k , M ( p ' , x ' , α ' ) and ( p ' , x ' , α ' ) 1 , M ( q , y , β ) .
By induction hypothesis, we have
p , x w , α γ k , M ( p ' , x ' w , α ' γ ) .
Since the statement is true for n = 1 , we also have
( p ' , x ' w , α ' γ ) 1 , M ( q , y w , β γ ) .
Combining the two computations, we have
p , x w , α γ k + 1 , M ( q , y w , β γ ) .
This completes the proof of Proposition 2.50.
The P D A s that we have dealt with thus far accept an input by entering an accept state upon reading the entire input. We call this kind of P D A a P D A that accepts by final state. There is another kind of P D A that accepts an input by popping the last symbol off the stack (without pushing any other symbol back on) upon reading the entire input. We call this kind of P D A a P D A that accepts by empty stack. It turns out that the two kinds of P D A s are equivalent in that given one, we can construct the other such that the two recognize the same language. Before we prove the equivalence of these two kinds of P D A s, we need a formal definition for P D A s that accept by empty stack.
Definition 2.51. 
A PDA that accepts by empty stack is a 6-tuple M_e = (Q, Σ, Γ, δ, q₀, ⊥_e), where Q, Σ, Γ, δ, q₀, ⊥_e are defined as in a PDA that accepts by final state.
M_e computes as follows:
Let w = w₁w₂⋯w_m where w_i ∈ Σ_ϵ for 1 ≤ i ≤ m and m ≥ 1.
M_e accepts w iff (q₀, w, ⊥_e) ⊢_{*,M_e} (q, ϵ, ϵ) for some q ∈ Q.
(Note that the set of accept states, namely F, is not needed in the definition of acceptance by empty stack.)
Lemma 2.52. For any P D A , M e , that accepts by empty stack, there is a P D A , M f , that accepts by final state such that L M e = L ( M f ) .
Proof. 
Let M_e = (Q, Σ, Γ, δ, q₀, ⊥_e), where ⊥_e ∈ Γ is the initial stack symbol of M_e.
Construct M_f = (Q_f, Σ, Γ_f, δ_f, q_start, ⊥_f, {q_accept}), where q_start and q_accept are newly created states (not in Q), with q_start serving as the start state of M_f and q_accept as its sole accept state.
⊥_f is a newly created stack symbol (not in Γ) serving as the initial stack symbol of M_f.
Q_f = Q ∪ {q_start, q_accept}
Γ_f = Γ ∪ {⊥_f}
The transition function δ_f of M_f is defined as follows:
T1: δ_f(q_start, ϵ, ⊥_f) = {(q₀, ⊥_e⊥_f)}, i.e., q_start →[ϵ, ⊥_f→⊥_e⊥_f]_{δ_f} q₀.
T2: δ_f(q, ϵ, ⊥_f) = {(q_accept, ϵ)} for every q ∈ Q, i.e., q →[ϵ, ⊥_f→ϵ]_{δ_f} q_accept.
T3: δ_f(q, a, A) = δ(q, a, A) for every (q, a, A) ∈ Q × Σ_ϵ × Γ_ϵ, where δ : Q × Σ_ϵ × Γ_ϵ → P(Q × Γ*).
T4: δ_f(q, a, A) = ∅ for every other (q, a, A) ∈ Q_f × Σ_ϵ × (Γ_f)_ϵ.
The construction is now complete. It remains to show L M e = L ( M f ) .
Suppose w L M e .
q 0 , w , e n , M e ( q , ϵ , ϵ ) for some n 0 & q Q .
By T1, δ f q s t a r t , ϵ , f = q 0 , e f .
Therefore, q s t a r t , w , f 1 , M f q 0 , w , e f .
By Proposition 2.49, we have
q 0 , w , e n , M e q , ϵ , ϵ q 0 , w , e n , M f q , ϵ , ϵ .
By Proposition 2.50, we have
q 0 , w , e n , M f q , ϵ , ϵ q 0 , w ϵ , e f n , M f q , ϵ ϵ , ϵ f .
That is, q 0 , w , e n , M f q , ϵ , ϵ q 0 , w , e f n , M f q , ϵ , f .
Also by T2, δ f q , ϵ , f = q a c c e p t , ϵ .
Therefore, q , ϵ , f 1 , M f q a c c e p t , ϵ , ϵ .
Combining, we have
q s t a r t , w , f 1 , M f q 0 , w , e f n , M f q , ϵ , f 1 , M f q a c c e p t , ϵ , ϵ .
Therefore, q s t a r t , w , f * , M f q a c c e p t , ϵ , ϵ .
Therefore, M f accepts w .
Therefore, w ∈ L(M_f), and hence L(M_e) ⊆ L(M_f).
Conversely, assume w L ( M f ) .
q s t a r t , w , f * , M f q a c c e p t , ϵ , γ for some γ Γ f * .
Since there exists no transition in one step to go from q s t a r t to q a c c e p t , there must exist configurations ( q 1 , u 1 , γ 1 ) , ( q 2 , u 2 , γ 2 ) , ( q i , u i , γ i ) , q n , u n , γ n where n 1 , u i Σ * , γ i   Γ f * for 1 i n , such that
q s t a r t , w , f 1 , M f ( q 1 , u 1 , γ 1 ) 1 , M f 1 , M f ( q i , u i , γ i ) 1 , M f 1 , M f q n , u n , γ n 1 , M f   q a c c e p t , ϵ , γ .
Note that q i q s t a r t because each q i has both incoming and outgoing arrows whereas q s t a r t has only outgoing arrows and q i q a c c e p t because q a c c e p t has only incoming arrows.
Therefore, for 1 i n , q i Q .
Claim 1. q 1 , u 1 , γ 1 = q 0 , w , e f .
δ f q s t a r t , ϵ , f = q 0 , e f by T1.
Therefore, q s t a r t , w , f 1 , M f q 0 , w , e f .
Since q s t a r t , w , f 1 , M f ( q 1 , u 1 , γ 1 ) , and by T4, δ f q s t a r t , a , A = for any other combination of a , A ( ϵ , f ) , we must have
q 1 , u 1 , γ 1 = q 0 , w , e f .
Claim 2. For 1 i n , γ i ' Γ * such that γ i = γ i ' f .
Claim 2 can be proved by induction on i .
For i = 1 ,
q 1 , u 1 , γ 1 = q 0 , w , e f (By Claim 1)
Therefore, γ 1 = e f .
Take γ 1 ' = e .
γ 1 = γ 1 ' f .
e Γ e Γ * γ 1 ' Γ * .
The statement is true for i = 1 .
For induction hypothesis ( i = k ) , assume γ k = γ k ' f for 1 k n 1 , γ k ' Γ * .
Consider configuration move of
q k , u k , γ k 1 , M f q k + 1 , u k + 1 , γ k + 1 which is equivalent to q k a , b c , δ f q k + 1 where
a Σ ϵ , b Γ f ϵ , c Γ f * , u k = a u k + 1 , γ k = b γ k " , γ k + 1 = c γ k " , γ k " Γ f * .
Since 1 k < k + 1 n , q k , q k + 1 Q .
This configuration move could not have come from T1 because q k q s t a r t .
By induction hypothesis, γ k = γ k ' f & γ k ' Γ * .
We examine two situations: (i) γ k ' = ϵ and (ii) γ k ' ϵ .
(i) If γ k ' = ϵ
γ k = f
b = ϵ or b = f .
If b = f , q k a , f c , δ f q k + 1 .
This transition must have come from T2 where
δ f q k , ϵ , f = q a c c e p t , ϵ , a = ϵ , c = ϵ .
Therefore, q k + 1 = q a c c e p t , which contradicts q k + 1 Q .
Therefore, b = ϵ .
Therefore, q k , a , b = q k , a , ϵ Q × Σ ϵ × Γ ϵ .
By T3, δ f q k , a , b = δ q k , a , b .
Therefore, q k a , ϵ c , δ q k + 1 .
Therefore, c Γ * .
γ k = f = b γ k " .
Since b = ϵ , γ k " = f .
Therefore, γ k + 1 = c γ k " = c f .
The statement is true for i = k + 1 .
(ii) If γ k ' ϵ Since γ k ' Γ * , f is not a symbol in γ k ' .
Since γ k = γ k ' f , the leftmost symbol of γ k cannot be f .
Therefore, the configuration move of q k , u k , γ k 1 , M f q k + 1 , u k + 1 , γ k + 1 could not have come from T2.
Therefore, it must have come from T3 where δ = δ f .
Therefore, q k , u k , γ k 1 , M e q k + 1 , u k + 1 , γ k + 1 by Proposition 2.49.
Therefore, q k a , b c , δ q k + 1 where
a Σ ϵ , b Γ ϵ , c Γ * , u k = a u k + 1 , γ k = b γ k " , γ k + 1 = c γ k " , γ k " Γ f * .
Note that b f because f Γ ϵ .
By induction hypothesis, γ k = γ k ' f & γ k ' Γ * .
Therefore, γ k = γ k ' f = b γ k " .
γ k " = ϵ γ k ' f = b γ k ' f Γ ϵ , which is a contradiction because f Γ ϵ .
Therefore, γ k " ϵ .
The rightmost symbol of γ k " must be f .
γ k ' ' ' Γ f * such that γ k " = γ k ' ' ' f .
Therefore, γ k ' f = b γ k ' ' ' f .
Therefore, γ k ' = b γ k ' ' ' .
Since γ k ' Γ * by induction hypothesis, γ k ' ' ' Γ * .
γ k + 1 = c γ k " = c γ k ' ' ' f = γ k + 1 ' f where γ k + 1 ' = c γ k ' ' ' .
Since c Γ * & γ k ' ' ' Γ * , γ k + 1 ' Γ * .
The statement is also true for i = k + 1 .
Therefore, the statement is true for i = k + 1 whether or not γ k ' = ϵ .
This completes the proof of Claim 2.
Claim 3. 
q n , u n , γ n = q n , ϵ , f .
We know from above that q n , u n , γ n 1 , M f   q a c c e p t , ϵ , γ .
The only way to transition from a state in Q to q a c c e p t is via T2 where
δ f q n , ϵ , f = q a c c e p t , ϵ .
Equivalently, q n ϵ , f ϵ , δ f q a c c e p t .
Therefore, q n , u n , γ n 1 , M f   q a c c e p t , u n , γ n " where γ n = f γ n " & γ n " Γ f * .
Therefore, q a c c e p t , u n , γ n " = q a c c e p t , ϵ , γ .
Therefore, u n = ϵ & γ = γ n " .
By Claim 2, γ n = γ n ' f where γ n ' Γ * .
Therefore, γ n = γ n ' f = f γ n " .
If γ n " ϵ , its rightmost symbol must be f .
Let γ n " = γ n ' ' ' f for some γ n ' ' ' Γ f * .
Therefore, γ n ' f = f γ n ' ' ' f .
Therefore, γ n ' = f γ n ' ' ' , which is a contradiction because by Claim 2, γ n ' Γ * but f is not in Γ * .
Therefore, γ n " = ϵ .
γ = γ n " = ϵ .
γ n = f γ n " = f .
q n , u n , γ n = q n , ϵ , f and Claim 3 is proved.
Claim 4. 
$(q_i, u_i, \gamma_i') \vdash^1_{M_e} (q_{i+1}, u_{i+1}, \gamma_{i+1}')$ for $1 \le i \le n-1$.
From above, we have $(q_i, u_i, \gamma_i) \vdash^1_{M_f} (q_{i+1}, u_{i+1}, \gamma_{i+1})$.
By Claim 2, $(q_i, u_i, \gamma_i' f) \vdash^1_{M_f} (q_{i+1}, u_{i+1}, \gamma_{i+1}' f)$ where
$\gamma_i', \gamma_{i+1}' \in \Gamma^*$ and, for some $a \in \Sigma_\epsilon$, $u_i = au_{i+1}$ with $u_i, u_{i+1} \in \Sigma^*$.
Equivalently, $q_i \xrightarrow{a,\; b \to c}_{\delta_f} q_{i+1}$ where $\gamma_i' f = b\eta$ and $\gamma_{i+1}' f = c\eta$, with $b \in (\Gamma_f)_\epsilon$, $c \in \Gamma_f^*$, $\eta \in \Gamma_f^*$.
Since $q_i \neq q_{start}$, the above move could not have come from T1.
Since $q_{i+1} \neq q_{accept}$, it could not have come from T2.
Therefore, it must have come from T3, where $\delta_f(q_i, a, b) = \delta(q_i, a, b)$.
Therefore, $b \in \Gamma_\epsilon$, $c \in \Gamma^*$, and $q_i \xrightarrow{a,\; b \to c}_{\delta} q_{i+1}$.
Since $\gamma_i' f = b\eta$ and $b \neq f$, the rightmost symbol of $\eta$ must be $f$.
Therefore, $\eta = \theta f$ for some $\theta \in \Gamma_f^*$.
Therefore, $\gamma_i' f = b\theta f$ and $\gamma_{i+1}' f = c\theta f$.
Therefore, $\gamma_i' = b\theta$ and $\gamma_{i+1}' = c\theta$.
Since $\gamma_i' \in \Gamma^*$ by Claim 2, $\theta \in \Gamma^*$.
Therefore, $(q_i, u_i, \gamma_i') \vdash^1_{M_e} (q_{i+1}, u_{i+1}, \gamma_{i+1}')$.
This completes the proof of Claim 4.
By Claim 4, we now have
$(q_1, u_1, \gamma_1') \vdash^1_{M_e} (q_2, u_2, \gamma_2') \vdash^1_{M_e} \cdots \vdash^1_{M_e} (q_{n-1}, u_{n-1}, \gamma_{n-1}') \vdash^1_{M_e} (q_n, u_n, \gamma_n')$.
By Claim 1, $(q_1, u_1, \gamma_1) = (q_0, w, ef)$.
Therefore, $q_1 = q_0$, $u_1 = w$, $\gamma_1 = ef$.
By Claim 2, $\gamma_1 = \gamma_1' f$.
Therefore, $\gamma_1' f = ef$, and hence $\gamma_1' = e$.
Therefore, $(q_1, u_1, \gamma_1') = (q_0, w, e)$.
By Claim 3, $(q_n, u_n, \gamma_n) = (q_n, \epsilon, f)$.
Therefore, $u_n = \epsilon$ and $\gamma_n = f$.
By Claim 2, $\gamma_n = \gamma_n' f$.
Therefore, $\gamma_n' f = f$, and hence $\gamma_n' = \epsilon$.
Therefore, $(q_n, u_n, \gamma_n') = (q_n, \epsilon, \epsilon)$.
Therefore,
$(q_0, w, e) \vdash^1_{M_e} (q_2, u_2, \gamma_2') \vdash^1_{M_e} \cdots \vdash^1_{M_e} (q_{n-1}, u_{n-1}, \gamma_{n-1}') \vdash^1_{M_e} (q_n, \epsilon, \epsilon)$.
That is, $(q_0, w, e) \vdash^*_{M_e} (q_n, \epsilon, \epsilon)$.
Therefore, $M_e$ accepts $w$, i.e., $w \in L(M_e)$.
Therefore, $w \in L(M_f) \Rightarrow w \in L(M_e)$, so $L(M_f) \subseteq L(M_e)$.
This completes the proof of Lemma 2.52.
Lemma 2.53. For any $PDA$, $M_f$, that accepts by final state, there is a $PDA$, $M_e$, that accepts by empty stack such that $L(M_e) = L(M_f)$.
Proof. Let $M_f = (Q, \Sigma, \Gamma, \delta, q_0, f, F)$ where $f \in \Gamma$ is the initial stack symbol of $M_f$.
Construct $M_e = (Q_e, \Sigma, \Gamma_e, \delta_e, q_{start}, e)$ where
$Q_e = Q \cup \{q_{start}, q_{empty}\}$; $\Gamma_e = \Gamma \cup \{e\}$;
$q_{start}$ and $q_{empty}$ are newly created states (not in $Q$), with $q_{start}$ serving as the start state of $M_e$ and $q_{empty}$ serving as the state in which $M_e$ begins the process of emptying the stack (without further consuming input); and
$e$ is a newly created stack symbol (not in $\Gamma$) serving as the initial stack symbol of $M_e$.
The transition function $\delta_e$ of $M_e$ is defined as follows.
T1: $\delta_e(q_{start}, \epsilon, e) = \{(q_0, fe)\}$, i.e., $q_{start} \xrightarrow{\epsilon,\; e \to fe}_{\delta_e} q_0$.
T2: $\forall (q, a, A) \in Q \times \Sigma_\epsilon \times \Gamma_\epsilon$, $\delta_e(q, a, A) = \delta(q, a, A)$, where $\delta : Q \times \Sigma_\epsilon \times \Gamma_\epsilon \to \mathcal{P}(Q \times \Gamma^*)$.
T3: $\forall q \in F$, $\delta_e(q, \epsilon, \epsilon) = \{(q_{empty}, \epsilon)\}$, i.e., $q \xrightarrow{\epsilon,\; \epsilon \to \epsilon}_{\delta_e} q_{empty}$.
T4: $\forall A \in (\Gamma_e)_\epsilon$, $\delta_e(q_{empty}, \epsilon, A) = \{(q_{empty}, \epsilon)\}$, i.e., $q_{empty} \xrightarrow{\epsilon,\; A \to \epsilon}_{\delta_e} q_{empty}$.
T5: For any other $(q, a, A) \in Q_e \times \Sigma_\epsilon \times (\Gamma_e)_\epsilon$, $\delta_e(q, a, A) = \emptyset$.
The construction is now complete. It remains to show $L(M_e) = L(M_f)$.
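The transitions T1–T4 are mechanical enough to exercise in code. The following is a minimal Python sketch of our own (not notation from the text): `to_empty_stack` builds $\delta_e$ from $\delta$ per T1–T3, the simulator handles T4 directly and tests acceptance by empty stack with a bounded breadth-first search, and the example $M_f$ for $\{a^nb^n : n \ge 1\}$ is a hypothetical illustration.

```python
from collections import deque

def to_empty_stack(delta, q0, f, F):
    """Build delta_e of M_e from M_f per T1-T3.  A transition table maps
    (state, input, top) -> set of (state, push); '' stands for epsilon.
    The stack is a string whose leftmost character is the top."""
    d = {('q_start', '', 'e'): {(q0, f + 'e')}}        # T1: push f on top of e
    for key, moves in delta.items():                   # T2: copy M_f's moves
        d.setdefault(key, set()).update(moves)
    for q in F:                                        # T3: accept states may
        d.setdefault((q, '', ''), set()).add(('q_empty', ''))  # enter q_empty
    return d                                           # T4 is handled below

def accepts_by_empty_stack(delta_e, w, limit=10000):
    """Bounded BFS over configurations (state, unread input, stack)."""
    seen, todo = set(), deque([('q_start', w, 'e')])
    while todo and len(seen) < limit:
        cfg = todo.popleft()
        if cfg in seen:
            continue
        seen.add(cfg)
        q, u, g = cfg
        if u == '' and g == '':
            return True                                # input read, stack empty
        if q == 'q_empty' and g:
            todo.append((q, u, g[1:]))                 # T4: pop unconditionally
            continue
        for (p, a, top), moves in delta_e.items():
            if p != q or (a and not u.startswith(a)) or (top and not g.startswith(top)):
                continue
            for (r, push) in moves:
                todo.append((r, u[len(a):], push + g[len(top):]))
    return False

# Hypothetical M_f accepting {a^n b^n : n >= 1} by final state (q2 in F):
delta_f = {
    ('q0', 'a', ''):  {('q0', 'A')},    # push one A per a
    ('q0', 'b', 'A'): {('q1', '')},     # first b pops an A
    ('q1', 'b', 'A'): {('q1', '')},     # remaining b's pop A's
    ('q1', '', 'Z'):  {('q2', 'Z')},    # bottom marker reached: accept state
}
delta_e = to_empty_stack(delta_f, 'q0', 'Z', {'q2'})
```

The bounded search is only a demonstration aid: a real PDA simulator must guard against unproductive $\epsilon$-loops, which the `seen` set and the `limit` do here.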
Suppose $w \in L(M_f)$.
Then $(q_0, w, f) \vdash^n_{M_f} (q, \epsilon, \gamma)$ for some $n \ge 0$, $q \in F$, $\gamma \in \Gamma^*$.
By T1, $\delta_e(q_{start}, \epsilon, e) = \{(q_0, fe)\}$.
Therefore, $(q_{start}, w, e) \vdash^1_{M_e} (q_0, w, fe)$.
By Proposition 2.49, we have
$(q_0, w, f) \vdash^n_{M_f} (q, \epsilon, \gamma) \Rightarrow (q_0, w, f) \vdash^n_{M_e} (q, \epsilon, \gamma)$ (because $Q \subseteq Q_e$ and $\delta_e(q, a, A) = \delta(q, a, A)$).
By Proposition 2.50, we have
$(q_0, w, f) \vdash^n_{M_e} (q, \epsilon, \gamma) \Rightarrow (q_0, w\epsilon, fe) \vdash^n_{M_e} (q, \epsilon\epsilon, \gamma e)$.
That is, $(q_0, w, f) \vdash^n_{M_e} (q, \epsilon, \gamma) \Rightarrow (q_0, w, fe) \vdash^n_{M_e} (q, \epsilon, \gamma e)$.
Therefore, $(q_{start}, w, e) \vdash^1_{M_e} (q_0, w, fe) \vdash^n_{M_e} (q, \epsilon, \gamma e)$.
Therefore, $(q_{start}, w, e) \vdash^*_{M_e} (q, \epsilon, \gamma e)$.
By T3, $(q, \epsilon, \gamma e) \vdash^1_{M_e} (q_{empty}, \epsilon, \gamma e)$.
By repeated application of T4, $(q_{empty}, \epsilon, \gamma e) \vdash^*_{M_e} (q_{empty}, \epsilon, \epsilon)$.
Combined, $(q_{start}, w, e) \vdash^*_{M_e} (q, \epsilon, \gamma e) \vdash^1_{M_e} (q_{empty}, \epsilon, \gamma e) \vdash^*_{M_e} (q_{empty}, \epsilon, \epsilon)$.
Therefore, $(q_{start}, w, e) \vdash^*_{M_e} (q_{empty}, \epsilon, \epsilon)$.
Therefore, $M_e$ accepts $w$ and $w \in L(M_e)$.
Therefore, $L(M_f) \subseteq L(M_e)$.
Conversely, assume $w \in L(M_e)$.
There exist configurations $(q_1, u_1, \gamma_1), (q_2, u_2, \gamma_2), \ldots, (q_n, u_n, \gamma_n)$ where
$n \ge 0$, $u_i \in \Sigma^*$, $\gamma_i \in \Gamma_e^*$ for $1 \le i \le n$, and $q_1, q_2, \ldots, q_n \in Q_e$ such that
$(q_{start}, w, e) \vdash^1_{M_e} (q_1, u_1, \gamma_1) \vdash^1_{M_e} \cdots \vdash^1_{M_e} (q_i, u_i, \gamma_i) \vdash^1_{M_e} \cdots \vdash^1_{M_e} (q_n, u_n, \gamma_n) \vdash^1_{M_e} (q, \epsilon, \epsilon)$.
Note that $q_{start} \notin \{q_1, q_2, \ldots, q_n\}$ because $q_{start}$ has no incoming arrows.
If $n = 0$, then $(q_{start}, w, e) \vdash^1_{M_e} (q, \epsilon, \epsilon)$.
By T1, $(q_{start}, w, e) \vdash^1_{M_e} (q_0, w, fe)$, and this is the only configuration move out of $(q_{start}, w, e)$ because $\delta_e(q_{start}, \epsilon, e) = \{(q_0, fe)\}$.
Therefore, $(q, \epsilon, \epsilon) = (q_0, w, fe)$, which is a contradiction because $fe \neq \epsilon$.
Therefore, $n \ge 1$.
In addition, we have $(q_1, u_1, \gamma_1) = (q_0, w, fe)$.
To move from $(q_0, w, fe)$ to the final configuration $(q, \epsilon, \epsilon)$, $M_e$ must pop $e$ at some point, which can only be done by T4, because T1 does not pop $e$ and T2 and T3 cannot move on $e$.
Therefore, we must have $q_{empty}$ somewhere between $(q_0, w, fe)$ and $(q, \epsilon, \epsilon)$.
However, we can only transition into $q_{empty}$ from a state $p \in F$ (T3).
Therefore, there must be a $p \in F$ somewhere between $(q_0, w, fe)$ and $(q, \epsilon, \epsilon)$.
Let $m = \mathrm{Max}\{\, i \mid 1 \le i \le n,\; q_i \in F \,\}$.
Claim 1. For $1 \le i \le m$, $q_i \neq q_{empty}$, and hence $\gamma_i$ has $e$ as its rightmost symbol.
To prove Claim 1, assume for contradiction that $\exists k$ such that $1 \le k \le m$ and $q_k = q_{empty}$.
By T4, $q_{k+1} = q_{k+2} = \cdots = q_m = q_{empty}$.
This contradicts $q_m \in F$.
Therefore, $q_i \neq q_{empty}$ for $1 \le i \le m$.
As mentioned above, $(q_1, u_1, \gamma_1) = (q_0, w, fe)$.
Therefore, $\gamma_1 = fe$.
Since the only way to pop $e$ is by T4, which requires being in $q_{empty}$, and no $q_i$ with $1 \le i \le m$ is $q_{empty}$, we conclude that $e$ remains sitting at the bottom of the stack as the machine moves from $q_1$ to $q_m$. Therefore, $\gamma_i$ has $e$ as its rightmost symbol for $1 \le i \le m$.
Claim 2. For the configuration $(q_m, u_m, \gamma_m)$, $u_m = \epsilon$.
To prove Claim 2, assume for contradiction that $u_m \neq \epsilon$.
At this configuration, there are two possible ways for the machine to move: (i) continue to simulate $M_f$ using T2, or (ii) enter $q_{empty}$ using T3.
For (i), the machine continues to read the input but never enters an accept state again, because it has passed $q_m$, the last accept state in this computation. By the time the machine finishes reading the entire input, it comes to a stop without ever having entered $q_{empty}$. Thus $e$ remains in the stack when everything stops, contradicting the assumption that the computation ends at $(q, \epsilon, \epsilon)$.
For (ii), the machine enters $q_{empty}$ via the T3 transition
$q_m \xrightarrow{\epsilon,\; \epsilon \to \epsilon}_{\delta_e} q_{empty}$.
Then $q_{m+1} = q_{empty}$, $u_{m+1} = u_m$, $\gamma_{m+1} = \gamma_m$.
That is, $(q_m, u_m, \gamma_m) \vdash^1_{M_e} (q_{empty}, u_m, \gamma_m)$. Once the machine has entered $q_{empty}$, it follows T4, which is $q_{empty} \xrightarrow{\epsilon,\; A \to \epsilon}_{\delta_e} q_{empty}$, popping symbols from the stack while remaining in $q_{empty}$ and reading no input.
Therefore, $q_{m+1} = q_{m+2} = \cdots = q_n = q = q_{empty}$ and
$u_m = u_{m+1} = u_{m+2} = \cdots = u_n = \epsilon$ (the last configuration is $(q, \epsilon, \epsilon)$).
Therefore, both (i) and (ii) contradict the original assumption that $u_m \neq \epsilon$.
Therefore, $u_m = \epsilon$.
Claim 3. 
For $1 \le i \le m$, $\exists \gamma_i' \in \Gamma^*$ such that $\gamma_i = \gamma_i' e$.
The proof of Claim 3 is by induction on $i$.
We showed at the beginning that $(q_1, u_1, \gamma_1) = (q_0, w, fe)$. Therefore, $\gamma_1 = fe$.
Since $f \in \Gamma$ and $\Gamma \subseteq \Gamma^*$, $f \in \Gamma^*$.
Taking $\gamma_1' = f$, we have $\gamma_1 = \gamma_1' e$.
The statement is true for $i = 1$.
For the induction hypothesis, assume $\gamma_k = \gamma_k' e$ with $\gamma_k' \in \Gamma^*$ for some $1 \le k \le m-1$.
We showed at the beginning that $q_{start} \notin \{q_1, q_2, \ldots, q_n\}$, and $q_{empty} \notin \{q_1, q_2, \ldots, q_m\}$ by Claim 1.
Therefore, $q_1, q_2, \ldots, q_m \in Q$.
Consider the configuration move
$(q_k, u_k, \gamma_k) \vdash^1_{M_e} (q_{k+1}, u_{k+1}, \gamma_{k+1})$.
This move could not have come from T3 or T4 because $q_k \neq q_{empty}$ and $q_{k+1} \neq q_{empty}$.
The move must have come from T2, where $\delta_e = \delta$.
By Proposition 2.49, we have
$(q_k, u_k, \gamma_k) \vdash^1_{M_f} (q_{k+1}, u_{k+1}, \gamma_{k+1})$.
Therefore, $q_k \xrightarrow{a,\; b \to c}_{\delta} q_{k+1}$ where
$a \in \Sigma_\epsilon$, $b \in \Gamma_\epsilon$, $c \in \Gamma^*$, $u_k = au_{k+1}$, $\gamma_k = b\gamma_k''$, $\gamma_{k+1} = c\gamma_k''$, $\gamma_k'' \in \Gamma_e^*$.
Note that $b \in \Gamma_\epsilon \Rightarrow b \neq e$.
By the induction hypothesis, $\gamma_k = \gamma_k' e$ and $\gamma_k' \in \Gamma^*$.
Therefore, $\gamma_k' e = b\gamma_k''$.
If $\gamma_k'' = \epsilon$, then $\gamma_k' e = b$, which forces $\gamma_k' = \epsilon$ and $b = e$, a contradiction.
Therefore, $\gamma_k'' \neq \epsilon$.
The rightmost symbol of $\gamma_k''$ must be $e$ (because $\gamma_k' e = b\gamma_k''$).
$\exists \gamma_k''' \in \Gamma_e^*$ such that $\gamma_k'' = \gamma_k''' e$.
Therefore, $\gamma_k' e = b\gamma_k''' e$.
Therefore, $\gamma_k' = b\gamma_k'''$.
Since $\gamma_k' \in \Gamma^*$ by the induction hypothesis, $\gamma_k''' \in \Gamma^*$.
$\gamma_{k+1} = c\gamma_k'' = c\gamma_k''' e = \gamma_{k+1}' e$ where $\gamma_{k+1}' = c\gamma_k'''$.
Since $c \in \Gamma^*$ and $\gamma_k''' \in \Gamma^*$, $\gamma_{k+1}' \in \Gamma^*$.
The statement is also true for $i = k+1$.
This completes the proof of Claim 3.
Claim 4. 
$(q_i, u_i, \gamma_i') \vdash^1_{M_f} (q_{i+1}, u_{i+1}, \gamma_{i+1}')$ for $1 \le i \le m-1$.
By assumption, $(q_i, u_i, \gamma_i) \vdash^1_{M_e} (q_{i+1}, u_{i+1}, \gamma_{i+1})$.
By Claim 3, $(q_i, u_i, \gamma_i' e) \vdash^1_{M_e} (q_{i+1}, u_{i+1}, \gamma_{i+1}' e)$. By Claim 1, $q_i \neq q_{empty}$ and $q_{i+1} \neq q_{empty}$ for $1 \le i \le m-1$.
Also, we pointed out at the beginning that $q_i \neq q_{start}$ for $1 \le i \le n$.
Therefore, this computation must have come from T2, where $\delta_e = \delta$.
By Proposition 2.49, $(q_i, u_i, \gamma_i' e) \vdash^1_{M_f} (q_{i+1}, u_{i+1}, \gamma_{i+1}' e)$.
Equivalently, $q_i \xrightarrow{a,\; b \to c}_{\delta} q_{i+1}$ where
$a \in \Sigma_\epsilon$, $b \in \Gamma_\epsilon$, $c \in \Gamma^*$, $\gamma_i' e = b\eta$, $\gamma_{i+1}' e = c\eta$, $\eta \in \Gamma_e^*$, $u_i = au_{i+1}$.
Note that $b \in \Gamma_\epsilon \Rightarrow b \neq e$.
If $\eta = \epsilon$, then $\gamma_i' e = b$, which forces $\gamma_i' = \epsilon$ and $b = e$, a contradiction.
Therefore, $\eta \neq \epsilon$.
The rightmost symbol of $\eta$ must be $e$ (because $\gamma_i' e = b\eta$).
Let $\eta = \theta e$ where $\theta \in \Gamma_e^*$.
Therefore, $\gamma_i' e = b\theta e$ and $\gamma_{i+1}' e = c\theta e$.
Therefore, $\gamma_i' = b\theta$ and $\gamma_{i+1}' = c\theta$.
Since $\gamma_i' \in \Gamma^*$ by Claim 3, $\theta \in \Gamma^*$.
Since $c \in \Gamma^*$ and $\theta \in \Gamma^*$, $\gamma_{i+1}' \in \Gamma^*$.
Therefore, $(q_i, u_i, \gamma_i') \vdash^1_{M_f} (q_{i+1}, u_{i+1}, \gamma_{i+1}')$.
This completes the proof of Claim 4.
By Claim 4, we now have
$(q_1, u_1, \gamma_1') \vdash^1_{M_f} (q_2, u_2, \gamma_2') \vdash^1_{M_f} \cdots \vdash^1_{M_f} (q_{m-1}, u_{m-1}, \gamma_{m-1}') \vdash^1_{M_f} (q_m, u_m, \gamma_m')$.
As shown at the beginning, $(q_1, u_1, \gamma_1) = (q_0, w, fe)$.
Therefore, $\gamma_1 = fe$.
By Claim 3, $\gamma_1 = \gamma_1' e$.
Therefore, $\gamma_1' = f$.
Therefore, $(q_1, u_1, \gamma_1') = (q_0, w, f)$.
By the definition of $m$, $q_m \in F$.
By Claim 2, $u_m = \epsilon$.
Therefore, $(q_m, u_m, \gamma_m') = (q_m, \epsilon, \gamma_m')$.
Therefore, $(q_0, w, f) \vdash^*_{M_f} (q_m, \epsilon, \gamma_m')$ where $q_m \in F$.
Therefore, $M_f$ accepts $w$ and $w \in L(M_f)$.
Therefore, $L(M_e) \subseteq L(M_f)$.
This completes the proof of Lemma 2.53.
Combining Lemma 2.52 and Lemma 2.53, we have the following theorem.
Theorem 2.54. 
For any $PDA$, $M_e$, that accepts by empty stack, there is a $PDA$, $M_f$, that accepts by final state such that $L(M_e) = L(M_f)$.
Conversely, for any $PDA$, $M_f$, that accepts by final state, there is a $PDA$, $M_e$, that accepts by empty stack such that $L(M_e) = L(M_f)$.

2.4. Equivalence of CFG and PDA

In this section, we shall prove that context-free grammars and pushdown automata are equivalent in power, in that any context-free language is recognized by a pushdown automaton, and vice versa.
Definition 2.54. 
Let $G = (V, \Sigma, R, S)$ be a $CFG$.
Let $A \in V$ and $y \in (V \cup \Sigma)^*$.
$A$ is called the leftmost variable in $y$ iff $\exists x \in \Sigma^*$ and $\alpha \in (V \cup \Sigma)^*$ such that $y = xA\alpha$.
$x$ is called the head of $y$ (written $x = Head(y)$), $A\alpha$ is called the body of $y$ (written $A\alpha = Body(y)$), and $\alpha$ is called the tail of $y$ (written $\alpha = Tail(y)$).
It is clear from this definition that $y = Head(y)\,A\,Tail(y) = Head(y)\,Body(y)$, and if $y \in \Sigma^*$, then $Head(y) = y$ and $Body(y) = Tail(y) = \epsilon$.
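In code, the decomposition of Definition 2.54 is a single left-to-right scan. Here is a small Python sketch of our own; representing symbols as single characters and passing the variable set explicitly are illustrative assumptions, not conventions from the text:

```python
def decompose(y, variables):
    """Return (Head(y), A, Tail(y)) for a sentential form y, where A is the
    leftmost variable, per Definition 2.54.  If y has no variable (y in
    Sigma*), return (y, '', ''): Head(y) = y and Body(y) = Tail(y) = epsilon."""
    for i, s in enumerate(y):
        if s in variables:
            return y[:i], s, y[i + 1:]   # Head, leftmost variable, Tail
    return y, '', ''

head, A, tail = decompose('abSbA', {'S', 'A'})
# Body(y) = A + Tail(y), and y = Head(y) + Body(y) always holds.
```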
Definition 2.55. Let $G = (V, \Sigma, R, S)$ be a $CFG$.
For $x, y \in (V \cup \Sigma)^*$, $x$ is a prefix of $y$ (written $x \mathrel{PRE} y$) iff $\exists z \in (V \cup \Sigma)^*$ such that $xz = y$.
Proposition 2.56. $PRE$ is a reflexive and transitive relation on $(V \cup \Sigma)^*$.
Proof. 
$\forall x \in (V \cup \Sigma)^*$, $\epsilon \in (V \cup \Sigma)^*$ and $x\epsilon = x$.
Therefore, $x \mathrel{PRE} x$ and $PRE$ is reflexive.
$\forall x, y, z \in (V \cup \Sigma)^*$, if $x \mathrel{PRE} y$ and $y \mathrel{PRE} z$, then
$\exists x', y' \in (V \cup \Sigma)^*$ such that $xx' = y$ and $yy' = z$.
Therefore, $xx'y' = z$.
Since $x'y' \in (V \cup \Sigma)^*$, $x \mathrel{PRE} z$.
Therefore, $PRE$ is transitive.
This completes the proof of Proposition 2.56.
Proposition 2.57. 
Let $G = (V, \Sigma, R, S)$ be a $CFG$.
Let $w \in \Sigma^*$ and $\gamma_i \in (V \cup \Sigma)^*$ for all $i \in \{1, 2, \ldots, n\}$.
For $i \in \{1, 2, \ldots, n-1\}$, let $A_i$ be the leftmost variable in $\gamma_i$ and let $A_i \to \beta_i$ be the rule applied in $\gamma_i \Rightarrow_{lm} \gamma_{i+1}$.
Let $S = \gamma_1 \Rightarrow_{lm} \gamma_2 \Rightarrow_{lm} \gamma_3 \Rightarrow_{lm} \cdots \Rightarrow_{lm} \gamma_n = w$.
The following statements are true:
(a) $Head(\gamma_i) \mathrel{PRE} Head(\gamma_{i+1})$;
(b) $\forall\, 1 \le i < j \le n$, $Head(\gamma_i) \mathrel{PRE} Head(\gamma_j)$, and hence
$Head(\gamma_1) \mathrel{PRE} Head(\gamma_2) \mathrel{PRE} \cdots \mathrel{PRE} Head(\gamma_n) = w$;
(c) $Head(\gamma_{i+1}) = Head(\gamma_i)Head(\beta_i)$ and $Body(\gamma_{i+1}) = Body(\beta_i)Tail(\gamma_i)$;
(d) if $y_i \in \Sigma^*$ is such that $Head(\gamma_i)y_i = w$, then $Head(\beta_i) \mathrel{PRE} y_i$.
Proof. 
Since $\gamma_i \Rightarrow_{lm} \gamma_{i+1}$ via $A_i \to \beta_i$ and $\gamma_i = Head(\gamma_i)\,A_i\,Tail(\gamma_i)$, we have $\gamma_{i+1} = Head(\gamma_i)\,\beta_i\,Tail(\gamma_i)$.
Therefore, $\gamma_{i+1} = Head(\gamma_i)\,Head(\beta_i)\,B_i\,Tail(\beta_i)\,Tail(\gamma_i)$, where $B_i$ is the leftmost variable in $\beta_i$.
Since $Head(\gamma_i) \in \Sigma^*$ and $Head(\beta_i) \in \Sigma^*$, $B_i$ is also the leftmost variable in $\gamma_{i+1}$.
Therefore, $Head(\gamma_{i+1}) = Head(\gamma_i)Head(\beta_i)$.
Therefore, $Head(\gamma_i) \mathrel{PRE} Head(\gamma_{i+1})$, which proves (a).
(b) then follows because $PRE$ is transitive.
For (c), $Head(\gamma_{i+1}) = Head(\gamma_i)Head(\beta_i)$ is established in the proof of (a).
It is also established in the proof of (a) that $\gamma_{i+1} = Head(\gamma_i)\,Head(\beta_i)\,B_i\,Tail(\beta_i)\,Tail(\gamma_i)$.
Therefore, $Body(\gamma_{i+1}) = B_i\,Tail(\beta_i)\,Tail(\gamma_i) = Body(\beta_i)\,Tail(\gamma_i)$ (since $B_i$ is the leftmost variable in $\beta_i$).
For (d), from (b), $Head(\gamma_{i+1}) \mathrel{PRE} Head(\gamma_n) = w$.
Therefore, $\exists y_{i+1} \in \Sigma^*$ such that $Head(\gamma_{i+1})y_{i+1} = w$.
Therefore, $Head(\gamma_{i+1})y_{i+1} = Head(\gamma_i)y_i$.
By (c), $Head(\gamma_i)Head(\beta_i)y_{i+1} = Head(\gamma_i)y_i$.
Therefore, $Head(\beta_i)y_{i+1} = y_i$.
Therefore, $Head(\beta_i) \mathrel{PRE} y_i$.
This completes the proof of Proposition 2.57.
Lemma 2.58. For any $CFG$ $G$, $\exists$ a $PDA$ $M_e$ such that $L(G) = L(M_e)$.
Proof. 
Let $G = (V, \Sigma, R, S)$ be a $CFG$.
Construct $M_e = (\{q\}, \Sigma, V \cup \Sigma, \delta, q, S)$ where $\delta$ is the transition function defined as follows.
T1: $\delta(q, \epsilon, A) = \{(q, \beta) \mid A \to \beta \text{ is a rule in } R\}$ for every $A \in V$.
T2: $\delta(q, a, a) = \{(q, \epsilon)\}$ for every $a \in \Sigma$.
T3: For all other $(q, a, A) \in \{q\} \times \Sigma_\epsilon \times (V \cup \Sigma)_\epsilon$, $\delta(q, a, A) = \emptyset$.
Note that the start variable of $G$ is the start stack symbol of $M_e$.
It remains to show $L(G) = L(M_e)$.
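The two transition families T1 and T2 can be exercised directly. Below is a minimal Python sketch of our own (the dictionary encoding of rules and the pruning heuristic are assumptions, not part of the construction): the single-state PDA keeps a string over $V \cup \Sigma$ as its stack, T1 replaces a top variable by some rule body, T2 matches a top terminal against the next input symbol, and a bounded search tests membership.

```python
from collections import deque

def cfg_pda_accepts(rules, start, w, limit=50000):
    """Single-state PDA of the construction: stack alphabet V union Sigma,
    start stack symbol = start variable.  `rules` maps each variable to the
    list of its rule bodies (strings); every other symbol is a terminal."""
    seen, todo = set(), deque([(w, start)])
    while todo and len(seen) < limit:
        u, g = todo.popleft()              # (unread input, stack string)
        if (u, g) in seen:
            continue
        seen.add((u, g))
        if u == '' and g == '':
            return True                    # input consumed, stack empty
        if g == '':
            continue                       # stack emptied too early
        top, rest = g[0], g[1:]
        if top in rules:                   # T1: pop variable, push a body
            for body in rules[top]:
                new = body + rest
                # prune: never stack more terminals than input remains
                if sum(c not in rules for c in new) <= len(u):
                    todo.append((u, new))
        elif u and u[0] == top:            # T2: pop terminal matching input
            todo.append((u[1:], rest))
    return False

anbn = {'S': ['aSb', '']}                  # S -> aSb | epsilon
```

The pruning rule only discards branches that can no longer reach $(q, \epsilon, \epsilon)$, so it does not affect the accepted language; it merely keeps the nondeterministic search finite for this grammar.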
To prove $L(G) \subseteq L(M_e)$, suppose $w \in L(G)$.
Then $\exists \gamma_1, \gamma_2, \ldots, \gamma_n \in (V \cup \Sigma)^*$ such that
$S = \gamma_1 \Rightarrow_{lm} \gamma_2 \Rightarrow_{lm} \gamma_3 \Rightarrow_{lm} \cdots \Rightarrow_{lm} \gamma_n = w$.
$\forall i \in \{1, 2, \ldots, n\}$, $Head(\gamma_i) \mathrel{PRE} w$ by Proposition 2.57(b).
Therefore, $\exists y_i$ such that $Head(\gamma_i)y_i = w$.
Claim. 
$\forall i \in \{1, 2, \ldots, n\}$, $(q, w, S) \vdash^*_{M_e} (q, y_i, Body(\gamma_i))$ where $Head(\gamma_i)y_i = w$.
This Claim can be proved by induction on $i$.
For $i = 1$, $S = \gamma_1$ because $S = \gamma_1 \Rightarrow_{lm} \gamma_2 \Rightarrow_{lm} \cdots \Rightarrow_{lm} \gamma_n = w$.
$Head(\gamma_1) = Head(S) = \epsilon$ and $Body(\gamma_1) = Body(S) = S$.
$(q, w, S) \vdash^0_{M_e} (q, w, S) = (q_1, y_1, \alpha_1)$.
Therefore, $q = q_1$, $y_1 = w$, $\alpha_1 = S$.
Therefore, $Head(\gamma_1)y_1 = \epsilon w = w$ and $\alpha_1 = Body(\gamma_1)$.
Therefore, $(q, w, S) \vdash^*_{M_e} (q, y_1, Body(\gamma_1))$.
The statement is true for $i = 1$.
For the induction hypothesis, we have
$(q, w, S) \vdash^*_{M_e} (q, y_k, Body(\gamma_k))$ where $Head(\gamma_k)y_k = w$, for some $1 \le k \le n-1$.
Let $A_k$ be the leftmost variable in $\gamma_k$.
Since $\gamma_k \Rightarrow_{lm} \gamma_{k+1}$, there is a rule $A_k \to \beta_k$ with $\beta_k \in (V \cup \Sigma)^*$.
$(q, w, S) \vdash^*_{M_e} (q, y_k, Body(\gamma_k)) = (q, y_k, A_k\,Tail(\gamma_k))$
$\vdash^1_{M_e} (q, y_k, \beta_k\,Tail(\gamma_k))$ (by T1)
$= (q, y_k, Head(\beta_k)\,Body(\beta_k)\,Tail(\gamma_k))$. By Proposition 2.57(d), $Head(\beta_k) \mathrel{PRE} y_k$.
Therefore, $\exists y_{k+1} \in \Sigma^*$ such that $Head(\beta_k)y_{k+1} = y_k$.
Therefore, $(q, y_k, Head(\beta_k)\,Body(\beta_k)\,Tail(\gamma_k)) = (q, Head(\beta_k)y_{k+1}, Head(\beta_k)\,Body(\beta_k)\,Tail(\gamma_k))$
$\vdash^{|Head(\beta_k)|}_{M_e} (q, y_{k+1}, Body(\beta_k)\,Tail(\gamma_k))$ (by $|Head(\beta_k)|$ applications of T2)
$= (q, y_{k+1}, Body(\gamma_{k+1}))$ (by Proposition 2.57(c)).
Therefore, $(q, w, S) \vdash^*_{M_e} (q, y_{k+1}, Body(\gamma_{k+1}))$.
Since $Head(\beta_k)y_{k+1} = y_k$,
$Head(\gamma_k)Head(\beta_k)y_{k+1} = Head(\gamma_k)y_k$.
By Proposition 2.57(c), $Head(\gamma_{k+1}) = Head(\gamma_k)Head(\beta_k)$.
By the induction hypothesis, $Head(\gamma_k)y_k = w$.
Therefore, $Head(\gamma_{k+1})y_{k+1} = w$.
Therefore, the statement is true for $i = k+1$.
To complete the proof of $L(G) \subseteq L(M_e)$, set $i = n$ in the Claim:
$(q, w, S) \vdash^*_{M_e} (q, y_n, Body(\gamma_n))$ where $Head(\gamma_n)y_n = w$.
Since $\gamma_n = w \in \Sigma^*$, $Head(\gamma_n) = Head(w) = w$ and $Body(\gamma_n) = Body(w) = \epsilon$.
From $Head(\gamma_n) = w$ and $Head(\gamma_n)y_n = w$, we get $wy_n = w$, so $y_n = \epsilon$.
Therefore, $(q, w, S) \vdash^*_{M_e} (q, \epsilon, \epsilon)$.
Therefore, $w \in L(M_e)$.
Therefore, $L(G) \subseteq L(M_e)$.
(Note that along the way the machine faces many nondeterministic choices; for instance, T1 may replace the top variable by the body of any of its rules, and a bad choice can send the computation looping without ever stopping. Since the machine is nondeterministic, it does not have to take a bad option: it suffices that some sequence of choices leads to the accepting configuration $(q, \epsilon, \epsilon)$.)
To prove $L(M_e) \subseteq L(G)$, let $w \in L(M_e)$.
Then $(q, w, S) \vdash^*_{M_e} (q, \epsilon, \epsilon)$.
Claim. 
$\forall x \in \Sigma^*$ and $A \in V$, if $(q, x, A) \vdash^*_{M_e} (q, \epsilon, \epsilon)$, then $A \Rightarrow^* x$.
The proof of this Claim is by induction on the number of steps.
$\exists n \ge 1$ such that $(q, x, A) \vdash^n_{M_e} (q, \epsilon, \epsilon)$.
For $n = 1$, $(q, x, A) \vdash^1_{M_e} (q, \epsilon, \epsilon)$.
Since $A \in V$, the move must use T1, that is, $\delta(q, \epsilon, A) = \{(q, \beta) \mid A \to \beta \text{ is a rule in } R\}$.
Therefore, $(q, x, A) \vdash^1_{M_e} (q, x, \beta) = (q, \epsilon, \epsilon)$.
Therefore, $x = \beta = \epsilon$.
Therefore, $A \to \epsilon$ is a rule.
$A \Rightarrow \epsilon$ by Proposition 2.8(i).
Therefore, $A \Rightarrow^* x$.
The statement is true for $n = 1$.
For the induction hypothesis, assume the statement is true for all $n \le k$ with $k \ge 1$.
That is, if $(q, x, A) \vdash^n_{M_e} (q, \epsilon, \epsilon)$ with $n \le k$, then $A \Rightarrow^* x$.
For $n = k+1$, assume $(q, x, A) \vdash^{k+1}_{M_e} (q, \epsilon, \epsilon)$.
Since $A \in V$, the first move must be based on T1.
Therefore, $(q, x, A) \vdash^1_{M_e} (q, x, Y_1Y_2\cdots Y_m) \vdash^k_{M_e} (q, \epsilon, \epsilon)$ where
$A \to Y_1Y_2\cdots Y_m$ and $Y_i \in V \cup \Sigma$ for $i \in \{1, 2, \ldots, m\}$.
Since $(q, x, Y_1Y_2\cdots Y_m) \vdash^k_{M_e} (q, \epsilon, \epsilon)$, the machine must pop all the $Y_i$'s off the stack by the time it finishes reading the input $x$ and empties the stack.
Let $x_i$ be the portion of $x$ that the machine consumes while popping $Y_i$ off the stack and returning its stack head to the position right before popping $Y_{i+1}$, for $i = 1, 2, \ldots, m-1$.
Let $x_m$ be the last portion of $x$, consumed while popping $Y_m$ off the stack and eventually emptying it.
Note that if $Y_i$ is a terminal, then $x_i = Y_i$: the $PDA$ pops $Y_i$ using T2 while scanning the same symbol $x_i$ from the input, after which the stack head points at $Y_{i+1}$.
By these assumptions, we have $x = x_1x_2\cdots x_m$.
In addition, we have the following sequence of computations:
$(q, x_1x_2\cdots x_m, Y_1Y_2\cdots Y_m) \vdash^*_{M_e} (q, x_2x_3\cdots x_m, Y_2Y_3\cdots Y_m) \vdash^*_{M_e} \cdots \vdash^*_{M_e} (q, x_ix_{i+1}\cdots x_m, Y_iY_{i+1}\cdots Y_m) \vdash^*_{M_e} (q, x_{i+1}\cdots x_m, Y_{i+1}\cdots Y_m) \vdash^*_{M_e} \cdots \vdash^*_{M_e} (q, x_m, Y_m) \vdash^*_{M_e} (q, \epsilon, \epsilon)$.
Since the stack head does not go below $Y_{i+1}$ while the $PDA$ consumes $x_i$, we have the following equivalent computations:
$(q, x_1, Y_1) \vdash^*_{M_e} (q, \epsilon, \epsilon)$
$(q, x_2, Y_2) \vdash^*_{M_e} (q, \epsilon, \epsilon)$
$\vdots$
$(q, x_m, Y_m) \vdash^*_{M_e} (q, \epsilon, \epsilon)$
Since the numbers of steps of these computations sum to $k$, each computation takes at most $k$ steps.
Therefore, we can use the induction hypothesis to derive
$Y_1 \Rightarrow^* x_1$, $Y_2 \Rightarrow^* x_2$, $\ldots$, $Y_m \Rightarrow^* x_m$.
(When $Y_i$ is a terminal, $Y_i = x_i$, so $Y_i \Rightarrow^* x_i$ holds in zero steps.)
Since $A \to Y_1Y_2\cdots Y_m$, $A \Rightarrow Y_1Y_2\cdots Y_m$ by Proposition 2.8(i).
Since $Y_i \Rightarrow^* x_i$ for all $i \in \{1, 2, \ldots, m\}$, $Y_1Y_2\cdots Y_m \Rightarrow^* x_1x_2\cdots x_m$ by Proposition 2.16(d).
Therefore, $A \Rightarrow^* x_1x_2\cdots x_m = x$.
The statement is true for $n = k+1$.
To complete the proof of $L(M_e) \subseteq L(G)$, put $A = S$ and $x = w$ in the Claim:
$(q, w, S) \vdash^*_{M_e} (q, \epsilon, \epsilon) \Rightarrow S \Rightarrow^* w$.
Therefore, $w \in L(M_e) \Rightarrow (q, w, S) \vdash^*_{M_e} (q, \epsilon, \epsilon) \Rightarrow S \Rightarrow^* w \Rightarrow w \in L(G)$.
This completes the proof of $L(M_e) \subseteq L(G)$ and hence the proof of Lemma 2.58.
Lemma 2.59. 
For any $PDA$ $M_e$, $\exists$ a $CFG$ $G$ such that $L(G) = L(M_e)$.
Proof.
Let $M_e = (Q, \Sigma, \Gamma, \delta, q_0, e)$ be a $PDA$ that accepts by empty stack.
Construct a $CFG$ $G = (V, \Sigma, R, S)$ where $V$ and $R$ are defined as follows.
$V = \{S\} \cup \{[pXq] \mid p, q \in Q,\; X \in \Gamma\}$. Note that $V$ is finite because $Q$ and $\Gamma$ are finite.
Let (P) be the procedure for creating rules in $R$, defined as follows.
$\forall (q, a, X) \in Q \times \Sigma_\epsilon \times \Gamma$ with $\delta(q, a, X) \neq \emptyset$, let $(r_0, Y_1Y_2\cdots Y_m) \in \delta(q, a, X)$.
That is, $q \xrightarrow{a,\; X \to Y_1Y_2\cdots Y_m}_{\delta} r_0$.
For every $r_1, r_2, \ldots, r_m \in Q$, let
$[qXr_m] \to a[r_0Y_1r_1][r_1Y_2r_2]\cdots[r_{m-1}Y_mr_m]$ be a rule in $R$.
Note that the total number of rules thus created from each
$(r_0, Y_1Y_2\cdots Y_m) \in \delta(q, a, X)$ is finite because $Q$, $m$, and $\Sigma_\epsilon$ are finite.
Furthermore, each set $\delta(q, a, X)$ is finite, and the total number of such sets $\delta(q, a, X)$ is finite because the total number of $(q, a, X) \in Q \times \Sigma_\epsilon \times \Gamma$ is finite.
Therefore, the total number of rules thus created for any given $PDA$ $M_e$ is finite.
Let $R_1$ be the set of rules created by (P) and let
$R_2 = \{S \to [q_0ep] \mid p \in Q\}$.
$R = R_1 \cup R_2$.
The construction of $G$ is complete, and we now proceed to prove $L(G) = L(M_e)$.
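Procedure (P) and $R_2$ are mechanical enough to generate by program. The following Python sketch is our own illustration; in particular, encoding the variable $[qXp]$ as the tuple `('T', q, X, p)` is an assumption, not the text's notation:

```python
from itertools import product

def triple_rules(Q, delta, q0, e):
    """Generate the rules of G.  `delta` maps (q, a, X) -> set of (r0, Ys),
    with a = '' for epsilon and Ys the pushed string Y1...Ym.
    A variable [qXp] is encoded as the tuple ('T', q, X, p)."""
    R = [('S', [('T', q0, e, p)]) for p in Q]          # R2: S -> [q0 e p]
    for (q, a, X), moves in delta.items():             # procedure (P)
        for (r0, Ys) in moves:
            m = len(Ys)
            for rs in product(sorted(Q), repeat=m):    # every choice of r1..rm
                chain = (r0,) + rs
                body = ([a] if a else []) + [
                    ('T', chain[i], Ys[i], chain[i + 1]) for i in range(m)]
                R.append((('T', q, X, chain[m]), body))
    return R

# Hypothetical two-state PDA fragment:
Q = {'p', 'q'}
delta = {
    ('p', 'a', 'Z'): {('p', 'AZ')},   # push move: m = 2, yields |Q|^2 = 4 rules
    ('p', 'b', 'A'): {('q', '')},     # pop move:  m = 0, yields 1 rule
}
R = triple_rules(Q, delta, 'p', 'Z')  # 2 (from R2) + 4 + 1 = 7 rules
```

The counts in the comments match the finiteness argument above: each element of $\delta(q, a, X)$ contributes $|Q|^m$ rules.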
Claim 1. $S \Rightarrow^* w$ iff $[q_0ep] \Rightarrow^* w$ for some $p \in Q$.
<Proof of Claim 1>
Suppose $S \Rightarrow^* w$.
$\exists n \ge 1$ such that $S \Rightarrow^n w$.
$S \Rightarrow^1 \beta \Rightarrow^{n-1} w$ where $\beta \in (V \cup \Sigma)^*$.
By Proposition 2.8(i), $S \to \beta$ is a rule.
This rule must be from $R_2$.
Therefore, $S \to [q_0ep]$ for some $p \in Q$.
Therefore, $S \Rightarrow^1 [q_0ep] \Rightarrow^{n-1} w$.
Therefore, $[q_0ep] \Rightarrow^{n-1} w$.
Therefore, $[q_0ep] \Rightarrow^* w$ for some $p \in Q$.
Conversely, suppose $[q_0ep] \Rightarrow^* w$ for some $p \in Q$. By construction, $S \to [q_0ep]$ is a rule in $R_2$.
By Proposition 2.8(i), $S \Rightarrow^1 [q_0ep]$.
Therefore, $S \Rightarrow^1 [q_0ep] \Rightarrow^* w$.
Therefore, $S \Rightarrow^* w$.
This completes the proof of Claim 1.
Claim 2. $\forall p, q \in Q$, $X \in \Gamma$, $w \in \Sigma^*$: $[qXp] \Rightarrow^* w$ iff $(q, w, X) \vdash^*_{M_e} (p, \epsilon, \epsilon)$.
<Proof of Claim 2>
"If"
Assume $(q, w, X) \vdash^*_{M_e} (p, \epsilon, \epsilon)$.
$\exists n \ge 1$ such that $(q, w, X) \vdash^n_{M_e} (p, \epsilon, \epsilon)$.
The proof of $[qXp] \Rightarrow^* w$ is by induction on $n$.
For $n = 1$, $(q, w, X) \vdash^1_{M_e} (p, \epsilon, \epsilon)$. Therefore, $q \xrightarrow{a,\; X \to \epsilon}_{\delta} p$ where $a \in \Sigma_\epsilon$ and $w = a\epsilon = a$.
By (P), if $q \xrightarrow{a,\; X \to Y_1Y_2\cdots Y_m}_{\delta} r_0$, then there is a rule $[qXr_m] \to a[r_0Y_1r_1][r_1Y_2r_2]\cdots[r_{m-1}Y_mr_m]$ for every choice of $r_1, r_2, \ldots, r_m \in Q$.
In this case, $Y_1Y_2\cdots Y_m = \epsilon$, which means $m = 0$ and hence $r_m = r_0$.
Therefore, there is a rule $[qXr_0] \to a$.
Since $p = r_0$ and $w = a$, the rule becomes $[qXp] \to w$.
By Proposition 2.8(i), $[qXp] \Rightarrow w$.
Therefore, $[qXp] \Rightarrow^* w$.
The statement is true for $n = 1$.
Assume the statement is true for all $n \le k$ where $k \ge 1$.
That is, $(q, w, X) \vdash^n_{M_e} (p, \epsilon, \epsilon) \Rightarrow [qXp] \Rightarrow^* w$ for all $n \le k$.
For $n = k+1$, assume $(q, w, X) \vdash^{k+1}_{M_e} (p, \epsilon, \epsilon)$.
Then $\exists Y_1, Y_2, \ldots, Y_m \in \Gamma$, $a \in \Sigma_\epsilon$, $x \in \Sigma^*$ with $w = ax$, and $r_0 \in Q$ such that $q \xrightarrow{a,\; X \to Y_1Y_2\cdots Y_m}_{\delta} r_0$.
Therefore, $(q, w, X) \vdash^1_{M_e} (r_0, x, Y_1Y_2\cdots Y_m) \vdash^k_{M_e} (p, \epsilon, \epsilon)$.
Since $(r_0, x, Y_1Y_2\cdots Y_m) \vdash^k_{M_e} (p, \epsilon, \epsilon)$, using the same argument as in the proof of Lemma 2.58, we can deduce the following computations:
$(r_0, x_1, Y_1) \vdash^*_{M_e} (r_1, \epsilon, \epsilon)$
$(r_1, x_2, Y_2) \vdash^*_{M_e} (r_2, \epsilon, \epsilon)$
$\vdots$
$(r_{i-1}, x_i, Y_i) \vdash^*_{M_e} (r_i, \epsilon, \epsilon)$
$\vdots$
$(r_{m-1}, x_m, Y_m) \vdash^*_{M_e} (r_m, \epsilon, \epsilon)$, where
$r_1, r_2, \ldots, r_{m-1} \in Q$, $r_m = p$,
$x_i$ is the portion of $x$ that the machine consumes while popping $Y_i$ off the stack and returning its stack head to the position right before popping $Y_{i+1}$, for $i = 1, 2, \ldots, m-1$, and $x_m$ is the last portion of $x$, consumed while popping $Y_m$ off the stack and eventually emptying it.
Note that the machine goes from state $r_{i-1}$ to state $r_i$ after completing the above actions, and $x = x_1x_2\cdots x_m$.
Since each computation $(r_{i-1}, x_i, Y_i) \vdash^*_{M_e} (r_i, \epsilon, \epsilon)$ is part of the computation $(r_0, x, Y_1Y_2\cdots Y_m) \vdash^k_{M_e} (p, \epsilon, \epsilon)$, each one makes no more than $k$ moves.
By the induction hypothesis, $[r_{i-1}Y_ir_i] \Rightarrow^* x_i$ for $i = 1, 2, \ldots, m$.
As shown above, $q \xrightarrow{a,\; X \to Y_1Y_2\cdots Y_m}_{\delta} r_0$, and since $r_1, r_2, \ldots, r_{m-1}, p \in Q$, by (P)
there is a rule $[qXp] \to a[r_0Y_1r_1][r_1Y_2r_2]\cdots[r_{m-1}Y_mp]$.
By Proposition 2.8(i), $[qXp] \Rightarrow a[r_0Y_1r_1][r_1Y_2r_2]\cdots[r_{m-1}Y_mp]$.
Since $a \Rightarrow^0 a$ and $[r_{i-1}Y_ir_i] \Rightarrow^* x_i$ for $i = 1, 2, \ldots, m$,
$a[r_0Y_1r_1][r_1Y_2r_2]\cdots[r_{m-1}Y_mp] \Rightarrow^* ax_1x_2\cdots x_m$ by Proposition 2.16(d).
Therefore, $[qXp] \Rightarrow^* ax_1x_2\cdots x_m$.
Since $w = ax = ax_1x_2\cdots x_m$, $[qXp] \Rightarrow^* w$.
Therefore, $(q, w, X) \vdash^{k+1}_{M_e} (p, \epsilon, \epsilon) \Rightarrow [qXp] \Rightarrow^* w$, and the statement is true for $n = k+1$.
This completes the proof of the "If" part of Claim 2.
"Only if"
Assume $[qXp] \Rightarrow^* w$.
$\exists n \ge 1$ such that $[qXp] \Rightarrow^n w$.
The proof of $(q, w, X) \vdash^*_{M_e} (p, \epsilon, \epsilon)$ is by induction on $n$.
For $n = 1$, $[qXp] \Rightarrow^1 w$.
By Proposition 2.8(i), $[qXp] \to w$ is a rule in $R_1$ (it is not in $R_2$ because $[qXp] \neq S$).
Every rule in $R_1$ is of the form $[qXr_m] \to a[r_0Y_1r_1][r_1Y_2r_2]\cdots[r_{m-1}Y_mr_m]$ where
$q \xrightarrow{a,\; X \to Y_1Y_2\cdots Y_m}_{\delta} r_0$ and $r_1, r_2, \ldots, r_{m-1}, r_m \in Q$.
In this particular case, $w$ contains no variable.
Therefore, $[r_0Y_1r_1][r_1Y_2r_2]\cdots[r_{m-1}Y_mr_m] = \epsilon$, i.e., $m = 0$.
Therefore, $a = w$ and $r_m = r_0$.
Since $[qXp] = [qXr_m]$, $p = r_m = r_0$.
We must have $q \xrightarrow{a,\; X \to \epsilon}_{\delta} p$.
Therefore, $(q, w, X) \vdash^1_{M_e} (p, x, \epsilon)$ where $w = ax$.
As shown above, $a = w$.
Therefore, $x = \epsilon$.
Therefore, $(q, w, X) \vdash^1_{M_e} (p, \epsilon, \epsilon)$.
Therefore, $(q, w, X) \vdash^*_{M_e} (p, \epsilon, \epsilon)$.
The statement is true for $n = 1$.
For the induction hypothesis, assume it is true that
$[qXp] \Rightarrow^n w \Rightarrow (q, w, X) \vdash^*_{M_e} (p, \epsilon, \epsilon)$ for all $n \le k$ where $k \ge 1$.
For $n = k+1$, assume $[qXp] \Rightarrow^{k+1} w$.
Therefore, $[qXp] \Rightarrow^1 \beta \Rightarrow^k w$ where $\beta \in (V \cup \Sigma)^*$.
By Proposition 2.8(i), $[qXp] \to \beta$ is a rule in $R_1$.
This rule must be of the form $[qXp] \to a[r_0Y_1r_1][r_1Y_2r_2]\cdots[r_{m-1}Y_mp]$ where
$r_0, r_1, r_2, \ldots, r_{m-1} \in Q$, $a \in \Sigma_\epsilon$, $Y_1, Y_2, \ldots, Y_m \in \Gamma$, and $q \xrightarrow{a,\; X \to Y_1Y_2\cdots Y_m}_{\delta} r_0$.
$[qXp] \Rightarrow^1 a[r_0Y_1r_1][r_1Y_2r_2]\cdots[r_{m-1}Y_mp]$ by Proposition 2.8(i).
Therefore, $[qXp] \Rightarrow^1 a[r_0Y_1r_1][r_1Y_2r_2]\cdots[r_{m-1}Y_mp] \Rightarrow^k w$.
By Proposition 2.28(ii), $\exists w_1, w_2, \ldots, w_m \in \Sigma^*$ such that $[r_{i-1}Y_ir_i] \Rightarrow^* w_i$ in no more than $k$ steps for $i = 1, 2, \ldots, m$ (with $r_m = p$), and $w = aw_1w_2\cdots w_m$.
By the induction hypothesis, $(r_{i-1}, w_i, Y_i) \vdash^*_{M_e} (r_i, \epsilon, \epsilon)$ for $i = 1, 2, \ldots, m$.
By Proposition 2.50,
$(r_{i-1}, w_iw_{i+1}\cdots w_m, Y_iY_{i+1}\cdots Y_m) \vdash^*_{M_e} (r_i, w_{i+1}\cdots w_m, Y_{i+1}\cdots Y_m)$ for $i = 1, 2, \ldots, m$.
For $i = 1$: $(r_0, w_1w_2\cdots w_m, Y_1Y_2\cdots Y_m) \vdash^*_{M_e} (r_1, w_2\cdots w_m, Y_2\cdots Y_m)$.
For $i = 2$: $(r_1, w_2\cdots w_m, Y_2\cdots Y_m) \vdash^*_{M_e} (r_2, w_3\cdots w_m, Y_3\cdots Y_m)$.
$\vdots$
For $i = m$: $(r_{m-1}, w_m, Y_m) \vdash^*_{M_e} (r_m, \epsilon, \epsilon)$, where $r_m = p$.
Furthermore, as shown above, $q \xrightarrow{a,\; X \to Y_1Y_2\cdots Y_m}_{\delta} r_0$.
Therefore, $(q, aw_1w_2\cdots w_m, X) \vdash^1_{M_e} (r_0, w_1w_2\cdots w_m, Y_1Y_2\cdots Y_m)$.
Since $w = aw_1w_2\cdots w_m$, $(q, w, X) \vdash^1_{M_e} (r_0, w_1w_2\cdots w_m, Y_1Y_2\cdots Y_m)$.
Connecting all the computations, we have
$(q, w, X) \vdash^1_{M_e} (r_0, w_1w_2\cdots w_m, Y_1Y_2\cdots Y_m) \vdash^*_{M_e} (r_1, w_2\cdots w_m, Y_2\cdots Y_m)$
$\vdash^*_{M_e} (r_2, w_3\cdots w_m, Y_3\cdots Y_m) \vdash^*_{M_e} \cdots \vdash^*_{M_e} (r_{m-1}, w_m, Y_m) \vdash^*_{M_e} (p, \epsilon, \epsilon)$.
Therefore, $(q, w, X) \vdash^*_{M_e} (p, \epsilon, \epsilon)$.
Therefore, $[qXp] \Rightarrow^{k+1} w \Rightarrow (q, w, X) \vdash^*_{M_e} (p, \epsilon, \epsilon)$.
This completes the proof of Claim 2.
We now return to the proof of $L(G) = L(M_e)$.
$w \in L(G) \iff S \Rightarrow^* w$
$\iff [q_0ep] \Rightarrow^* w$ for some $p \in Q$ (Claim 1)
$\iff (q_0, w, e) \vdash^*_{M_e} (p, \epsilon, \epsilon)$ (Claim 2)
$\iff M_e$ accepts $w$
$\iff w \in L(M_e)$. Therefore, $L(G) = L(M_e)$.
This completes the proof of Lemma 2.59.
Combining Lemma 2.58 and Lemma 2.59, we have the following theorem.
Theorem 2.60. For any $CFG$ $G$, $\exists$ a $PDA$ $M_e$ such that $L(G) = L(M_e)$.
Conversely, for any $PDA$ $M_e$, $\exists$ a $CFG$ $G$ such that $L(G) = L(M_e)$.

2.5. The Pumping Lemma for Context-Free Languages

In this section, we shall develop a tool for showing that a language is not context-free, called "the Pumping Lemma for context-free languages." It is analogous to the pumping lemma studied in Chapter 1 for regular languages. The difference this time is that we pump two substrings rather than one, and the string we deal with is broken into five substrings, in contrast to three in the regular case.
Theorem 2.61. 
Let $G = (V, \Sigma, R, S)$ be a $CFG$ in Chomsky Normal Form.
Let $Pt(A, w, h)$ be a parse tree corresponding to this grammar in accordance with the meaning of Theorem 2.33, where $A \in V$ is the root, $w \in \Sigma^*$ is the yield, and $h$ is the height of the parse tree. Then $|w| \le 2^{h-1}$.
Proof. 
The proof of this theorem is by induction on $h$.
For $h = 1$, $Pt(A, w, 1)$ is a 1-level tree with $A$ at the zero level and $w$ at the first level.
The only forms of rules in Chomsky Normal Form are:
$A \to BC$ where $A \in V$ and $B, C \in V \setminus \{S\}$;
$A \to a$ where $a \in \Sigma$;
$S \to \epsilon$ where $S$ is the start variable.
Since $w \in \Sigma^*$, we have either $A \to a$ or $S \to \epsilon$.
Therefore, $w = a$ or $w = \epsilon$.
If $w = a$, then $|w| = 1 = 2^0 = 2^{h-1}$.
If $w = \epsilon$, then $|w| = 0 \le 2^0 = 2^{h-1}$.
In either case, the statement is true for $h = 1$.
For the induction hypothesis, assume the statement is true for all $h \le k$ where $k \ge 1$.
Consider a parse tree $Pt(A, w, k+1)$ that corresponds to $G$ according to the meaning of Theorem 2.33.
Since $k \ge 1$, the height of $Pt(A, w, k+1)$ is greater than or equal to $2$. Hence the children of $A$, which appear at the first level, cannot be a terminal $a$ or $\epsilon$.
They must be $B$ and $C$ with $B, C \in V \setminus \{S\}$.
Using a similar argument to the one used in proving Theorem 2.33, we can show the following:
(i) the combination of all branches of $B$ (respectively $C$) forms a subtree $Pt(B, w_1, h_1)$ (respectively $Pt(C, w_2, h_2)$);
(ii) $h_1 \le k$ and $h_2 \le k$;
(iii) $w = w_1w_2$.
By (ii) and the induction hypothesis, $|w_1| \le 2^{h_1-1}$ and $|w_2| \le 2^{h_2-1}$.
$|w| = |w_1| + |w_2| \le 2^{h_1-1} + 2^{h_2-1} \le 2^{k-1} + 2^{k-1} = 2 \cdot 2^{k-1} = 2^k = 2^{(k+1)-1}$.
This completes the induction proof of Theorem 2.61.
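The bound of Theorem 2.61 is easy to check mechanically on concrete CNF parse trees. A small Python sketch of our own (the nested-tuple tree encoding is an illustrative assumption): `('A', left, right)` encodes $A \to BC$, `('A', 'a')` encodes $A \to a$, and a 1-tuple encodes $S \to \epsilon$.

```python
def measure(t):
    """Return (|yield|, height) of a CNF parse tree given as nested tuples:
    ('A', left, right) for A -> BC, ('A', 'a') for A -> a, ('S',) for S -> eps."""
    if len(t) == 1:                     # S -> eps : yield eps, height 1
        return 0, 1
    if isinstance(t[1], str):           # A -> a : yield a, height 1
        return 1, 1
    n1, h1 = measure(t[1])              # A -> BC : combine both subtrees,
    n2, h2 = measure(t[2])              # exactly as in the induction step
    return n1 + n2, 1 + max(h1, h2)

# Left-leaning tree with yield 'aab':
t = ('S', ('A', 'a'), ('B', ('A', 'a'), ('C', 'b')))
n, h = measure(t)                       # n = 3, h = 3, and 3 <= 2**(3-1)
```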
Proposition 2.62. 
Let $Pt(A, z, h)$ be a parse tree for a $CFG$ $G = (V, \Sigma, R, S)$ and let $Pt(B, w, k)$ be the largest subtree of $Pt(A, z, h)$, where $w, z \in \Sigma^*$. Then $\exists x, y \in \Sigma^*$ such that $z = xwy$. Furthermore, the nodes on any path from $A$ to a symbol of $x$ (respectively $y$) cannot be nodes of $Pt(B, w, k)$.
Proof. By T13, every leaf of a subtree is also a leaf of the parent tree.
Therefore, $w$ is a substring of $z$.
Therefore, $z = xwy$ for some $x, y \in \Sigma^*$.
Let $(A, v_1, v_2, \ldots, v_n, l)$ be a path from $A$ to $l$, where $l$ is a symbol in $x$.
There exists a $v_i$ ($i \in \{1, 2, \ldots, n\}$) on this path such that $v_i$ and $B$ are at the same level.
If $v_i$ and $B$ are the same node, then $(v_i, \ldots, v_n, l)$ is a branch rooted at $B$, and by T11 it is a path inside $Pt(B, w, k)$.
This means that $l$ is a symbol in $w$, which contradicts the assumption that $l$ is a symbol in $x$.
Therefore, $v_i$ cannot be the same node as $B$.
Since $l$ is to the left of every symbol in $w$ and $v_i$ is an ancestor of $l$, by T12, $v_i$ is to the left of $B$.
Let $v_j$ be a node on the path $(A, v_1, v_2, \ldots, v_i, \ldots, v_n, l)$.
If $j < i$, then $v_j$ is above the level of $B$ and hence $v_j$ is not a node in $Pt(B, w, k)$.
If $j > i$, then $v_j$ is a descendant of $v_i$, and hence by T12, $v_j$ is to the left of all descendants of $B$ at the same level.
Therefore, $v_j$ cannot be a node in $Pt(B, w, k)$.
By a similar argument, we can also prove that if $v_j$ is a node on a path from $A$ to any $l'$ in $y$, then $v_j$ cannot be a node in $Pt(B, w, k)$.
This completes the proof of Proposition 2.62.
Theorem 2.63. Let $Pt(S, z, h)$ be a parse tree for a $CFG$ $G = (V, \Sigma, R, S)$ and let $Pt(A, w, k)$ be the largest subtree of $Pt(S, z, h)$ rooted at $A$, such that $z = xwy$ where $x, y, w, z \in \Sigma^*$.
If $Pt(A, w, k)$ is replaced by another parse tree $Pt(A, w', k')$ to form a new tree $Pt(S, z', h')$, then $z' = xw'y$.
Proof. 
By Proposition 2.62, we can write $z' = x'w'y'$ for some $x', y' \in \Sigma^*$, because $Pt(A, w', k')$ is a subtree of $Pt(S, z', h')$.
See Figure 2.13 below.
Figure 2.13. Caption.
Let l be a leaf in x.
By T8, there is a unique path from S to l in Pt(S, z, h).
Let's call this path (S, v_1, v_2, …, v_i, …, v_n, l), where v_i is at the same level as A.
By Proposition 2.62, (S, v_1, v_2, …, v_i, …, v_n, l) is not affected by the removal of Pt(A, w, k) and, in addition, v_i is to the left of A.
Therefore, (S, v_1, v_2, …, v_i, …, v_n, l) remains a path in the new Pt(S, z', h').
Therefore, l is a leaf in z'.
l is not in w' because w' consists of all the leaves created from the addition of Pt(A, w', k').
If l is in y', the ancestor of l at the level of A, namely v_i, must be to the right of A, which contradicts what we have shown above, namely that v_i is to the left of A.
Therefore, l cannot be in y'.
Therefore, l is in x'.
Therefore, x ⊆ x'.
Conversely, if l' is a leaf in x',
by T8, there is a unique path from S to l' in Pt(S, z', h').
Let's call this path (S, v_1', v_2', …, v_i', …, v_n', l'), where v_i' and A are at the same level.
By Proposition 2.62, v_1', v_2', …, v_i', …, v_n', l' are not in Pt(A, w', k') and v_i' is to the left of A.
These nodes must have come from Pt(S, z, h).
In addition, they are not in Pt(A, w, k) either, because if they were, they would have been eliminated by the replacement of Pt(A, w, k).
Therefore, l' is not in w.
If l' were in y, v_i' would be to the right of A, which contradicts what we have shown above, namely that v_i' is to the left of A.
Therefore, l' cannot be in y.
Therefore, l' is in x.
Therefore, x' ⊆ x.
Therefore, x = x'.
With a similar argument, we can also prove that y = y'.
This completes the proof of Theorem 2.63.
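Theorem 2.63 is easy to check mechanically on small trees. The sketch below is an illustration only (the tuple representation and the helper names `yield_of` and `replace_first` are ours, not from the text): it replaces the subtree rooted at A and confirms that only the middle portion w of the yield changes.

```python
# A parse-tree node is (label, children); a leaf has an empty child list
# and its label is the terminal symbol it carries.
def yield_of(node):
    label, children = node
    if not children:
        return label
    return "".join(yield_of(c) for c in children)

def replace_first(node, target, new_subtree):
    """Replace the outermost, leftmost subtree whose root is labeled
    `target` with `new_subtree`; return (new_tree, replaced?)."""
    label, children = node
    if label == target:
        return new_subtree, True
    out, replaced = [], False
    for c in children:
        if replaced:
            out.append(c)
        else:
            c2, replaced = replace_first(c, target, new_subtree)
            out.append(c2)
    return (label, out), replaced

# z = "abcd" with x = "a", w = "bc" (the subtree rooted at A), y = "d".
t = ("S", [("a", []), ("A", [("b", []), ("c", [])]), ("d", [])])
t2, _ = replace_first(t, "A", ("A", [("e", [])]))  # new subtree yields w' = "e"
assert yield_of(t) == "abcd" and yield_of(t2) == "aed"  # z' = x w' y
```

The flanks x = "a" and y = "d" are untouched by the replacement, exactly as the theorem asserts.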
Theorem 2.64A (The Pumping Lemma for CFLs). Let L be a CFL.
∃ p > 0 such that if z ∈ L and |z| ≥ p, then
z = uvwxy for some u, v, w, x, y ∈ Σ* with the following conditions satisfied:
(i) ∀ i ≥ 0, uv^i wx^i y ∈ L
(ii) vx ≠ ε
(iii) |vwx| ≤ p
Proof. 
By Theorem 2.37, there exists a CFG G = (V, Σ, R, S) in Chomsky Normal Form such that L = L(G). Let p = 2^m, where m = |V| = the number of variables in V.
If z ∈ L, then z ∈ L(G).
S ⇒* z. By Theorem 2.33, there is a parse tree for G with root S and yield z.
Let this parse tree be represented by Pt(S, z, h), where h is the height of the tree.
By Theorem 2.61, |z| ≤ 2^(h−1).
If |z| ≥ p = 2^m, then 2^m ≤ 2^(h−1).
m ≤ h − 1.
h ≥ m + 1.
By T9, ∃ a path from S to a, where a is a leaf in z, such that the length of this path is equal to h.
(Note that this is the longest path in the tree.)
Since h ≥ m + 1, there are at least m + 2 nodes on this path.
Let (V_1, V_2, …, V_m, V_{m+1}, a) be the lowest portion of this path, where
V_1, V_2, …, V_m, V_{m+1} ∈ V.
See Figure 2.14 below.
Figure 2.14.
Note that this is the longest path from V_1 to a leaf.
Since m = |V|, by the pigeonhole principle, ∃ i, j with 1 ≤ i < j ≤ m + 1 such that V_i = V_j.
Let Pt(V_j, w, h_j) be the largest subtree rooted at V_j and Pt(V_i, w', h_i) be the largest subtree rooted at V_i.
See Figure 2.15 below.
Figure 2.15.
As can be seen in Figure 2.15, Pt(V_j, w, h_j) is a subtree of Pt(V_i, w', h_i), which in turn is a subtree of the parent tree Pt(S, z, h).
By Proposition 2.62, we can write the yield of Pt(V_i, w', h_i) as vwx, where v, x ∈ Σ*, and the yield of Pt(S, z, h) as uvwxy, where u, y ∈ Σ*.
That is, z = uvwxy and w' = vwx.
Since V_i = V_j, we can replace Pt(V_j, w, h_j) by Pt(V_i, vwx, h_i) to form a new parse tree.
By Theorem 2.63, the yield of this new parse tree is uvvwxxy = uv^2wx^2y.
By repeated application of this replacement procedure, we can create new parse trees Pt(S, uv^i wx^i y, k_i) for i ≥ 2.
By Theorem 2.33, S ⇒* uv^i wx^i y for i ≥ 2.
If we replace Pt(V_i, vwx, h_i) by Pt(V_j, w, h_j), we obtain a new parse tree Pt(S, uwy, k_0).
Again by Theorem 2.33, S ⇒* uwy.
That is, S ⇒* uv^0 wx^0 y.
When i = 1, z = uvwxy and we know S ⇒* z.
Therefore, S ⇒* uv^i wx^i y for i ≥ 0.
Therefore, uv^i wx^i y ∈ L for i ≥ 0.
This proves Condition (i) is satisfied.
The only three forms of rules in a CFG in CNF are:
A → BC, where A ∈ V and B, C ∈ V \ {S};
A → a, where a ∈ Σ;
S → ε, where S is the start variable.
The children of V_i cannot be a terminal a or ε, because neither a terminal nor ε can have V_j as a descendant.
Let B, C ∈ V be the two children of V_i.
Let Pt(B, b, h_b) and Pt(C, c, h_c) be the largest subtrees with yields b, c ∈ Σ* and roots B, C ∈ V.
Using a similar argument to the one used in the proof of Theorem 2.33, we can show that bc = vwx.
By T7, V_j is a descendant of either B or C.
If V_j is a descendant of B, then Pt(V_j, w, h_j) is a subtree of Pt(B, b, h_b).
By Proposition 2.62, w is a substring of b and b = w_1 w w_2 for some w_1, w_2 ∈ Σ*.
Therefore, vwx = bc = w_1 w w_2 c.
v = w_1 and x = w_2 c.
Since C and all its descendants are not S, c cannot be ε.
x ≠ ε.
Therefore, vx ≠ ε.
If V_j is a descendant of C, a similar argument shows that v ≠ ε and hence vx ≠ ε.
In all cases, Condition (ii) is satisfied.
Since V_i is a descendant of V_1, Pt(V_i, vwx, h_i) is a subtree of Pt(V_1, z_1, h_1).
By Proposition 2.62, vwx is a substring of z_1.
|vwx| ≤ |z_1|.
By Theorem 2.61, |z_1| ≤ 2^(h_1 − 1).
Therefore, |vwx| ≤ 2^(h_1 − 1).
Since (V_1, V_2, …, V_m, V_{m+1}, a) is the longest path from V_1 to a leaf,
h_1 = the length of (V_1, V_2, …, V_m, V_{m+1}, a) = m + 1.
Therefore, |vwx| ≤ 2^((m+1)−1) = 2^m = p.
Therefore, Condition (iii) is satisfied.
This completes the proof of Theorem 2.64A.
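The conclusion of Theorem 2.64A can be illustrated on a concrete CFL. The sketch below is ours, not part of the proof: for the sample language {a^n b^n | n ≥ 0} it fixes a decomposition satisfying Conditions (ii) and (iii) and confirms that every pumped string uv^i wx^i y stays in the language.

```python
def pump(u, v, w, x, y, i):
    """Form u v^i w x^i y as in Theorem 2.64A."""
    return u + v * i + w + x * i + y

def in_anbn(s):
    """Membership in the sample CFL { a^n b^n | n >= 0 }."""
    n = len(s) // 2
    return s == "a" * n + "b" * n

# z = a^4 b^4 with the decomposition u = aaa, v = a, w = '', x = b, y = bbb:
# vx != '' (Condition (ii)) and |vwx| = 2 (Condition (iii)).
u, v, w, x, y = "aaa", "a", "", "b", "bbb"
assert all(in_anbn(pump(u, v, w, x, y, i)) for i in range(6))  # Condition (i)
```

Pumping v and x in lockstep keeps the a-count equal to the b-count, which is why this decomposition survives every i ≥ 0.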
Theorem 2.64B. Pumping Lemma (contrapositive form).
¬S ⇒ L is not context free, where S denotes the conclusion of Theorem 2.64A and
¬S is equivalent to:
∀ p ≥ 1, ∃ s ∈ L with |s| ≥ p such that whenever s = uvwxy, at least one of Conditions (i), (ii), or (iii) cannot be satisfied.
The contrapositive form of the Pumping Lemma is used to prove that a language is not context free. The general strategy is to find an s ∈ L with |s| ≥ p for any given p ≥ 1 such that whenever s is broken into s = uvwxy, at least one of Conditions (i), (ii), or (iii) must be false. This can usually be accomplished by showing one of the following:
(1) Condition (i) alone is false.
(2) Condition (iii) ⇒ ¬Condition (i).
(3) (Condition (ii) and Condition (iii)) ⇒ ¬Condition (i).
Example 2.65. 
Show that L = {a^n b^n c^n | n ≥ 0} is not a CFL.
∀ p ≥ 1, construct s = a^p b^p c^p.
s ∈ L and |s| ≥ p.
Assume s = uvwxy.
If Condition (iii) is true, |vwx| ≤ p.
There are 5 cases to consider:
(1) vwx = a^n where n ≤ p;
(2) vwx = a^n b^m where n ≤ p and m ≤ p;
(3) vwx = b^n where n ≤ p;
(4) vwx = b^n c^m where n ≤ p and m ≤ p;
(5) vwx = c^n where n ≤ p.
For case (1), vwx = a^n ⇒ v^2wx^2 = a^n'.
If Condition (ii) is true, vx ≠ ε.
Either v ≠ ε or x ≠ ε.
This means n' > n.
s = uvwxy = u a^n y contains the same number of a's, b's, and c's.
uv^2wx^2y = u a^n' y contains more a's than s and therefore has more a's than b's and c's in itself.
Therefore, uv^2wx^2y is not in L.
Therefore, (Condition (ii) and Condition (iii)) ⇒ ¬Condition (i).
Similar arguments can be made in cases (3) and (5) to arrive at the same conclusion as in case (1).
For case (2), s = uvwxy = u a^n b^m y contains the same number of a's, b's, and c's.
If Condition (ii) is true, vx ≠ ε.
Either v ≠ ε or x ≠ ε.
vwx = a^n b^m ⇒ v^2wx^2 will increase the number of a's, the number of b's, or both.
uv^2wx^2y will have more a's than c's or more b's than c's.
Either way, uv^2wx^2y is not in L.
Therefore, (Condition (ii) and Condition (iii)) ⇒ ¬Condition (i).
A similar argument can be made in case (4) to arrive at the same conclusion as in case (2).
Combining all 5 cases, we conclude (Condition (ii) and Condition (iii)) ⇒ ¬Condition (i).
By Theorem 2.64B, L is not context free.
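The case analysis above can be cross-checked by brute force for a small p. The sketch below is an illustration only (p = 3 assumed; the function names are ours): it enumerates every split s = uvwxy satisfying Conditions (ii) and (iii) and confirms that pumping with i = 0 or i = 2 always produces a string outside L.

```python
def in_L(s):
    """Membership in L = { a^n b^n c^n | n >= 0 }."""
    n = len(s) // 3
    return len(s) % 3 == 0 and s == "a" * n + "b" * n + "c" * n

def no_valid_decomposition(s, p):
    """True iff every split s = uvwxy obeying Conditions (ii) and (iii)
    has some pumped string u v^i w x^i y outside L (i in {0, 2} suffices here)."""
    n = len(s)
    for a in range(n + 1):
        for b in range(a, n + 1):
            for c in range(b, n + 1):
                for d in range(c, n + 1):
                    u, v, w, x, y = s[:a], s[a:b], s[b:c], s[c:d], s[d:]
                    if not v + x or len(v + w + x) > p:
                        continue  # Conditions (ii)/(iii) not met; skip this split
                    if all(in_L(u + v * i + w + x * i + y) for i in (0, 2)):
                        return False  # this split would survive pumping
    return True

p = 3
assert no_valid_decomposition("a" * p + "b" * p + "c" * p, p)
```

This is only a finite check for one p, not a proof, but it makes the five-case argument concrete.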
Example 2.66. 
Show that L = {w#w | w ∈ {0,1}*} is not a CFL.
∀ p ≥ 1, construct s = 0^p 1^p # 0^p 1^p.
s ∈ L and |s| = 4p + 1 > p.
Assume s = uvwxy.
If both Conditions (ii) and (iii) are true, we have the following cases to consider.
(1) vwx is to the left of #.
Condition (iii) gives |vwx| ≤ p, which makes it possible for vwx to be contained in the left block 0^p 1^p.
Since Condition (ii) is true, vx ≠ ε.
Pumping up to uv^2wx^2y will increase the number of symbols to the left of the # sign while not changing the symbols to the right.
This makes it impossible for uv^2wx^2y to remain in L.
Therefore, (Condition (ii) and Condition (iii)) ⇒ ¬Condition (i).
(2) vwx is to the right of #.
A similar argument leads to the same conclusion as in (1).
(3) vwx contains the # sign.
(i) w contains #.
Condition (iii) ⇒ |vwx| ≤ p ⇒ v contains only 1's if v ≠ ε, and x contains only 0's if x ≠ ε.
Condition (ii) ⇒ vx ≠ ε ⇒ at least one of v and x is not ε.
Pumping down gives uwy = 0^p 1^i # 0^j 1^p.
v ≠ ε ⇒ i < p ⇒ 0^p 1^i # 0^j 1^p ∉ L ⇒ uwy ∉ L.
x ≠ ε ⇒ j < p ⇒ 0^p 1^i # 0^j 1^p ∉ L ⇒ uwy ∉ L.
Therefore, (Condition (ii) and Condition (iii)) ⇒ ¬Condition (i).
(ii) w is to the left of #.
Since vwx contains the # sign and w is to the left of it, the # must lie in x; hence x ≠ ε.
Therefore, x contains #.
Pumping down will eliminate the # sign, making it impossible for uwy to remain in L.
Therefore, uwy is not in L and Condition (i) cannot be satisfied.
(iii) w is to the right of #.
A similar argument will lead to the same conclusion as in case 3(ii) above.
Combining all possible cases, (Condition (ii) and Condition (iii)) ⇒ ¬Condition (i).
By Theorem 2.64B, L is not context free.
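Case 3(i) above can be made concrete. The sketch below is an illustration only (p = 3 assumed; the names are ours): it takes a split in which w contains the # sign, with v to its left and x to its right, and confirms that both pumping down and pumping up leave the language.

```python
def in_L(s):
    """Membership in L = { w#w | w in {0,1}* }."""
    parts = s.split("#")
    return len(parts) == 2 and parts[0] == parts[1]

p = 3
s = "0" * p + "1" * p + "#" + "0" * p + "1" * p
assert in_L(s)

# A split matching case 3(i): v is a 1 left of #, w is '#', x is a 0 right of #.
u, v, w, x, y = "0" * p + "1" * (p - 1), "1", "#", "0", "0" * (p - 1) + "1" * p
assert u + v + w + x + y == s

# Pumping down (i = 0) yields 0^p 1^(p-1) # 0^(p-1) 1^p, which is not in L.
assert not in_L(u + w + y)
# Pumping up (i = 2) unbalances the two sides of # as well.
assert not in_L(u + v * 2 + w + x * 2 + y)
```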
Example 2.67. 
Show that the intersection of two CFLs may not be a CFL.
Let L_1 = {a^n b^n c^m | n, m ∈ N} and
L_2 = {a^n b^m c^m | n, m ∈ N}.
L_1 ∩ L_2 = {a^n b^n c^n | n ∈ N}, which is not context free as shown in Example 2.65.
L_1 can be generated by the following CFG rules:
S → TD
T → aTb | ε
D → Dc | ε
L_2 can be generated by the following CFG rules:
S → AB
A → Aa | ε
B → bBc | ε
Therefore, L_1 and L_2 are CFLs.
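The two grammars generate exactly the strings a^n b^n c^m and a^n b^m c^m; reading these closed forms off the rules gives a quick bounded-length sanity check of the claimed intersection. The sketch below is an illustration only (the bound K and the helper names are ours).

```python
# Closed forms read directly off the two grammars:
#   L1: S -> TD, T -> aTb | e, D -> Dc | e   generates a^n b^n c^m
#   L2: S -> AB, A -> Aa | e, B -> bBc | e   generates a^n b^m c^m
def L1(n, m): return "a" * n + "b" * n + "c" * m
def L2(n, m): return "a" * n + "b" * m + "c" * m

K = 5  # bound on n, m for the enumeration
set1 = {L1(n, m) for n in range(K) for m in range(K)}
set2 = {L2(n, m) for n in range(K) for m in range(K)}
# Up to the bound, the intersection is exactly { a^n b^n c^n }.
assert set1 & set2 == {"a" * n + "b" * n + "c" * n for n in range(K)}
```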
Example 2.68. 
Show that L = {ww | w ∈ {0,1}*} is not a CFL.
∀ p ≥ 1, construct s = 0^p 1^p 0^p 1^p.
s ∈ L and |s| = 4p > p.
Assume s = uvwxy for some u, v, w, x, y ∈ {0,1}*.
Claim 1. If i < p, the strings 0^i 1^p 0^p 1^p, 0^p 1^i 0^p 1^p, 0^p 1^p 0^i 1^p, and 0^p 1^p 0^p 1^i are not in L.
Assume for contradiction that 0^i 1^p 0^p 1^p ∈ L.
Then ∃ r ∈ {0,1}* such that 0^i 1^p 0^p 1^p = rr.
Therefore, |rr| = (i + p) + (p + p).
|r| = [(i + p) + (p + p)] / 2. Since |r| is the arithmetic mean of i + p and p + p, and i + p < p + p,
|r| > i + p and |r| < p + p.
Therefore, |r| ≥ i + p + 1.
The leftmost i + p + 1 symbols of 0^i 1^p 0^p 1^p form the substring 0^i 1^p 0.
The leftmost |r| symbols of 0^i 1^p 0^p 1^p form the first copy of r.
Since |r| ≥ i + p + 1, 0^i 1^p 0 is a prefix of r.
Therefore, 10 is a substring of 0^i 1^p 0 and hence of r.
Similarly, the rightmost 2p symbols of 0^i 1^p 0^p 1^p form the substring 0^p 1^p, and the rightmost |r| symbols form the second copy of r.
Since |r| < p + p, r is a suffix of 0^p 1^p.
Therefore, 10 is a substring of 0^p 1^p.
This is a contradiction because 10 cannot be a substring of 0^p 1^p.
Therefore, 0^i 1^p 0^p 1^p ∉ L.
Similar arguments can be made to show that 0^p 1^i 0^p 1^p, 0^p 1^p 0^i 1^p, and 0^p 1^p 0^p 1^i are not in L.
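Claim 1 can be sanity-checked numerically: for small p and every i < p, none of its four strings has the form rr. The brute-force test below is an illustration only (the helper name `is_square` is ours).

```python
def is_square(s):
    """True iff s = rr for some string r."""
    n = len(s)
    return n % 2 == 0 and s[:n // 2] == s[n // 2:]

for p in range(1, 7):
    for i in range(p):  # i < p, as in Claim 1
        for t in ("0" * i + "1" * p + "0" * p + "1" * p,
                  "0" * p + "1" * i + "0" * p + "1" * p,
                  "0" * p + "1" * p + "0" * i + "1" * p,
                  "0" * p + "1" * p + "0" * p + "1" * i):
            assert not is_square(t)  # none of Claim 1's strings lies in L = {ww}
```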
Claim 2.
If at least one of i, j is less than p, the strings 0^i 1^j 0^p 1^p, 0^p 1^i 0^j 1^p, and 0^p 1^p 0^i 1^j are not in L.
Assume for contradiction that 0^i 1^j 0^p 1^p ∈ L.
Then 0^i 1^j 0^p 1^p = rr for some r ∈ {0,1}*.
Therefore, |rr| = (i + j) + (p + p).
|r| = [(i + j) + (p + p)] / 2. Since |r| is the arithmetic mean of i + j and p + p, and i + j < p + p,
|r| > i + j and |r| < p + p.
Therefore, |r| ≥ i + j + 1.
(If j = 0, the first copy of r consists of 0's only while the second copy ends in 1 because p ≥ 1; this is already a contradiction, so we may assume j ≥ 1.)
The leftmost i + j + 1 symbols of 0^i 1^j 0^p 1^p form the substring 0^i 1^j 0.
The leftmost |r| symbols of 0^i 1^j 0^p 1^p form the first copy of r.
Since |r| ≥ i + j + 1, 0^i 1^j 0 is a prefix of r.
Therefore, 10 is a substring of 0^i 1^j 0 and hence of r.
Similarly, the rightmost 2p symbols of 0^i 1^j 0^p 1^p form the substring 0^p 1^p, and the rightmost |r| symbols form the second copy of r.
Since |r| < p + p, r is a suffix of 0^p 1^p.
Since 10 is a substring of r and r is a substring of 0^p 1^p, 10 is a substring of 0^p 1^p.
This is a contradiction because 10 cannot be a substring of 0^p 1^p.
Therefore, 0^i 1^j 0^p 1^p ∉ L.
Similar arguments can be made to show that 0^p 1^i 0^j 1^p and 0^p 1^p 0^i 1^j are not in L.
Returning to the proof that L is not a CFL, we assume both Condition (ii) and Condition (iii) are true.
Since |vwx| ≤ p, we have 7 cases to consider.
(1) vwx is a substring of the first block of 0^p.
(2) vwx is a substring of the first block of 1^p.
(3) vwx is a substring of the second block of 0^p.
(4) vwx is a substring of the second block of 1^p.
(5) vwx straddles the first block of 0^p and the first block of 1^p.
(6) vwx straddles the first block of 1^p and the second block of 0^p.
(7) vwx straddles the second block of 0^p and the second block of 1^p.
In case (1), v consists of all 0's if v ≠ ε and x consists of all 0's if x ≠ ε.
Pumping down would only affect the first block of 0^p and not the other 3 blocks.
Therefore, uwy = 0^i 1^p 0^p 1^p.
Since vx ≠ ε by Condition (ii), at least one of v and x is not ε.
Pumping down would reduce the number of 0's in the first block of 0^p.
Therefore, i < p.
By Claim 1, uv^0wx^0y = uwy = 0^i 1^p 0^p 1^p ∉ L.
Therefore, Condition (i) is not satisfied.
For cases (2), (3) and (4), similar arguments can be made to lead to the same conclusion as in (1).
For case (5), since |vwx| ≤ p, pumping down can only affect the first and second blocks of symbols.
We can write uwy = 0^i 1^j 0^p 1^p.
Furthermore, the first symbol of vwx is 0 and the last symbol of vwx is 1.
If v ≠ ε, the first symbol of v is 0.
If x ≠ ε, the last symbol of x is 1.
Since Condition (ii) is true, vx ≠ ε.
At least one of v and x is not ε.
Pumping down will either reduce the number of 0's in the first block of 0^p or the number of 1's in the first block of 1^p.
Therefore, either i < p or j < p.
By Claim 2, uwy = 0^i 1^j 0^p 1^p is not in L.
Therefore, Condition (i) is not satisfied.
For cases (6) and (7), similar arguments can be made to lead to the same conclusion as in (5).
Combining all 7 cases, we conclude that
(Condition (ii) and Condition (iii)) ⇒ ¬Condition (i).
Hence, by Theorem 2.64B, L is not context free.
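As in Example 2.65, the whole case analysis can be cross-checked by brute force for a small p. The sketch below is an illustration only (p = 2, so s = 00110011; the names are ours): it enumerates every split obeying Conditions (ii) and (iii) and confirms that pumping with i = 0 or i = 2 always leaves L.

```python
def in_L(s):
    """Membership in L = { ww | w in {0,1}* }."""
    n = len(s)
    return n % 2 == 0 and s[:n // 2] == s[n // 2:]

def no_valid_decomposition(s, p):
    """True iff every split s = uvwxy obeying Conditions (ii) and (iii)
    has some pumped string u v^i w x^i y outside L (i in {0, 2} suffices here)."""
    n = len(s)
    for a in range(n + 1):
        for b in range(a, n + 1):
            for c in range(b, n + 1):
                for d in range(c, n + 1):
                    u, v, w, x, y = s[:a], s[a:b], s[b:c], s[c:d], s[d:]
                    if not v + x or len(v + w + x) > p:
                        continue  # Conditions (ii)/(iii) not met; skip this split
                    if all(in_L(u + v * i + w + x * i + y) for i in (0, 2)):
                        return False  # this split would survive pumping
    return True

p = 2
assert no_valid_decomposition("0" * p + "1" * p + "0" * p + "1" * p, p)
```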
