Context-Free Languages

Closure Properties


Context-Free Languages are closed under union


  Let G_1 and G_2 be CFGs with start symbol S_1 and S_2, introduce a new CFG G_3 with start symbol S and a production rule S\rightarrow S_1|S_2, and L(G_3) will be the same as L(G_1)\cup L(G_2), since it is obvious that S can only generates S_1 or S_2 which will generate L(G_1) and L(G_2) respectively.
  The other way is to construct a union machine for L(G_1) and L(G_2) like what we did for those regular languages, use a new start state and link it to the original machines start state.


Context-Free Languages are closed under production


  Let G_1 and G_2 be CFGs with start symbol S_1 and S_2, introduce a new CFG G_3 with start symbol S and a production rule S\rightarrow S_1S_2, and L(G_3) will be the same as L(G_1)L(G_2), because it is obvious that the first part of L(G_3) will be in L(G_1) and the second part will comes from L(G_2)

Kleene Closure

If L(G) is a CFL, then so is L(G)^\ast


  Let G be the CFG of the L(G) with start symbol S_1, introduce a new CFG G' incorporated with G and a production S\rightarrow S_1S|\Lambda.
  It is clear that S\stackrel{\ast}{\Rightarrow}S_1^n


The intersection of two CFLs may not be CFL


  Since all regular languages are context free languages and we've give a sufficient proof that the intersection of regular languages are still regular languages, so if two langauges are regular, then then're also context-free, hence their intersection is regular and context
  If a CFL is not regular, this statement may not hold, for example, the language L_1=\lbrace a^nb^na^m\rbrace is context-free(the proof is omitted), and so does L_2=\lbrace a^nb^mc^m\rbrace, but there intersection, which is L_3=\lbrace a^nb^na^n\rbrace is apparently not context-free since we can disprove it by using the pumping lemma for context-free languages we've mentioned at last chapter.


The complement of two CFLs may not be CFL


  As we've stated in the last theorem's proof, part of the complement languages are context-free because they're also regular, and the others are not
  Let L_1 and L_2 be context-free languages, if their complement \overline{L_1} and \overline{L_2} are context-free, then \overline{L_1}\cup\overline{L_2} must be context-free according to the union property, and \overline{\overline{L_1}\cup\overline{L_2}} must be also context-free, but from the De Morgan law we know \overline{\overline{L_1}\cup\overline{L_2}}=L_1\cap L_2 which we've just proved that may not be context-free, so we proved this theorem by contradiction.

Mixing with Regular Languages

The intersection of a context-free language and a regular language is always context-free


  We can build a PDA P_M upon the PDA P of the CFL and the FA M of the regular language, here are steps:

  1. Start from the start state of both the machines, label each state of the P with the a state of M that the input string will be in if it were running on M, so the \tt START state of P will be labeled with the start state in M, say, x_0
  2. If we encounter a \tt PUSH or \tt POP state, they do not consume any letters so the state of the M will not change, we label them with the state of M which stays intact
  3. If we encounter a \tt READ state, although it consumes a letter, but this state will still be labeled with the old state in M, but as we leave this state by following the outgoing edges, the state in M will change according to the label on the outgoing edges of \tt READ. If the \tt READ is nondeterministic, label each of the states that are connected to \tt READ by outgoing edges with the same state in M
  4. If a state has both a and b incoming edges, duplicate this state and let both of them have the same outgoing edges, but different incoming edges, one of which's incoming edge will be labled with a and the other one will be labeled b

This algorithm must be finite because it can be described in another way, let the states in S(P) become \lbrace y_0, y_1, y_2,...y_n\rbrace and states in S(M) become \lbrace x_0, x_1,x_2,...,x_3\rbrace, then let S(P_M)=S(P)\times S(M), for every y and x find a corresponding element in S(P_M) which satisfies the above rules(it will be a set of ordered pairs), this procedure is obviously finite since both the S(M) and S(P) are finite sets.
The last step of the algorithm is make sure that the running string terminates if and only if it reaches a state such that the y part of this state is an \tt ACCEPT state in P and the x part of this state if a halt state in M, now we've built a PDA that accept both the context-free language and the regular language. Q.E.D.