[Overview] [Previous] [Next]

# Proving the Pumping Lemma, Part I

Suppose we have a context-free language L. Then there is some context-free grammar G that generates L. Further suppose
• L is infinite, hence there is no upper bound on the length of strings belonging to L.
• L does not contain .
• G has no unit productions or productions.

There are only a finite number of variables in a grammar, and the productions for each variable have finite lengths. The only way that a grammar can generate arbitrarily long strings is if one or more variables is both useful and recursive.

If a variable is not useful, it does not occur in the derivation of any string of the language. Useless variables can always be eliminated from a grammar.

Suppose that no variable in the grammar is recursive. Since the start symbol is nonrecursive, it must be defined only in terms of terminals and other variables. Then since those variables are nonrecursive, they have to be defined in terms of terminals and still other variables, and so on. After a while we run out of "other variables" while the generated string is still finite. Hence there is an upper bound on the length of the string that can be generated from the start symbol. This contradicts our premise that the language is finite. Therefore, our assumption than no variable is recursive must be incorrect.