If L is an infinite regular language, then there exists some positive integer m such that any string w L whose length is m or greater can be decomposed into three parts, xyz, where

- |xy| is less than or equal to m,
- |y| > 0,
- w
_{i}= xy^{i}z is also in L for all i = 0, 1, 2, 3, ....

- m is a (finite) number chosen so that strings of length m
or greater
*must*contain a cycle. Hence, m must be equal to or greater than the number of states in the dfa. Remember that we*don't know the dfa,*so we can't actually choose m; we just know that such an m must exist. - Since string w has length greater than or equal to m, we can break it into two parts, xy and z, such that xy must contain a cycle. We don't know the dfa, so we don't know exactly where to make this break, but we know that |xy| can be less than or equal to m.
- We let x be the part before the cycle, y be the cycle, and z the part after the cycle. (It is possible that x and z contain cycles, but we don't care about that.) Again, we don't know exactly where to make this break.
- Since y is the cycle we are interested in, we must have |y| > 0, otherwise it isn't a cycle.
- By repeating y an arbitrary number of times, xy*z, we must get other strings in L.
- If, despite all the above uncertainties, we can show that the dfa has to accept some string that we know is not in the language, then we can conclude that the language is not regular.

- For
*any*choice of m, - for some w L that we get to choose (and we will choose one of length at least m),
- for
*any*way of decomposing w into xyz, so long as |xy| isn't greater than m and y isn't , - we can choose an i such that xy
^{i}z is not in L.

Copyright © 1996 by David Matuszek

Last modified Feb 18, 1996