Jeffrey Shallit.
A second course in formal languages and automata theory
Cambridge University Press, 2008, 260 pp.
ACM CR categories: F.4.3 Classes defined by grammars or automata; F.1.1 Models of computation; F.4.3 Formal languages; F.1.1 Automata

In the preface, the author quotes a professor who said that in its attempt to target undergraduate students, the second edition of Hopcroft and Ullman’s classic [1] “removed all the good parts.” Another course Shallit has been teaching is aimed at graduate and senior undergraduate students.

The first chapter is a brief review of what the reader is expected to have mastered in a first course. Shallit has some charming examples; I particularly liked the ones for perfect shuffles that seem to be drawn from an experience with crossword puzzles.

The second chapter deals with combinatorics on words, such as the famous Thue-Morse sequence, on which Shallit himself and Jean-Paul Allouche, his coauthor on another book [2], did some work. I would prefer dealing with this material toward the end of a course rather than the beginning.

The next chapter is on finite automata and regular languages. While many interesting topics are covered—and I really enjoyed the prime number game—I failed to see a unifying theme. This raises the rather difficult question: What should one expect as the outcome of a second course such as the one in this book?

Pippenger’s book [3] can also be seen as a second course that aims to explain computation in a more abstract way. Shallit adopts a different tactic—he sticks to the elementary combinatorial style of Hopcroft and Ullman, avoids bringing in aspects such as algebra, and concentrates more on formal languages. For the most part, this book stays well within the ambit of Hopcroft and Ullman.

The next two chapters, on context-free languages and parsing, work well together. There is a nice theorem that one of the Myhill-Nerode equivalence classes of a deterministic context-free language (DCFL) must have infinitely many elements; this was new to me and is perhaps due to the author, since no precise reference is given.

The following chapter, on computability, has new material for a graduate course, such as Kolmogorov complexity and the busy beaver, along with more standard fare, such as the Post correspondence problem and the completeness of nonuniversality of regular languages for polynomial space. Again, it is an assortment of topics that does not quite merit a Turing machines chapter. PSPACE-completeness could have been moved to the chapter on regular languages.

Similar to Pippenger’s book, the last chapter has a welcome treatment of advanced topics, such as the Immerman-Szelepcsényi theorem on closure of context-sensitive languages under complement and Cook’s linear time algorithm for membership of a language accepted by a two-way deterministic pushdown automaton (2DPDA).

I enjoyed reading this book and intend to try out some of the material in a graduate course. The exercises, projects, and research problems at the end of each chapter are valuable and should be a major reason for an instructor to pick up a copy of this book. However, I prefer Pippenger’s book for a second course, because its goal is more interesting than filling in for Hopcroft and Ullman.

1) Hopcroft, J.E.; Ullman, J.D. Introduction to automata theory, languages, and computation. Addison-Wesley, Boston, MA, 1979.

2) Allouche, J.-P.; Shallit, J. Automatic sequences: theory, applications, generalizations. Cambridge University Press, New York, NY, 2003.

3) Pippenger, N. Theories of computability. Cambridge University Press, New York, NY, 1997.