More Shell, Less Egg
Jon Bentley had a regular column called “Programming Pearls” in the Communications of the ACM (you may have come across the book that collects some of his columns). In 1986 he got interested in literate programming, so he asked Donald Knuth to write a program in that style as a guest column and Doug McIlroy to write a literary-style critique of it.
Literate programming is an interesting topic in its own right. The idea, which originated with Knuth, is to write a program and its documentation at the same time, interleaved with each other. It’s not just writing good comments or including docstrings or using systems like POD. In literate programming, the code is subservient to the documentation.
The program Bentley asked Knuth to write is one that’s become familiar to people who use languages with serious text-handling capabilities: Read a file of text, determine the n most frequently used words, and print out a sorted list of those words along with their frequencies.
Knuth wrote his program in WEB, a literate programming system of his own devising that used Pascal as its programming language.
McIlroy’s review started with an appreciation of Knuth’s presentation and the literate programming technique in general.
And then he calmly and clearly eviscerated the very foundation of Knuth’s program.
What people remember about his review is that McIlroy wrote a six-command shell pipeline that was a complete (and bug-free) replacement for Knuth’s 10+ pages of Pascal.
McIlroy also wrote six sentences of commentary, one explaining each command’s role in the pipeline.
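For reference, here is the pipeline as it is usually quoted from the column, one command per line. Treat it as a sketch rather than a verbatim reproduction: the wordfreq function wrapper, the quoting around ${1}, and the wording of the comments are additions for illustration, not McIlroy's own text.

```shell
#!/bin/sh
# wordfreq N: read text on stdin, print the N most frequent words with counts.
# The function name "wordfreq" is hypothetical; the six commands are the pipeline.
wordfreq() {
    tr -cs A-Za-z '\n' |  # 1. replace each run of non-letters with one newline (one word per line)
    tr A-Z a-z |          # 2. fold upper case to lower case
    sort |                # 3. sort so that identical words become adjacent
    uniq -c |             # 4. collapse each run of duplicates into a count and the word
    sort -rn |            # 5. sort numerically by count, largest first
    sed "${1}q"           # 6. print the first N lines, then quit
}
```

Running printf 'The cat and the hat. The end.\n' | wordfreq 1 should report the word "the" with a count of 3. The exact padding in front of the count depends on the local uniq.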
What’s often overlooked when this review is discussed is McIlroy’s explanation of why his solution is better–and it’s not just because it’s shorter.
A wise engineering solution would produce–or better, exploit–reusable parts.
To return to Knuth’s paper: Everything there–even input conversion and sorting–is programmed monolithically and from scratch. […] Even if […] programs for these exact purposes were not at hand, these operations would well be implemented separately: for separation of concerns, for easier development, for piecewise debugging, and for potential reuse.
The simple pipeline given above will suffice to get answers right now, not next week or next month. It could well be enough to finish the job. But even for a production project, say for the Library of Congress, it would make a handsome down payment, useful for testing the value of the answers and for smoking out follow-on questions.
Knuth has shown us here how to program intelligibly, but not wisely. I buy the discipline. I do not buy the result. He has fashioned a sort of industrial-strength Fabergé egg–intricate, wonderfully worked, refined beyond all ordinary desires, a museum piece from the start.
Just remember, he’s saying this about Donald Knuth.
No Fabergé eggs for McIlroy. Just brass balls.