Selected Papers on Computer Languages

by Donald E. Knuth (Stanford, California: Center for the Study of Language and Information, 2003), xvi+594pp.
(CSLI Lecture Notes, no. 139.)
ISBN 1575863812 (cloth), 1575863820 (paperback)

This is the fifth in a series of eight volumes that contain archival forms of my published papers, together with new material. (The first book in the series was Literate Programming; the second was Selected Papers on Computer Science; the third was Digital Typography; the fourth was Selected Papers on Analysis of Algorithms.) The Computer Languages volume is characterized by the following remarks quoted from its preface.

This book is devoted to the topic that got me hooked on computers in the first place: the languages by which people are able to communicate with machines.
Rules of grammar have always fascinated me. In retrospect I think that the most exciting days of my elementary school education came during the seventh grade, when I was taught how to make diagrams of English-language sentences in order to see how the various parts of speech fit together. My friends and I would stay after class, trying to put more and more sentences into diagrammatic form---and getting totally confused by poetry, which seemed to break most of the rules.
During the summer of 1957, after I had just written my first computer programs for numerical problems like factoring numbers and for fun problems like playing tic-tac-toe, an amazing program called IT---the “Internal Translator”---arrived at the Case Tech Computing Center, where I had been working as a college freshman. IT would look at a card that contained an algebraic equation, then the machine would flash its lights for a few seconds and go punch-punch-punch; out would come a deck of cards containing a machine-language program to compute the value of the equation. Mystified by the fact that a mere computer could understand algebra well enough to create its own programs, I got hold of the source code for IT and spent two weeks poring over the listing. Lo, the secrets were revealed, and several friends helped me in 1958 to extend IT even further.
Of course in those days we had only a vague notion of what we were doing; our work was almost totally disorganized, with very few principles to guide us. But researchers in linguistics were beginning to formulate rules of grammar that were considerably more mathematical than before. And people began to realize that such methods are highly relevant to the artificial languages that were becoming popular for computer programming, even though natural languages like English remained intractable.
I found the mathematical approach to grammar immediately appealing---so much so, in fact, that I must admit to taking a copy of Noam Chomsky’s Syntactic Structures along with me on my honeymoon in 1961. During odd moments, while crossing the Atlantic in an ocean liner and while camping in Europe, I read that book rather thoroughly and tried to answer some basic theoretical questions. Here was a marvelous thing: a mathematical theory of language in which I could use a computer programmer’s intuition! The mathematical, linguistic, and algorithmic parts of my life had previously been totally separate. During the ensuing years those three aspects became steadily more intertwined; and by the end of the 1960s I found myself a Professor of Computer Science at Stanford University, primarily because of work that I had done with respect to languages for computer programming.

... And then the preface continues by describing the various chapters, which can be summarized more succinctly by quoting from the hype on the back cover:

Two dozen of Knuth’s classic papers on the subject are collected in this volume, brought up to date with supplementary material, and augmented by a previously unpublished essay on language design. Of particular interest are his fascinating and definitive survey of the twenty languages for programming that preceded FORTRAN I, along with three of his fundamental papers that each launched significant subfields of computer science: (1) The theories of LL(k) and LR(k) parsing; (2) attribute grammars to define the meaning of languages; (3) empirical studies of user programs and profile-based optimization. Every chapter is self-contained and accessible to computer programmers with varied backgrounds. Readers will be able to participate vicariously in the creation of the concepts that have now become thoroughly integrated into modern software systems.

Here’s the table of contents:

  1. The early development of programming languages [P83]
  2. Backus Normal Form versus Backus Naur Form [Q11]
  3. Teaching ALGOL 60 [Q13]
  4. ALGOL 60 Confidential [P6]
  5. SMALGOL-61 [P7]
  6. Man or boy? [Q12]
  7. A proposal for input-output conventions in ALGOL 60 [P15]
  8. The remaining trouble spots in ALGOL 60 [P29]
  9. SOL--A symbolic language for general purpose systems simulation [P16]
  10. A formal definition of SOL [P17]
  11. The science of programming languages [excerpts from the draft of an incomplete paper, vintage 1965]
  12. Programming languages for automata [P28]
  13. A characterization of parenthesis languages [P30]
  14. Top-down syntax analysis [P48]
  15. On the translation of languages from left to right [P23]
  16. Context-free multilanguages [P139]
  17. Semantics of context-free languages [P32]
  18. Examples of formal semantics [P45]
  19. The genesis of attribute grammars [Q114]
  20. A history of writing compilers [P10]
  21. Runcible---Algebraic translation on a limited computer [P2]
  22. Computer-drawn flow charts [P11]
  23. Notes on avoiding `go to' statements [P42]
  24. An empirical study of FORTRAN programs [P47]
  25. Efficient coroutine generation of constrained Gray sequences [P160]

(Numbers like P83 and Q11 in this list refer to the corresponding papers in my list of publications.)

This is a useful historical document, as much for the selection being made by Knuth, as for many of the papers being hard to find or just unobtainable. ... From an intellectual point of view, it is a pleasure to have ready access to the founding paper for the subject of LR(k) parsing and a paper that has a similar position with respect to attribute grammars. From an historical perspective, it is also a pleasure to explore the activity around the creation of Algol, its precursors and successors. ... Two chapters are of particular note among the rest. One is Chapter 24, presenting probably one of the earliest detailed performance studies. ... The other, Chapter 25, is relatively recent, was written for Ole-Johan Dahl’s 70th birthday and is a delightful reminder of the elegance and effectiveness of the coroutine as a control abstraction. All in all, a nice addition to the library shelf and a useful source for those who wish to avoid re-inventing wheels. -- Julian Padget, Math Reviews (2004)
The reader will be able to see the early history and development of fundamental concepts that have now become thoroughly integrated into modern software systems. -- G. Grigas, Zentralblatt MATH 1046 (2004)

This book can be ordered from the publisher (CSLI), and also from the distributor (University of Chicago Press).

Here’s the book talk at the Computer History Museum that I gave in 2003 when this book was brand new. (The actual title af that lecture was “A dozen precursors of FORTRAN” but I started by discussing the book as a whole.)

Errata

As usual, I promise to deposit 0x$1.00 ($2.56) to the account of the first person who finds and reports anything that remains technically, historically, typographically, or politically incorrect.

Here is a list of all nits that have been picked so far in the first printing (2003); an asterisk (*) marks technical errors that are not merely typographical:

page xv, line 7 (25 Apr 2004)
change "7," to "7"
page xv, line 14 from the bottom (25 Apr 2004)
change "general purpose" to "general-purpose"
page xv, line 9 from the bottom (25 Apr 2004)
change "EC--13" to "EC-13"
page xvi, line 16 from the bottom (25 Apr 2004)
change "flow charts" to "flowcharts"
page xvi, line 2 from the bottom (03 June 2004)
change "(2003). Copyright (c)2003" to "(2004), pp. 183--208. Copyright (c)2004"
page 3, line 02 of the program (12 Feb 2014)
change "real t; value t;" to "value t; real t;"
page 11, line 23 from the bottom (10 Apr 2011)
change "$R_0$ is of type" to "the components of $R_0$ have type"
page 22, line 10 from the bottom (27 Dec 2009)
change "1,35" to "1, 35"
page 24, line 2 from the bottom (27 Dec 2009)
change "Zurich" to "Zürich"
page 26, line 2 (04 Jan 2010)
change "Sgn(x)" to "Sgn(x)"
page 26, line 6 (20 Jan 2022)
change "04" to "05"
page 27, line 12 from the bottom (27 Dec 2009)
change "i," to "i"
page 27, line 11 from the bottom (27 Dec 2009)
change the second "Op," to "Op"
page 28, line 9 (29 May 2004)
change "h," to "h"
page 37, line 7 from the bottom (22 Sep 2022)
change 'positive' to 'nonnegative'
page 46, lines 10 thru 15 of the code (22 Sep 2022)
they should be changed slightly and reordered:
10   $e=400-y$,
11   CP 3,
12   PRINT $i,y$,
13   SP 4,
14 3 $z=999$,
15   PRINT $i,z$,
page 62, box 3 (03 August 2016)
change $2^{+13}$ to $2^{+23}$
pages 62 and 63, bottom line (03 August 2016)
change +13 to +23
page 63, line 6 from the bottom (03 August 2016)
change $2^{13}$ to $2^{23}$
page 72, line (2) of the code (22 Sep 2022)
change '10(-1)0' to '10 (-1) 0'
page 76, new entry for Table 1 following Brooker’s AUTOCODE (06 February 2009)
Transcode / Hume & Worsley / 1954 / F / A / D / D / C / C / C / Separate subroutine modules
page 90, line 20 (program line TPK 33) (03 August 2016)
change II to I
*page 91, beginning of the Addendum (12 February 2023)
Change "Another language ... this history." to: "Two other languages, Transcode and PACT I, deserve to be part of the story as well, so they appear in Table 1 above although they were unfortunately missed by the authors when we first compiled this history."
[Now include a 1.5-page detailed discussion of Transcode; this inserted material causes the index to change rather drastically, with many changes not documented below.]
original page 92, line 9 from the bottom (27 Dec 2009)
change "Jr., and Bruce" to "Jr. and Bruce"
page 115, lines 14 and 19 (29 May 2004)
change "ALGOL" to "ALGOL"
page 123, line 4 (16 Feb 2007)
change "1964) 7" to "1964), 7"
page 130, line 6 after Figure 1
change "spaces; that" to "spaces; thus"
page 178, line 4 from the bottom (03 August 2016)
change "three transactions: one for each of the programs (1), (2), (3)" to "four transactions: one for each of the programs (1), (2), (3), plus a control routine that will be described later"
page 200, line 1 (12 June 2004)
change "Statement" to "Statements"
page 212, line 16 (12 June 2004)
change "4H" to "4H    ." (the printer thought this period was a speck of dust and cleaned it off)
*page 213, line 21 (03 August 2016)
change SET C(3)TYPE 1. to GOTO 100.
page 214, line 2 (03 August 2016)
change SORCE.ET. to SORCE.EQUALS.
page 214, line 22 (03 August 2016)
change SORCE .ET. to SORCE .EQUALS.
page 218, line 12 (31 May 2005)
change "according the" to "according to the"
page 224, line 9 (03 August 2016)
change "i"≥"e" to "i">"e"
page 225, line 5 (03 August 2016)
change "variable string" to "string variable"
*page 250, line 2 (05 July 2004)
change "$q_\infty$" to "$q_f, q_\infty$"
page 266, line 17 from the bottom (05 July 2004)
change "e)" to "e"
*page 315, line 10 from the bottom (09 February 2008)
change "{\sl if the following}" to "{\sl if $S\rightarrow^+S$ is false and if the following}"
page 326, line 5 (05 July 2004)
change "60," to "60,''"
*page 329, line 10 (09 February 2008)
add a new sentence: "We shall assume that `$S$' itself is not a sentential form."
*page 330, line 7 (27 January 2022)
change "Here $r$ might be equal to $n$" to "In particular, $r+1=n$"
page 334, line 7 (03 August 2016)
change "where $\alpha$" to "where $A$ is in $N$ and $\alpha$"
page 335, line 13 (03 August 2016)
change "tree (3)." to "tree (3). (Grammar (15) doesn't produce any terminal strings, because [1], ..., [6] are never on the left. But that’s fine: We use (15) only to test (14).)
page 336, line 2 (09 February 2008)
change "above" to "above, together with $S{\dashv}^k$"
*page 337, line 2 from the bottom (09 February 2008)
change "now equals" to "now contains"
page 351, line 3 (03 August 2016)
change "accepted;" to "accepted and followed by a rule of type (ii) or (iii);"
page 359, in reference [5] (02 August 2009)
change "62--66" to "62--67"
page 373, line 4 (03 August 2016)
change $g(q_{j-1},q_j)$ to $g(q_{j-1},a_j)$
page 381, line 1 (03 August 2016)
change "ancestors" to "ancestors and siblings"
page 382, line 23 (03 August 2016)
change "possible" to "potential"
page 383, line after (2.2) (03 August 2016)
change "(see (1.2))" to "(see (1.2) and (2.1))"
page 391, line 9 from the bottom (03 August 2016)
change "of (2.3)" to "of (2.4)"
page 393, line 22 from the bottom (03 August 2016)
change "symbol(text(I))" to "{symbol(text(I))}"
page 393, line 3 from the bottom (29 May 2013)
change "marilyn" to "marilyn,"
page 405, line 21 from the bottom (03 August 2016)
change "substf($E_2$) = subst($E_1$)" to "substf($E_2$) = substf($E_1$)"
page 416, line 11 from the bottom (03 August 2016)
change "See Table 7." to "See Table 7. For simplicity, the instruction numbers $\nu$ in $(\nu:C)$ are assumed to be consecutive."
page 416, replacement for the paragraph before section 4 (03 August 2016)
"Notice that the rules in Table 7 specify a multipass process in a compact, “declarative” manner: First we find msym, hence b; then we count bits and addrs, finding a; then we can evaluate each length(C), fix each loc(ν), and fill in the addresses."
page 417, line 4 (03 August 2016)
change $(k-1)\cdot\rm addrs$ to $k\cdot\rm addrs$
page 418, line 17 (14 November 2016)
change "contains the value $\eta_j$" to "will contain the value $\eta_j$, once $\eta_j$ is known"
page 429, lines 12 and 13 from the bottom (05 May 2011)
change "... my" to "About two weeks ago my", and change "five weeks" to "five days"
page 455, line 17 (04 May 2021)
change "for you separate" to "for you to separate"
page 504, replacement for lines 8 and 9 (23 February 2022)
exit loop if above the root;
Detach your left child from the tree;
page 505, bottom line (02 March 2010)
change "89.]" to "89."
page 513, line 8 from the bottom (03 August 2016)
the correctly rounded percentages are: 68.1 17.6 1.3 0.1 0.3 2.9 2.4 0.7 2.8 0.7
page 514, lines 10 and 11 (03 August 2016)
the correctly rounded percentages are: 58.3 30.6 9.7 1.2
page 530, line 18 (03 August 2016)
change X(I).LT.Q to X(I,J).LT.Q
page 532, replacement for bottom three lines (23 February 2022)
  BL L1$\alpha\gamma\beta$ to L1$\alpha\gamma\beta$ if less.
  BE L4$\gamma$ to L4$\gamma$ if equal.
L1$\gamma\beta\alpha$ ...
page 536, line 4 (05 July 2004)
change "4.8 2.7" to "4.6 2.6"
page 536, line 8 (05 July 2004)
change "9.0" to "8.6"
page 536, line 9 (05 July 2004)
change "2.0 2.1 5.4" to "2.1 2.1 5.3"
page 536, line 15 (05 July 2004)
change "1.1 1.1" to "1.2 1.1"
page 536, line 17 (05 July 2004)
change "1.6" to "1.7"
page 536, line 18 (05 July 2004)
change "1.1" to "1.2"
page 537, line 11 from the bottom (31 May 2005)
change "elevenfold" to "eleven-fold"
page 540, line 21 (23 February 2022)
change "2r5" to "r5"
page 545, line 16 (03 June 2004)
change "2003)." to "2004), 183--208."
page 572, line 17 from the bottom (23 February 2022)
change "Section 2" to "Section 1"
page 574, line 5 from the bottom (26 June 2017)
change "method for" to "mechanism for"
pages 576 and 579 in entries for de Bakker
change "Wilhelm" to "Willem"
page 579, left column (28 August 2003)
change "Čulik" to "Čulík" (twice)
page 579, right column, Dijkstra entry (28 Mar 2005)
change "Wijbe" to "Wybe"
page 580, left column, new entry (08 July 2005)
Empty string, 300, 357, 367, 374.
page 580, left column (11 November 2003)
change "Elsworth, A. Kenton" to "Elsworth, Adolph Kenton"
page 582, left column, Gordon entry (29 October 2020)
change "Goeffrey" to "Geoffrey"
page 582, left column, Greibach entry (14 October 2009)
change "Shiela" to "Sheila"
page 582, right column, Haynam entry (16 October 2009)
change "E." to "Elmer"
page 582, right column, new entry (06 February 2009)
Hume, James Nairn Patterson, 76, 91--92.
page 584, the entry for Knuth, Donald (02 August 2009)
change "xvi" to "xiv"
page 585, right column (08 August 2012)
change "Luke, Richard C." to "Luke, Richard Carlen"
page 586, left column (08 February 2008)
change "Metropolis, Nicolas" to "Metropolis, Nicholas"
page 588, right column (15 August 2003)
change "Price, Alan G." to "Price, Alan George"
page 589, left column (19 March 2011)
change "pushdown automation" to "pushdown automaton"
page 591, left column (06 August 2007)
change "Speroni, Joseph P." to "Speroni, Joseph Paul"
page 593, right column (15 August 2006)
change "Warren, Don W." to "Warren, Don Wyman"
page 594, left column, new entry (06 February 2009)
Worsley, Beatrice Helen, 76, 91--92.
page 594, Zuse entry (23 November 2023)
change "Konrad" to "Konrad Ernst Otto"

I hope the book is otherwise error-free; but (sigh) it probably isn't, because each page presented me with hundreds of opportunities to make mistakes. Please send suggested corrections to knuth-bug@cs.stanford.edu, or send snail mail to Prof. D. Knuth, Computer Science Department, Gates Building 1B, Stanford University, Stanford, CA 94305-9015 USA. In either case please include your postal address, so that I can mail an official certificate of deposit as a token of thanks for any improvements to which you have contributed.

I may not be able to read your message until many months have gone by, because I'm working intensively on The Art of Computer Programming. However, I promise to reply in due time.

DO NOT SEND EMAIL TO KNUTH-BUG EXCEPT TO REPORT ERRORS IN BOOKS! And if you do report an error via email, please do not include attachments of any kind; your message should be readable on brand-X operating systems for all values of X.

Don Knuth’s home page

Don Knuth’s other books

Valid HTML 4.01 Transitional