20. The Special Operators - Local Flow of Control - 《Practical Common Lisp》

Local Flow of Control

Local Flow of Control

The next four special operators I’ll discuss also create and use names in the lexical environment but for the purposes of altering the flow of control rather than defining new functions and macros. I’ve mentioned all four of these special operators in passing because they provide the underlying mechanisms used by other language features. They’re **BLOCK**, **RETURN-FROM**, **TAGBODY**, and **GO**. The first two, **BLOCK** and **RETURN-FROM**, are used together to write code that returns immediately from a section of code—I discussed **RETURN-FROM** in Chapter 5 as a way to return immediately from a function, but it’s more general than that. The other two, **TAGBODY** and **GO**, provide a quite low-level goto construct that’s the basis for all the higher-level looping constructs you’ve already seen.

The basic skeleton of a **BLOCK** form is this:

(block name
  form*)

The name is a symbol, and the forms are Lisp forms. The forms are evaluated in order, and the value of the last form is returned as the value of the **BLOCK** unless a **RETURN-FROM** is used to return from the block early. A **RETURN-FROM** form, as you saw in Chapter 5, consists of the name of the block to return from and, optionally, a form that provides a value to return. When a **RETURN-FROM** is evaluated, it causes the named **BLOCK** to return immediately. If **RETURN-FROM** is called with a return value form, the **BLOCK** will return the resulting value; otherwise, the **BLOCK** evaluates to **NIL**.

A **BLOCK** name can be any symbol, which includes **NIL**. Many of the standard control construct macros, such as **DO**, **DOTIMES**, and **DOLIST**, generate an expansion consisting of a **BLOCK** named **NIL**. This allows you to use the **RETURN** macro, which is a bit of syntactic sugar for (return-from nil ...), to break out of such loops. Thus, the following loop will print at most ten random numbers, stopping as soon as it gets a number greater than 50:

(dotimes (i 10)
  (let ((answer (random 100)))
    (print answer)
    (if (> answer 50) (return))))

Function-defining macros such as **DEFUN**, **FLET**, and **LABELS**, on the other hand, wrap their bodies in a **BLOCK** with the same name as the function. That’s why you can use **RETURN-FROM** to return from a function.

**TAGBODY** and **GO** have a similar relationship to each other as **BLOCK** and **RETURN-FROM**: a **TAGBODY** form defines a context in which names are defined that can be used by **GO**. The skeleton of a **TAGBODY** is as follows:

(tagbody
  tag-or-compound-form*)

where each tag-or-compound-form is either a symbol, called a tag, or a nonempty list form. The list forms are evaluated in order and the tags ignored, except as I’ll discuss in a moment. After the last form of the **TAGBODY** is evaluated, the **TAGBODY** returns **NIL**. Anywhere within the lexical scope of the **TAGBODY** you can use the **GO** special operator to jump immediately to any of the tags, and evaluation will resume with the form following the tag. For instance, you can write a trivial infinite loop with **TAGBODY** and **GO** like this:

(tagbody
 top
   (print 'hello)
   (go top))

Note that while the tag names must appear at the top level of the **TAGBODY**, not nested within other forms, the **GO** special operator can appear anywhere within the scope of the **TAGBODY**. This means you could write a loop that loops a random number of times like this:

(tagbody
 top
   (print 'hello)
   (when (plusp (random 10)) (go top)))

An even sillier example of **TAGBODY**, which shows you can have multiple tags in a single **TAGBODY**, looks like this:

(tagbody
 a (print 'a) (if (zerop (random 2)) (go c))
 b (print 'b) (if (zerop (random 2)) (go a))
 c (print 'c) (if (zerop (random 2)) (go b)))

This form will jump around randomly printing as, bs, and cs until eventually the last **RANDOM** expression returns 1 and the control falls off the end of the **TAGBODY**.

**TAGBODY** is rarely used directly since it’s almost always easier to write iterative constructs in terms of the existing looping macros. It’s handy, however, for translating algorithms written in other languages into Common Lisp, either automatically or manually. An example of an automatic translation tool is the FORTRAN-to-Common Lisp translator, f2cl, that translates FORTRAN source code into Common Lisp in order to make various FORTRAN libraries available to Common Lisp programmers. Since many FORTRAN libraries were written before the structured programming revolution, they’re full of gotos. The f2cl compiler can simply translate those gotos to **GO**s within appropriate **TAGBODY**s.5

Similarly, **TAGBODY** and **GO** can be handy when translating algorithms described in prose or by flowcharts—for instance, in Donald Knuth’s classic series The Art of Computer Programming, he describes algorithms using a “recipe” format: step 1, do this; step 2, do that; step 3, go back to step 2; and so on. For example, on page 142 of The Art of Computer Programming, Volume 2: Seminumerical Algorithms, Third Edition (Addison-Wesley, 1998), he describes Algorithm S, which you’ll use in Chapter 27, in this form:

Algorithm S (Selection sampling technique). To select n records at random from a set of N, where 0 < n <= N.

S1. [Initialize.] Set t <— 0, m <— 0. (During this algorithm, m represents the number of records selected so far, and t is the total number of input records that we have dealt with.)

S2. [Generate U.] Generate a random number U, uniformly distributed between zero and one.

S3. [Test.] If (N - t)U >= n - m, go to step S5.

S4. [Select.] Select the next record for the sample, and increase m and t by 1. If m < n, go to step S2; otherwise the sample is complete and the algorithm terminates.

S5. [Skip.] Skip the next record (do not include it in the sample), increase t by 1, and go back to step S2.

This description can be easily translated into a Common Lisp function, after renaming a few variables, as follows:

(defun algorithm-s (n max) ; max is N in Knuth's algorithm
  (let (seen               ; t in Knuth's algorithm
        selected           ; m in Knuth's algorithm
        u                  ; U in Knuth's algorithm
        (records ()))      ; the list where we save the records selected
    (tagbody
     s1
       (setf seen 0)
       (setf selected 0)
     s2
       (setf u (random 1.0))
     s3
       (when (>= (* (- max seen) u) (- n selected)) (go s5))
     s4
       (push seen records)
       (incf selected)
       (incf seen)
       (if (< selected n)
           (go s2)
           (return-from algorithm-s (nreverse records)))
     s5
       (incf seen)
       (go s2))))

It’s not the prettiest code, but it’s easy to verify that it’s a faithful translation of Knuth’s algorithm. But, this code, unlike Knuth’s prose description, can be run and tested. Then you can start refactoring, checking after each change that the function still works.6

After pushing the pieces around a bit, you might end up with something like this:

(defun algorithm-s (n max)
  (loop for seen from 0
     when (< (* (- max seen) (random 1.0)) n)
     collect seen and do (decf n)
     until (zerop n)))

While it may not be immediately obvious that this code correctly implements Algorithm S, if you got here via a series of functions that all behave identically to the original literal translation of Knuth’s recipe, you’d have good reason to believe it’s correct.