The FOO Language
So, enough theory. I’ll give you a quick overview of the language implemented by FOO, and then you’ll look at the implementation of the two FOO language processors—the interpreter, in this chapter, and the compiler, in the next.
Like Lisp itself, the basic syntax of the FOO language is defined in terms of forms made up of Lisp objects. The language defines how each legal FOO form is translated into HTML.
The simplest FOO forms are self-evaluating Lisp objects such as strings, numbers, and keyword symbols.3 You’ll need a function self-evaluating-p that tests whether a given object is self-evaluating for FOO’s purposes.
(defun self-evaluating-p (form)
  (and (atom form) (if (symbolp form) (keywordp form) t)))
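As a quick check of the predicate, the following sketch repeats the definition so it can be loaded on its own and then tries it on a few objects. The sample objects are illustrative choices, not taken from the FOO sources.

```lisp
;; Same definition as above, repeated so this snippet loads by itself.
(defun self-evaluating-p (form)
  (and (atom form) (if (symbolp form) (keywordp form) t)))

(self-evaluating-p "foo")       ; true: a string is an atom and not a symbol
(self-evaluating-p 10)          ; true: so is a number
(self-evaluating-p :foo)        ; true: a symbol, but a keyword
(self-evaluating-p 'foo)        ; false: a non-keyword symbol
(self-evaluating-p '(:p "foo")) ; false: a cons is not an atom
(self-evaluating-p nil)         ; false: NIL is a symbol but not a keyword
```

Note the last case: NIL is an atom, but because it is also a non-keyword symbol, it is not self-evaluating as far as this predicate is concerned.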
Objects that satisfy this predicate will be emitted by converting them to strings with **PRINC-TO-STRING** and then escaping any reserved characters, such as <, >, or &. When the value is being emitted as an attribute, the characters " and ' are also escaped. Thus, you can invoke the html macro on a self-evaluating object to emit it to *html-output* (which is initially bound to ***STANDARD-OUTPUT***). Table 30-1 shows how a few different self-evaluating values will be output.
Table 30-1. FOO Output for Self-Evaluating Objects
| FOO Form | Generated HTML |
|---|---|
| "foo" | foo |
| 10 | 10 |
| :foo | FOO |
| "foo & bar" | foo &amp; bar |
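To make the conversion rule concrete, here is a minimal sketch of the two steps, **PRINC-TO-STRING** followed by escaping. The names sketch-escape and sketch-emit-string are hypothetical and are not FOO's actual functions; the entity names chosen for the quote characters are likewise an assumption. The sketch returns a string rather than writing to *html-output*.

```lisp
;; Hypothetical helper, not part of FOO: replace reserved characters with
;; entities. Quote characters are only escaped when ATTRIBUTE-P is true.
(defun sketch-escape (string &optional attribute-p)
  (with-output-to-string (out)
    (loop for char across string
          do (case char
               (#\< (write-string "&lt;" out))
               (#\> (write-string "&gt;" out))
               (#\& (write-string "&amp;" out))
               (#\" (if attribute-p
                        (write-string "&quot;" out)
                        (write-char char out)))
               (#\' (if attribute-p
                        (write-string "&apos;" out)
                        (write-char char out)))
               (t (write-char char out))))))

;; Hypothetical helper: convert any self-evaluating object to escaped text.
(defun sketch-emit-string (object &optional attribute-p)
  (sketch-escape (princ-to-string object) attribute-p))

(sketch-emit-string "foo & bar")     ; => "foo &amp; bar"
(sketch-emit-string :foo)            ; => "FOO" with the default printer settings
(sketch-emit-string "say \"hi\"" t)  ; => "say &quot;hi&quot;"
```

The keyword case shows why :foo comes out as FOO: **PRINC-TO-STRING** prints the symbol's name without the leading colon, and with the default reader and printer settings that name is uppercase.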
Of course, most HTML consists of tagged elements. The three pieces of information that describe each element are the tag, a set of attributes, and a body containing text and/or more HTML elements. Thus, you need a way to represent these three pieces of information as Lisp objects, preferably ones that the Lisp reader already knows how to read.4 If you forget about attributes for a moment, there’s an obvious mapping between Lisp lists and HTML elements: any HTML element can be represented by a list whose **FIRST** is a symbol whose name is the name of the element’s tag and whose **REST** is a list of self-evaluating objects or lists representing other HTML elements. Thus:
<p>Foo</p> <==> (:p "Foo")
<p><i>Now</i> is the time</p> <==> (:p (:i "Now") " is the time")
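The mapping is simple enough that a few lines of Lisp can walk it. The following is only an illustrative sketch under stated assumptions: sketch-render is a hypothetical name, it handles attribute-free forms only, it skips escaping, and it returns a string instead of writing to a stream. It is not how FOO's interpreter or compiler is implemented.

```lisp
;; Hypothetical sketch, not FOO's implementation: render an attribute-free
;; FOO form as an HTML string. Escaping is deliberately omitted.
(defun sketch-render (form)
  (if (atom form)
      ;; Self-evaluating objects become their printed representation.
      (princ-to-string form)
      ;; A list becomes an element named after the downcased tag symbol,
      ;; with each body item rendered recursively.
      (let ((tag (string-downcase (symbol-name (first form)))))
        (format nil "<~a>~{~a~}</~a>"
                tag
                (mapcar #'sketch-render (rest form))
                tag))))

(sketch-render '(:p "Foo"))
;; => "<p>Foo</p>"
(sketch-render '(:p (:i "Now") " is the time"))
;; => "<p><i>Now</i> is the time</p>"
```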
Now the only problem is where to squeeze in the attributes. Since most elements have no attributes, it’d be nice if you could use the preceding syntax for elements without attributes. FOO provides two ways to notate elements with attributes. The first is to simply include the attributes in the list immediately following the symbol, alternating keyword symbols naming the attributes and objects representing the attribute value forms. The body of the element starts with the first item in the list that’s in a position to be an attribute name and isn’t a keyword symbol. Thus:
HTML> (html (:p "foo"))
<p>foo</p>
NIL
HTML> (html (:p "foo " (:i "bar") " baz"))
<p>foo <i>bar</i> baz</p>
NIL
HTML> (html (:p :style "foo" "Foo"))
<p style='foo'>Foo</p>
NIL
HTML> (html (:p :id "x" :style "foo" "Foo"))
<p id='x' style='foo'>Foo</p>
NIL
For folks who prefer a bit more obvious delineation between the element’s attributes and its body, FOO supports an alternative syntax: if the first element of a list is itself a list with a keyword as its first element, then the outer list represents an HTML element with that keyword indicating the tag, with the **REST** of the nested list as the attributes, and with the **REST** of the outer list as the body. Thus, you could write the previous two expressions like this:
HTML> (html ((:p :style "foo") "Foo"))
<p style='foo'>Foo</p>
NIL
HTML> (html ((:p :id "x" :style "foo") "Foo"))
<p id='x' style='foo'>Foo</p>
NIL
The following function tests whether a given object matches either of these syntaxes:
(defun cons-form-p (form &optional (test #'keywordp))
  (and (consp form)
       (or (funcall test (car form))
           (and (consp (car form)) (funcall test (caar form))))))
You should parameterize the test function because later you’ll need to test the same two syntaxes with a slightly different predicate on the name.
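The following sketch repeats the definition so it loads on its own and shows the predicate on both syntaxes, along with the effect of passing a different test. Using **SYMBOLP** as the alternative test is purely an illustrative assumption; it is not necessarily the predicate used later.

```lisp
;; Same definition as above, repeated so this snippet loads by itself.
(defun cons-form-p (form &optional (test #'keywordp))
  (and (consp form)
       (or (funcall test (car form))
           (and (consp (car form)) (funcall test (caar form))))))

(cons-form-p '(:p "Foo"))                 ; true: implicit-attributes syntax
(cons-form-p '((:p :style "foo") "Foo"))  ; true: explicit-attributes syntax
(cons-form-p "Foo")                       ; false: not a cons at all
(cons-form-p '(p "Foo"))                  ; false: P is not a keyword
(cons-form-p '(p "Foo") #'symbolp)        ; true: the test now accepts any symbol
```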
To completely abstract the differences between the two syntax variants, you can define a function, parse-cons-form, that takes a form and parses it into three elements (the tag, the attributes plist, and the body list), returning them as multiple values. The code that actually evaluates cons forms will use this function and not have to worry about which syntax was used.
(defun parse-cons-form (sexp)
  (if (consp (first sexp))
      (parse-explicit-attributes-sexp sexp)
      (parse-implicit-attributes-sexp sexp)))

(defun parse-explicit-attributes-sexp (sexp)
  (destructuring-bind ((tag &rest attributes) &body body) sexp
    (values tag attributes body)))

(defun parse-implicit-attributes-sexp (sexp)
  (loop with tag = (first sexp)
        for rest on (rest sexp) by #'cddr
        while (and (keywordp (first rest)) (second rest))
        when (second rest)
          collect (first rest) into attributes and
          collect (second rest) into attributes
        end
        finally (return (values tag attributes rest))))
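To see that both syntaxes parse to the same three values, the sketch below repeats the definitions so it loads on its own and collects the results with **MULTIPLE-VALUE-LIST**. The sample forms are the ones from the earlier REPL examples; the last call is an extra case chosen to show a consequence of the **LOOP** termination test.

```lisp
;; Same definitions as above, repeated so this snippet loads by itself.
(defun parse-explicit-attributes-sexp (sexp)
  (destructuring-bind ((tag &rest attributes) &body body) sexp
    (values tag attributes body)))

(defun parse-implicit-attributes-sexp (sexp)
  (loop with tag = (first sexp)
        for rest on (rest sexp) by #'cddr
        while (and (keywordp (first rest)) (second rest))
        when (second rest)
          collect (first rest) into attributes and
          collect (second rest) into attributes
        end
        finally (return (values tag attributes rest))))

(defun parse-cons-form (sexp)
  (if (consp (first sexp))
      (parse-explicit-attributes-sexp sexp)
      (parse-implicit-attributes-sexp sexp)))

;; Implicit syntax: attributes follow the tag directly.
(multiple-value-list (parse-cons-form '(:p :id "x" :style "foo" "Foo")))
;; => (:P (:ID "x" :STYLE "foo") ("Foo"))

;; Explicit syntax: tag and attributes grouped in a nested list.
(multiple-value-list (parse-cons-form '((:p :id "x" :style "foo") "Foo")))
;; => (:P (:ID "x" :STYLE "foo") ("Foo"))

;; No attributes: the attribute list is empty and everything is body.
(multiple-value-list (parse-cons-form '(:p "foo " (:i "bar") " baz")))
;; => (:P NIL ("foo " (:I "bar") " baz"))

;; In the implicit syntax a NIL value stops attribute parsing, so the
;; keyword and the NIL end up in the body.
(multiple-value-list (parse-cons-form '(:p :id nil "Foo")))
;; => (:P NIL (:ID NIL "Foo"))
```

One thing the last call suggests: because the while clause tests the attribute value as well as the name, a literal NIL attribute value cannot be written with the implicit syntax, whereas the explicit syntax keeps it in the attribute list.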
Now that you have the basic language specified, you can think about how you’re actually going to implement the language processors. How do you get from a series of FOO forms to the desired HTML? As I mentioned previously, you’ll be implementing two language processors for FOO: an interpreter that walks a tree of FOO forms and emits the corresponding HTML directly and a compiler that walks a tree and translates it into Common Lisp code that’ll emit the same HTML. Both the interpreter and compiler will be built on top of a common foundation of code, which provides support for things such as escaping reserved characters and generating nicely indented output, so it makes sense to start there.