Primitive Binary Types

While define-binary-class and define-tagged-binary-class make it easy to define composite structures, you still have to write read-value and write-value methods for primitive data types by hand. You could decide to live with that, specifying that users of the library need to write appropriate methods on read-value and write-value to support the primitive types used by their binary classes.

However, rather than having to document how to write a suitable read-value/write-value pair, you can provide a macro to do it automatically. This also has the advantage of making the abstraction created by define-binary-class less leaky. Currently, define-binary-class depends on having methods on read-value and write-value defined in a particular way, but that’s really just an implementation detail. By defining a macro that generates the read-value and write-value methods for primitive types, you hide those details behind an abstraction you control. If you decide later to change the implementation of define-binary-class, you can change your primitive-type-defining macro to meet the new requirements without requiring any changes to code that uses the binary data library.

So you should define one last macro, define-binary-type, that will generate read-value and write-value methods for reading and writing values represented by instances of existing classes, rather than by classes defined with define-binary-class.

For a concrete example, consider a type used in the id3-tag class, a fixed-length string encoded in ISO-8859-1 characters. I’ll assume, as I did earlier, that the native character encoding of your Lisp is ISO-8859-1 or a superset, so you can use **CODE-CHAR** and **CHAR-CODE** to translate bytes to characters and back.
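To see what that assumption buys you, here's a minimal sketch of the byte-to-character translation on its own. The helper names bytes-to-string and string-to-bytes are made up for illustration and aren't part of the library. The sketch assumes, as the text does, that character codes 0 through 255 in your Lisp correspond to ISO-8859-1.

```lisp
;; Hypothetical helpers, not part of the binary data library. They assume
;; character codes 0-255 in this Lisp match ISO-8859-1.
(defun bytes-to-string (bytes)
  "Turn a sequence of byte values into a string, one character per byte."
  (map 'string #'code-char bytes))

(defun string-to-bytes (string)
  "Turn a string back into a list of byte values, one byte per character."
  (map 'list #'char-code string))

;; Round trip: every byte value from 0 to 255 survives the conversion.
(let ((bytes (loop for i from 0 below 256 collect i)))
  (assert (equal bytes (string-to-bytes (bytes-to-string bytes)))))
```

If your Lisp used a different native encoding for those codes, the round trip would still succeed, but the characters you got back wouldn't be the ones the file's author intended.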

As always, your goal is to write a macro that allows you to express only the essential information needed to generate the required code. In this case, there are four pieces of essential information: the name of the type, iso-8859-1-string; the **&key** parameters that should be accepted by the read-value and write-value methods, length in this case; the code for reading from a stream; and the code for writing to a stream. Here’s an expression that contains those four pieces of information:

    (define-binary-type iso-8859-1-string (length)
      (:reader (in)
        (let ((string (make-string length)))
          (dotimes (i length)
            (setf (char string i) (code-char (read-byte in))))
          string))
      (:writer (out string)
        (dotimes (i length)
          (write-byte (char-code (char string i)) out))))

Now you just need a macro that can take apart this form and put it back together in the form of two **DEFMETHOD**s wrapped in a **PROGN**. If you define the parameter list to define-binary-type like this:

    (defmacro define-binary-type (name (&rest args) &body spec) ...

then within the macro the parameter spec will be a list containing the reader and writer definitions. You can then use **ASSOC** to extract the elements of spec using the tags :reader and :writer and then use **DESTRUCTURING-BIND** to take apart the **REST** of each element.[10]
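You can try this parsing step at the REPL before writing the macro. The following sketch assumes a simplified spec whose bodies just read and write a single byte. The names \*spec\*, clause-parts, reader-parts, and writer-parts are hypothetical and exist only for this illustration.

```lisp
;; What the macro would see in SPEC for a (simplified) long-form call.
(defparameter *spec*
  '((:reader (in)
     (read-byte in))
    (:writer (out value)
     (write-byte value out))))

;; ASSOC finds the clause by its tag; REST drops the tag itself.
(defun clause-parts (tag spec)
  (rest (assoc tag spec)))

;; DESTRUCTURING-BIND names the pieces of the :reader clause.
(defun reader-parts (spec)
  (destructuring-bind ((in) &body body) (clause-parts :reader spec)
    (list :stream-var in :body body)))

;; The :writer clause has two variables: the stream and the value.
(defun writer-parts (spec)
  (destructuring-bind ((out value) &body body) (clause-parts :writer spec)
    (list :stream-var out :value-var value :body body)))

(reader-parts *spec*)
;; => (:STREAM-VAR IN :BODY ((READ-BYTE IN)))
```

Because **ASSOC** looks clauses up by tag, the :reader and :writer clauses can appear in either order in the original form.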

From there it’s just a matter of interpolating the extracted values into the backquoted templates of the read-value and write-value methods.

    (defmacro define-binary-type (name (&rest args) &body spec)
      (with-gensyms (type)
        `(progn
           ,(destructuring-bind ((in) &body body) (rest (assoc :reader spec))
              `(defmethod read-value ((,type (eql ',name)) ,in &key ,@args)
                 ,@body))
           ,(destructuring-bind ((out value) &body body) (rest (assoc :writer spec))
              `(defmethod write-value ((,type (eql ',name)) ,out ,value &key ,@args)
                 ,@body)))))

Note how the backquoted templates are nested: the outermost template starts with the backquoted **PROGN** form. That template consists of the symbol **PROGN** and two comma-unquoted **DESTRUCTURING-BIND** expressions. Thus, the outer template is filled in by evaluating the **DESTRUCTURING-BIND** expressions and interpolating their values. Each **DESTRUCTURING-BIND** expression in turn contains another backquoted template, which is used to generate one of the method definitions to be interpolated in the outer template.
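The same nesting can be seen in miniature with an ordinary function. In this sketch, make-definition-form is a hypothetical helper, not part of the library: the outer template supplies the **PROGN**, and the unquoted **DESTRUCTURING-BIND** expression returns an inner template that gets spliced into it.

```lisp
;; Hypothetical helper that mirrors the shape of DEFINE-BINARY-TYPE:
;; an outer backquoted PROGN with one unquoted expression, whose value
;; is produced by an inner backquoted template.
(defun make-definition-form (spec)
  `(progn
     ,(destructuring-bind (name &body body) spec
        `(defun ,name () ,@body))))

(make-definition-form '(answer 42))
;; => (PROGN (DEFUN ANSWER () 42))
```

The inner template is filled in first, when the **DESTRUCTURING-BIND** form is evaluated, and its result is then dropped into the outer one.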

With this macro defined, the define-binary-type form given previously expands to this code:

    (progn
      (defmethod read-value ((#:g1618 (eql 'iso-8859-1-string)) in &key length)
        (let ((string (make-string length)))
          (dotimes (i length)
            (setf (char string i) (code-char (read-byte in))))
          string))
      (defmethod write-value ((#:g1618 (eql 'iso-8859-1-string)) out string &key length)
        (dotimes (i length)
          (write-byte (char-code (char string i)) out))))

Of course, now that you’ve got this nice macro for defining binary types, it’s tempting to make it do a bit more work. For now you should just make one small enhancement that will turn out to be pretty handy when you start using this library to deal with actual formats such as ID3 tags.

ID3 tags, like many other binary formats, use lots of primitive types that are minor variations on a theme, such as unsigned integers in one-, two-, three-, and four-byte varieties. You could certainly define each of those types with define-binary-type as it stands. Or you could factor out the common algorithm for reading and writing n-byte unsigned integers into helper functions.
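For comparison, the helper-function approach might look something like the following sketch. The function names are hypothetical, and the big-endian byte order (most significant byte first) is an assumption of this example rather than something established so far.

```lisp
;; Hypothetical helpers, not part of the library. Both assume big-endian
;; byte order and a stream whose element type is (unsigned-byte 8).
(defun read-unsigned-integer (in bytes)
  "Read BYTES bytes from IN, most significant first, and return an integer."
  (let ((value 0))
    (dotimes (i bytes value)
      ;; Shift what has been read so far up one byte and add the next byte.
      (setf value (+ (* value 256) (read-byte in))))))

(defun write-unsigned-integer (out value bytes)
  "Write VALUE to OUT as BYTES bytes, most significant first."
  (loop for shift from (* 8 (1- bytes)) downto 0 by 8
        do (write-byte (ldb (byte 8 shift) value) out)))
```

Each one-, two-, three-, or four-byte type would then need its own :reader and :writer that do nothing but call these helpers with a different byte count, which is the repetition the enhancement that follows is meant to remove.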

But suppose you had already defined a binary type, unsigned-integer, that accepts a :bytes parameter to specify how many bytes to read and write. Using that type, you could specify a slot representing a one-byte unsigned integer with a type specifier of (unsigned-integer :bytes 1). But if a particular binary format specifies lots of slots of that type, it’d be nice to be able to easily define a new type, say u1, that means the same thing. As it turns out, it’s easy to change define-binary-type to support two forms, a long form consisting of a :reader and :writer pair and a short form that defines a new binary type in terms of an existing type. Using a short form define-binary-type, you can define u1 like this:

    (define-binary-type u1 () (unsigned-integer :bytes 1))

which will expand to this:

    (progn
      (defmethod read-value ((#:g161887 (eql 'u1)) #:g161888 &key)
        (read-value 'unsigned-integer #:g161888 :bytes 1))
      (defmethod write-value ((#:g161887 (eql 'u1)) #:g161888 #:g161889 &key)
        (write-value 'unsigned-integer #:g161888 #:g161889 :bytes 1)))

To support both long- and short-form define-binary-type calls, you need to differentiate based on the value of the spec argument. If spec is two items long, it represents a long-form call, and the two items should be the :reader and :writer specifications, which you extract as before. On the other hand, if it’s only one item long, the one item should be a type specifier, which needs to be parsed differently. You can use **ECASE** to switch on the **LENGTH** of spec and then parse spec and generate an appropriate expansion for either the long form or the short form.

    (defmacro define-binary-type (name (&rest args) &body spec)
      (ecase (length spec)
        (1
         (with-gensyms (type stream value)
           (destructuring-bind (derived-from &rest derived-args) (mklist (first spec))
             `(progn
                (defmethod read-value ((,type (eql ',name)) ,stream &key ,@args)
                  (read-value ',derived-from ,stream ,@derived-args))
                (defmethod write-value ((,type (eql ',name)) ,stream ,value &key ,@args)
                  (write-value ',derived-from ,stream ,value ,@derived-args))))))
        (2
         (with-gensyms (type)
           `(progn
              ,(destructuring-bind ((in) &body body) (rest (assoc :reader spec))
                 `(defmethod read-value ((,type (eql ',name)) ,in &key ,@args)
                    ,@body))
              ,(destructuring-bind ((out value) &body body) (rest (assoc :writer spec))
                 `(defmethod write-value ((,type (eql ',name)) ,out ,value &key ,@args)
                    ,@body)))))))
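To see both forms working together, here's a sketch you can load into a fresh image. The macro and its supporting definitions are repeated so the example stands on its own. The definitions of with-gensyms and mklist and the two **DEFGENERIC** forms are assumptions standing in for whatever your library already provides, and the body of unsigned-integer, including its big-endian byte order, is one plausible implementation rather than a definitive one.

```lisp
;; Supporting definitions, assumed here so the sketch loads on its own.
(defmacro with-gensyms ((&rest names) &body body)
  `(let ,(loop for n in names collect `(,n (make-symbol ,(string n))))
     ,@body))

;; MKLIST is called while the macro expands, so it must exist at compile time.
(eval-when (:compile-toplevel :load-toplevel :execute)
  (defun mklist (x) (if (listp x) x (list x))))

(defgeneric read-value (type stream &key))
(defgeneric write-value (type stream value &key))

;; The two-form macro from the text, repeated unchanged.
(defmacro define-binary-type (name (&rest args) &body spec)
  (ecase (length spec)
    (1
     (with-gensyms (type stream value)
       (destructuring-bind (derived-from &rest derived-args) (mklist (first spec))
         `(progn
            (defmethod read-value ((,type (eql ',name)) ,stream &key ,@args)
              (read-value ',derived-from ,stream ,@derived-args))
            (defmethod write-value ((,type (eql ',name)) ,stream ,value &key ,@args)
              (write-value ',derived-from ,stream ,value ,@derived-args))))))
    (2
     (with-gensyms (type)
       `(progn
          ,(destructuring-bind ((in) &body body) (rest (assoc :reader spec))
             `(defmethod read-value ((,type (eql ',name)) ,in &key ,@args)
                ,@body))
          ,(destructuring-bind ((out value) &body body) (rest (assoc :writer spec))
             `(defmethod write-value ((,type (eql ',name)) ,out ,value &key ,@args)
                ,@body)))))))

;; Long form: an n-byte unsigned integer, most significant byte first
;; (the byte order is an assumption of this sketch).
(define-binary-type unsigned-integer (bytes)
  (:reader (in)
    (loop with value = 0
          for low-bit downfrom (* 8 (1- bytes)) to 0 by 8
          do (setf (ldb (byte 8 low-bit) value) (read-byte in))
          finally (return value)))
  (:writer (out value)
    (loop for low-bit downfrom (* 8 (1- bytes)) to 0 by 8
          do (write-byte (ldb (byte 8 low-bit) value) out))))

;; Short forms: new names for particular sizes of UNSIGNED-INTEGER.
(define-binary-type u1 () (unsigned-integer :bytes 1))
(define-binary-type u2 () (unsigned-integer :bytes 2))
```

With these definitions loaded, (read-value 'u2 in) reads two bytes from a binary input stream in, and (write-value 'u2 out 513) writes the bytes 2 and 1 to a binary output stream out. Any other spec length signals an error from **ECASE**, which is a reasonable way to reject a malformed definition at macro-expansion time.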