ID3 Frames
As I discussed earlier, the bulk of an ID3 tag is divided into frames. Each frame has a structure similar to that of the tag as a whole. Each frame starts with a header indicating what kind of frame it is and the size of the frame in bytes. The structure of the frame header changed slightly between version 2.2 and version 2.3 of the ID3 format, and eventually you’ll have to deal with both forms. To start, you can focus on parsing version 2.2 frames.
The header of a 2.2 frame consists of three bytes that encode a three-character ISO 8859-1 string followed by a three-byte unsigned integer, which specifies the size of the frame in bytes, excluding the six-byte header. The string identifies what type of frame it is, which determines how you parse the data following the size. This is exactly the kind of situation for which you defined the define-tagged-binary-class macro. You can define a tagged class that reads the frame header and then dispatches to the appropriate concrete class using a function that maps IDs to class names.
(define-tagged-binary-class id3-frame ()
  ((id (iso-8859-1-string :length 3))
   (size u3))
  (:dispatch (find-frame-class id)))
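To make the six-byte layout concrete, here is a minimal sketch that decodes a 2.2 frame header from a plain vector of bytes, without using the binary-data macros. The function name decode-v22-frame-header and the sample bytes are hypothetical, chosen only for illustration. The sketch also assumes that your Lisp's character codes agree with ISO 8859-1 for codes below 256, which is the case in Unicode-based implementations.

```lisp
(defun decode-v22-frame-header (bytes &optional (start 0))
  "Return the frame ID and the data size encoded in the six bytes of
BYTES beginning at START. Hypothetical helper, for illustration only."
  (let ((id (map 'string #'code-char (subseq bytes start (+ start 3))))
        (size 0))
    ;; The size is a three-byte unsigned integer, most significant byte first.
    (loop for i from (+ start 3) below (+ start 6)
          do (setf size (+ (ash size 8) (aref bytes i))))
    (values id size)))

;; "TT2" followed by the size bytes 00 01 02, that is, 258 bytes of frame data.
(defparameter *sample-header* #(84 84 50 0 1 2))
```

Evaluating (decode-v22-frame-header *sample-header*) returns the string "TT2" and the integer 258 as two values. The size counts only the data that follows, so this frame would occupy 264 bytes in the tag once you include the header.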
Now you’re ready to start implementing concrete frame classes. However, the specification defines quite a few—63 in version 2.2 and even more in later specs. Even considering frame types that share a common structure to be equivalent, you’ll still find 24 unique frame types in version 2.2. But only a few of these are used “in the wild.” So rather than immediately setting to work defining classes for each of the frame types, you can start by writing a generic frame class that lets you read the frames in a tag without parsing the data within the frames themselves. This will give you a way to find out what frames are actually present in the MP3s you want to process. You’ll need this class eventually anyway because the specification allows for experimental frames that you’ll need to be able to read without parsing.
Since the size field of the frame header tells you exactly how many bytes long the frame is, you can define a generic-frame class that extends id3-frame and adds a single field, data, that will hold an array of bytes.
(define-binary-class generic-frame (id3-frame)
  ((data (raw-bytes :size size))))
The type of the data field, raw-bytes, just needs to hold an array of bytes. You can define it like this:
(define-binary-type raw-bytes (size)
  (:reader (in)
    (let ((buf (make-array size :element-type '(unsigned-byte 8))))
      (read-sequence buf in)
      buf))
  (:writer (out buf)
    (write-sequence buf out)))
For the time being, you’ll want all frames to be read as generic-frames, so you can define the find-frame-class function used in id3-frame’s :dispatch expression to always return generic-frame, regardless of the frame’s id.
(defun find-frame-class (id)
  (declare (ignore id))
  'generic-frame)
Now you need to modify id3-tag so it’ll read frames after the header fields. There’s only one tricky bit to reading the frame data: although the tag header tells you how many bytes long the tag is, that number includes the padding that can follow the frame data. Since the tag header doesn’t tell you how many frames the tag contains, the only way to tell when you’ve hit the padding is to look for a null byte where you’d expect a frame identifier.
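The padding check is easier to see on an in-memory example. The following sketch walks a vector of bytes standing in for the part of a 2.2 tag that follows the tag header, collecting frame IDs and sizes until it finds a null byte where an identifier should start. The function scan-v22-frames and the sample data are hypothetical and exist only for illustration; the sketch works on a vector rather than a stream, so it isn't the reader you'll actually use.

```lisp
(defun scan-v22-frames (bytes)
  "Walk BYTES, the part of a version 2.2 tag after the tag header, and
return a list of (ID . SIZE) pairs plus the number of bytes left over
as padding. Hypothetical helper, for illustration only."
  (let ((pos 0)
        (end (length bytes))
        (frames '()))
    ;; Stop when fewer than six bytes remain or when a null byte sits
    ;; where a frame identifier should start.
    (loop while (and (<= (+ pos 6) end)
                     (/= 0 (aref bytes pos)))
          do (let ((id (map 'string #'code-char (subseq bytes pos (+ pos 3))))
                   (size (+ (ash (aref bytes (+ pos 3)) 16)
                            (ash (aref bytes (+ pos 4)) 8)
                            (aref bytes (+ pos 5)))))
               (push (cons id size) frames)
               ;; Skip the six-byte header and the frame data.
               (incf pos (+ 6 size))))
    (values (nreverse frames) (max 0 (- end pos)))))

(defparameter *sample-body*
  (concatenate 'vector
               #(84 84 50 0 0 2 0 65)     ; "TT2", size 2, two data bytes
               #(84 80 49 0 0 3 0 66 67)  ; "TP1", size 3, three data bytes
               (make-list 10 :initial-element 0))) ; ten bytes of padding
```

Calling (scan-v22-frames *sample-body*) returns (("TT2" . 2) ("TP1" . 3)) and 10: two frames occupying 8 and 9 bytes, followed by 10 bytes of padding in a 27-byte body.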
To handle this, you can define a binary type, id3-frames, that will be responsible for reading the remainder of a tag, creating frame objects to represent all the frames it finds, and then skipping over any padding. This type will take as a parameter the tag size, which it can use to avoid reading past the end of the tag. But the reading code will also need to detect the beginning of the padding that can follow the tag’s frame data. Rather than calling read-value directly in the id3-frames :reader, you should use a function read-frame, which you’ll define to return **NIL** when it detects padding, otherwise returning an id3-frame object read using read-value. Assuming you define read-frame so it reads only one byte past the end of the last frame in order to detect the start of the padding, you can define the id3-frames binary type like this:
(define-binary-type id3-frames (tag-size)
  (:reader (in)
    (loop with to-read = tag-size
          while (plusp to-read)
          for frame = (read-frame in)
          while frame
          do (decf to-read (+ 6 (size frame)))
          collect frame
          finally (loop repeat (1- to-read) do (read-byte in))))
  (:writer (out frames)
    (loop with to-write = tag-size
          for frame in frames
          do (write-value 'id3-frame out frame)
             (decf to-write (+ 6 (size frame)))
          finally (loop repeat to-write do (write-byte 0 out)))))
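The reader's bookkeeping can be checked by hand. The sketch below, using a hypothetical function name and made-up sizes, repeats the same subtraction outside of any stream: each frame costs six header bytes plus its data, and one more byte is deducted for the null byte that read-frame is assumed to have consumed when it detected the padding.

```lisp
(defun padding-left-to-skip (tag-size frame-sizes)
  "Mirror the reader's accounting: subtract each frame's six-byte header
and data from TAG-SIZE, then allow for the one padding byte already
consumed by the padding check. Hypothetical helper, for illustration."
  (let ((to-read tag-size))
    (dolist (size frame-sizes)
      (decf to-read (+ 6 size)))
    ;; With no padding at all, TO-READ is zero and nothing is left to skip.
    (max 0 (1- to-read))))
```

For a 27-byte tag body holding frames with 2 and 3 bytes of data, (padding-left-to-skip 27 '(2 3)) returns 9: the frames use 17 bytes, 10 bytes of padding remain, and one of those has already been read. When the frames fill the tag exactly, as in (padding-left-to-skip 17 '(2 3)), the result is 0.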
You can use this type to add a frames slot to id3-tag.
(define-binary-class id3-tag ()
  ((identifier (iso-8859-1-string :length 3))
   (major-version u1)
   (revision u1)
   (flags u1)
   (size id3-tag-size)
   (frames (id3-frames :tag-size size))))