Characters(字符)
Common Lisp characters are a distinct type of object from numbers. That’s as it should be—characters are not numbers, and languages that treat them as if they are tend to run into problems when character encodings change, say, from 8-bit ASCII to 21-bit Unicode.11 Because the Common Lisp standard didn’t mandate a particular representation for characters, today several Lisp implementations use Unicode as their “native” character encoding despite Unicode being only a gleam in a standards body’s eye at the time Common Lisp’s own standardization was being wrapped up.
Common Lisp 字符和数字是不同类型的对象。其本该如此——字符不是数字,而将其同等对待的语言当字符编码改变时(比如说从 8 位 ASCII 到 21 位 Unicode)可能会出现问题。由于 Common Lisp 标准并未规定字符的内部表示方法,当今几种 Lisp 实现都使用 Unicode 作为其原生字符编码,尽管从标准化组织的观点来看 Unicode 在 Common Lisp 自身的标准化成型时期只是昙花一现。
The read syntax for characters objects is simple: #\
followed by the desired character. Thus, #\x
is the character x
. Any character can be used after the #\
, including otherwise special characters such as "
, (
, and whitespace. However, writing whitespace characters this way isn’t very (human) readable; an alternative syntax for certain characters is #\
followed by the character’s name. Exactly what names are supported depends on the character set and on the Lisp implementation, but all implementations support the names Space
and Newline
. Thus, you should write #\Space
instead of #\
, though the latter is technically legal. Other semistandard names (that implementations must use if the character set has the appropriate characters) are Tab
, Page
, Rubout
, Linefeed
, Return
, and Backspace
.
字符的读取语法很简单:#\
后跟想要的字符。这样,#\x
就是字符 x
。任何字符都可以用在 #\
之后,包括那些诸如 "
、(
和空格这样的特殊字符。但以这种方式来写空格字符却并不是十分的人类可读,特定字符的替代语法是 #\
后跟该字符的名字。具体支持的名字取决于字符集和所在的 Lisp 实现,但所有实现都支持名字 Space
和 Newline
。这样就应该写成用 #\Space
来代替 #\
,尽管后者在技术上是合法的。其他半标准化的名字(如果字符集包含相应的字符实现就必须采用的名字)是 Tab
、Page
、Rubout
、Linefeed
、Return
和 Backspace
。