This document provides a gentle introduction to the data structures and formats that define the certificates used in HTTPS. It should be accessible to anyone with a little bit of computer science experience and a bit of familiarity with certificates.

An HTTPS certificate is a type of file, like any other file. Its contents follow a format defined by RFC 5280. The definitions are expressed in ASN.1, which is a language used to define file formats or (equivalently) data structures. For instance, in C you might write:

    struct point {
      int x, y;
      char label[10];
    };

In Go you would write:

    type point struct {
      x, y  int
      label string
    }

And in ASN.1 you would write:

    Point ::= SEQUENCE {
      x     INTEGER,
      y     INTEGER,
      label UTF8String
    }

The advantage of writing ASN.1 definitions instead of Go or C definitions is that they are language-independent. You can implement the ASN.1 definition of Point in any language, or (preferably) you can use a tool that takes the ASN.1 definition and automatically generates code implementing it in your favorite language. A set of ASN.1 definitions is called a “module.”

The other important thing about ASN.1 is that it comes with a variety of serialization formats: ways to turn an in-memory data structure into a series of bytes (or a file) and back again. This allows a certificate generated by one machine to be read by a different machine, even if that machine is using a different CPU and operating system.

There are some other languages that do the same things as ASN.1. For instance, Protocol Buffers offer both a language for defining types and a serialization format for encoding objects of the types you’ve defined. Thrift also has both a language and a serialization format. Either Protocol Buffers or Thrift could have just as easily been used to define the format for HTTPS certificates, but ASN.1 (1984) had the significant advantage of already existing when certificates (1988) and HTTPS (1994) were invented.

ASN.1 has been revised multiple times through the years, with editions usually identified by the year they were published. This document aims to teach enough ASN.1 to clearly understand RFC 5280 and other standards related to HTTPS certificates, so we’ll mainly talk about the 1988 edition, with a few notes on features that were added in later editions. You can download the various editions directly from ITU, with the caveat that some are only available to ITU members. The relevant standards are X.680 (defining the ASN.1 language) and X.690 (defining the serialization formats DER and BER). Earlier versions of those standards were X.208 and X.209, respectively.

ASN.1’s main serialization format is “Distinguished Encoding Rules” (DER). They are a variant of “Basic Encoding Rules” (BER) with canonicalization added. For instance, if a type includes a SET OF, the members must be sorted for DER serialization.

A certificate represented in DER is often further encoded into PEM, which uses base64 to encode arbitrary bytes as alphanumeric characters (and ‘+’ and ‘/’) and adds separator lines (“-----BEGIN CERTIFICATE-----” and “-----END CERTIFICATE-----”). PEM is useful because it’s easier to copy-paste.
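To make the DER-to-PEM relationship concrete, here is a minimal Python sketch. The helper names der_to_pem and pem_to_der are my own, and wrapping the base64 body at 64 characters is the conventional choice rather than something this document specifies. The sample bytes are just a tiny DER INTEGER standing in for a real certificate.

```python
import base64


def der_to_pem(der: bytes, label: str = "CERTIFICATE") -> str:
    """Wrap DER bytes in PEM: separator lines around a base64 body."""
    body = base64.b64encode(der).decode("ascii")
    # Break the base64 text into lines of at most 64 characters.
    lines = [body[i:i + 64] for i in range(0, len(body), 64)]
    begin = f"-----BEGIN {label}-----"
    end = f"-----END {label}-----"
    return "\n".join([begin, *lines, end]) + "\n"


def pem_to_der(pem: str) -> bytes:
    """Drop the separator lines and decode the base64 body back to bytes."""
    body = "".join(
        line.strip()
        for line in pem.splitlines()
        if line.strip() and not line.startswith("-----")
    )
    return base64.b64decode(body)


# Not a real certificate: the five bytes of a small DER INTEGER.
sample = bytes.fromhex("0203010001")
pem = der_to_pem(sample)
round_tripped = pem_to_der(pem)
```

The round trip recovers the original bytes exactly, which is the whole point: PEM adds nothing but a text-safe wrapper.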

This document will first describe the types and notation used by ASN.1, and will then describe how objects defined using ASN.1 are encoded. Feel free to flip back and forth between the sections, particularly since some features of the ASN.1 language directly specify encoding details. This document prefers more familiar terms, and so uses “byte” in place of “octet,” and “value” in place of “contents.” It uses “serialization” and “encoding” interchangeably.

The Types

INTEGER

Good old familiar INTEGER. These can be positive or negative. What’s really unusual about ASN.1 INTEGERs is that they can be arbitrarily big. Not enough room in an int64? No problem. This is particularly handy for representing things like an RSA modulus, which is much bigger than an int64 (like 2^2048 big). Technically there is a maximum integer in DER but it’s extraordinarily large: The length of any DER field can be expressed as a series of up to 126 bytes. So the biggest INTEGER you can represent in DER is 256^(2^1008) - 1. For a truly unbounded INTEGER you’d have to encode in BER, which allows indefinitely-long fields.

Strings

ASN.1 has a lot of string types: BMPString, GeneralString, GraphicString, IA5String, ISO646String, NumericString, PrintableString, TeletexString, T61String, UniversalString, UTF8String, VideotexString, and VisibleString. For the purposes of HTTPS certificates you mostly have to care about PrintableString, UTF8String, and IA5String. The string type for a given field is defined by the ASN.1 module that defines the field. For instance:

    CPSuri ::= IA5String

PrintableString is a restricted subset of ASCII, allowing alphanumerics, spaces, and a specific handful of punctuation: ' () + , - . / : = ?. Notably it doesn’t include * or @. There are no storage-size benefits to more restrictive string types.

Some fields, like DirectoryString in RFC 5280, allow the serialization code to choose among multiple string types. Since DER encoding includes the type of string you’re using, make sure that when you encode something as PrintableString it really meets the PrintableString requirements.
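As a rough illustration of that check, here is a small Python sketch that validates a string against the PrintableString alphabet listed above before choosing a tag. The function names are hypothetical; the tag values 0x13 (PrintableString) and 0x0C (UTF8String) are the universal tags listed in the tag table later in this document. Falling back to UTF8String is just one reasonable policy, not a requirement.

```python
import string

# Letters, digits, space, and the punctuation PrintableString allows.
PRINTABLE_STRING_CHARS = frozenset(
    string.ascii_letters + string.digits + " '()+,-./:=?"
)

TAG_PRINTABLE_STRING = 0x13
TAG_UTF8_STRING = 0x0C


def is_printable_string(text: str) -> bool:
    """True if every character is in the PrintableString alphabet."""
    return all(ch in PRINTABLE_STRING_CHARS for ch in text)


def choose_string_tag(text: str) -> int:
    """Use PrintableString only when the text qualifies; else UTF8String."""
    if is_printable_string(text):
        return TAG_PRINTABLE_STRING
    return TAG_UTF8_STRING
```

For example, "Example Org, Inc." qualifies, while "*.example.com" and "user@example.com" do not, because of the * and @.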

IA5String, based on International Alphabet No. 5, is more permissive: It allows nearly any ASCII character, and is used for email addresses, DNS names, and URLs in certificates. Note that there are a few byte values where the IA5 meaning of the byte value is different from the US-ASCII meaning of that same value.

TeletexString, BMPString, and UniversalString are deprecated for use in HTTPS certificates, but you may see them when parsing older CA certificates, which are long-lived and may predate the deprecation.

Strings in ASN.1 are not null-terminated like strings in C and C++. In fact, it’s perfectly legal to have embedded null bytes. This can cause vulnerabilities when two systems interpret the same ASN.1 string differently. For instance, some CAs used to be able to be tricked into issuing for “example.com\0.evil.com” on the basis of ownership of evil.com. Certificate validation libraries at the time treated the result as valid for “example.com”. Be very careful handling ASN.1 strings in C and C++ to avoid creating vulnerabilities.
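Here is a small Python sketch of that mismatch. The function c_string_view is a stand-in I made up to mimic what a NUL-terminated C routine would see; it is not the behavior of any particular library. The second helper shows one defensive option: reject such names outright rather than truncating them.

```python
# The string value exactly as it might appear inside a certificate field.
name = b"example.com\x00.evil.com"


def c_string_view(value: bytes) -> bytes:
    """Mimic a C string routine: stop at the first NUL byte."""
    return value.split(b"\x00", 1)[0]


def is_safe_dns_name(value: bytes) -> bool:
    """Refuse any name containing an embedded NUL byte."""
    return b"\x00" not in value


length_aware = name                  # all 21 bytes, as ASN.1 defines it
nul_terminated = c_string_view(name) # only b"example.com"
```

The two views disagree about what the name is, and that disagreement is the vulnerability.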

Dates and Times

Again, lots of time types: UTCTime, GeneralizedTime, DATE, TIME-OF-DAY, DATE-TIME and DURATION. For HTTPS certificates you only have to care about UTCTime and GeneralizedTime.

UTCTime represents a date and time as YYMMDDhhmm[ss], with an optional timezone offset or “Z” to represent Zulu (aka UTC aka 0 timezone offset). For instance the UTCTimes 820102120000Z and 820102070000-0500 both represent the same time: January 2nd, 1982, at 7am in New York City (UTC-5) and at 12pm in UTC.

Since UTCTime is ambiguous as to whether it’s the 1900’s or 2000’s, RFC 5280 clarifies that it represents dates from 1950 through 2049: a two-digit year of 50 or greater means 19YY, and anything lower means 20YY. RFC 5280 also requires that the “Z” timezone must be used and seconds must be included.
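The RFC 5280 rule for the two-digit year is that values of 50 and above mean 19YY, and values below 50 mean 20YY. A minimal Python sketch of a parser for the RFC 5280 flavor of UTCTime follows; the function name parse_utctime is hypothetical, and it deliberately rejects the offset and no-seconds forms that plain ASN.1 would allow.

```python
from datetime import datetime, timezone


def parse_utctime(value: str) -> datetime:
    """Parse a UTCTime in the RFC 5280 form YYMMDDhhmmssZ."""
    digits = value[:12]
    if (len(value) != 13 or not value.endswith("Z")
            or not (digits.isascii() and digits.isdigit())):
        raise ValueError("expected YYMMDDhhmmssZ")
    yy = int(digits[0:2])
    # Two-digit years 50..99 are 1950..1999; 00..49 are 2000..2049.
    year = 1900 + yy if yy >= 50 else 2000 + yy
    return datetime(
        year,
        int(digits[2:4]),
        int(digits[4:6]),
        int(digits[6:8]),
        int(digits[8:10]),
        int(digits[10:12]),
        tzinfo=timezone.utc,
    )


noon_1982 = parse_utctime("820102120000Z")
```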

GeneralizedTime supports dates in 2050 and later through the simple expedient of representing the year with four digits. It also allows fractional seconds (weirdly, with either a comma or a full stop as the decimal separator). RFC 5280 forbids fractional seconds and requires the “Z.”

OBJECT IDENTIFIER

Object identifiers are globally unique, hierarchical identifiers made of a sequence of integers. They can refer to any kind of “thing,” but are commonly used to identify standards, algorithms, certificate extensions, organizations, or policy documents. As an example: 1.2.840.113549 identifies RSA Security LLC. RSA can then assign OIDs starting with that prefix, like 1.2.840.113549.1.1.11, which identifies sha256WithRSAEncryption, as defined in RFC 8017.

Similarly, 1.3.6.1.4.1.11129 identifies Google, Inc. Google assigned 1.3.6.1.4.1.11129.2.4.2 to identify the SCT list extension used in Certificate Transparency (which was initially developed at Google), as defined in RFC 6962.

The set of child OIDs that can exist under a given prefix is called an “OID arc.” Since the representation of shorter OIDs is smaller, OID assignments under shorter arcs are considered more valuable, particularly for formats where that OID will have to be sent a lot. The OID arc 2.5 is assigned to “Directory Services,” the series of specifications that includes X.509, which HTTPS certificates are based on. A lot of fields in certificates begin with that conveniently short arc. For instance, 2.5.4.6 means “countryName,” while 2.5.4.10 means “organizationName.” Since most certificates have to encode each of those OIDs at least once, it’s handy that they are short.

OIDs in specifications are commonly represented with a human-readable name for convenience, and may be specified by concatenation with another OID. For instance from RFC 8017:

    pkcs-1 OBJECT IDENTIFIER ::= {
      iso(1) member-body(2) us(840) rsadsi(113549) pkcs(1) 1
    }
    ...
    sha256WithRSAEncryption OBJECT IDENTIFIER ::= { pkcs-1 11 }

NULL

NULL is just NULL, ya know?

SEQUENCE and SEQUENCE OF

Don’t let the names fool you: These are two very different types. A SEQUENCE is equivalent to “struct” in most programming languages. It holds a fixed number of fields of different types. For instance, see the Certificate example below.

A SEQUENCE OF, on the other hand, holds an arbitrary number of fields of a single type. This is analogous to an array or a list in a programming language. For instance:

    RDNSequence ::= SEQUENCE OF RelativeDistinguishedName

That could be 0, 1, or 7,000 RelativeDistinguishedNames, in a specific order.

It turns out SEQUENCE and SEQUENCE OF do have one similarity - they are both encoded the same way! More on that in the Encoding section.

SET and SET OF

These are pretty much the same as SEQUENCE and SEQUENCE OF, except that there are intentionally no semantics attached to the ordering of elements in them. However, in encoded form they must be sorted. An example:

    RelativeDistinguishedName ::=
      SET SIZE (1..MAX) OF AttributeTypeAndValue

Note: This example uses the SIZE keyword to additionally specify that RelativeDistinguishedName must have at least one member, but in general a SET or SET OF is allowed to have a size of zero.

BIT STRING and OCTET STRING

These contain arbitrary bits or bytes respectively. These can be used to hold unstructured data, like nonces or hash function output. They can also be used like a void pointer in C or the empty interface type (interface{}) in Go: A way to hold data that does have a structure, but where that structure is understood or defined separately from the type system. For instance, the signature on a certificate is defined as a BIT STRING:

    Certificate ::= SEQUENCE {
      tbsCertificate     TBSCertificate,
      signatureAlgorithm AlgorithmIdentifier,
      signature          BIT STRING }

Later versions of the ASN.1 language allow more detailed specification of the contents inside the BIT STRING (and the same is true of OCTET STRINGs).

CHOICE and ANY

CHOICE is a type that can contain exactly one of the types listed in its definition. For instance, Time can contain exactly one of a UTCTime or a GeneralizedTime:

    Time ::= CHOICE {
      utcTime     UTCTime,
      generalTime GeneralizedTime }

ANY indicates that a value can be of any type. In practice, it is usually constrained by things that can’t quite be expressed in the ASN.1 grammar. For instance:

    AttributeTypeAndValue ::= SEQUENCE {
      type  AttributeType,
      value AttributeValue }
    AttributeType ::= OBJECT IDENTIFIER
    AttributeValue ::= ANY -- DEFINED BY AttributeType

This is particularly useful for extensions, where you want to leave room for additional fields to be defined separately after the main specification is published, so you have a way to register new types (object identifiers), and allow the definitions for those types to specify what the structure of the new fields should be.

Note that ANY is a relic of the 1988 ASN.1 notation. In the 1994 edition, ANY was deprecated and replaced with Information Object Classes, which are a fancy, formalized way of specifying the kind of extension behavior people wanted from ANY. The change is so old by now that the latest ASN.1 specifications (from 2015) don’t even mention ANY. But if you look at the 1994 edition you can see some discussion of the switchover. I include the older syntax here because that’s still what RFC 5280 uses. RFC 5912 uses the 2002 ASN.1 syntax to express the same types from RFC 5280 and several related specifications.

Other Notation

Comments begin with --. Fields of a SEQUENCE or SET can be marked OPTIONAL, or they can be marked DEFAULT foo, which means the same thing as OPTIONAL except that when the field is absent it should be considered to contain “foo.” Types with a length (strings, octet and bit strings, sets and sequences OF things) can be given a SIZE parameter that constrains their length, either to an exact length or to a range.

Types can be constrained to have certain values by using curly braces after the type definition. This example defines that the Version field can have three values, and assigns meaningful names to those values:

    Version ::= INTEGER { v1(0), v2(1), v3(2) }

This is also often used in assigning names to specific OIDs (note this is a single value, with no commas indicating alternate values). Example from RFC 5280.

    id-pkix OBJECT IDENTIFIER ::=
      { iso(1) identified-organization(3) dod(6) internet(1)
        security(5) mechanisms(5) pkix(7) }

You’ll also see [number], IMPLICIT, EXPLICIT, UNIVERSAL, and APPLICATION. These define details of how a value should be encoded, which we’ll talk about below.

The Encoding

ASN.1 is associated with many encodings: BER, DER, PER, XER, and more. Basic Encoding Rules (BER) are fairly flexible. Distinguished Encoding Rules (DER) are a subset of BER with canonicalization rules so there is only one way to express a given structure. Packed Encoding Rules (PER) use fewer bytes to encode things, so they are useful when space or transmission time is at a premium. XML Encoding Rules (XER) are useful when for some reason you want to use XML.

HTTPS certificates are generally encoded in DER. It’s possible to encode them in BER, but since the signature value is calculated over the equivalent DER encoding, not the exact bytes in the certificate, encoding a certificate in BER invites unnecessary trouble. I’ll describe BER, and explain as I go the additional restrictions provided by DER.

I encourage you to read this section with this decoding of a real certificate open in another window.

Type-Length-Value

BER is a type-length-value encoding, just like Protocol Buffers and Thrift. That means that, as you read bytes that are encoded with BER, first you encounter a type, called in ASN.1 a tag. This is a byte, or series of bytes, that tells you what type of thing is encoded: an INTEGER, or a UTF8String, or a structure, or whatever else.

type  length  value
02    03      01 00 01

Next you encounter a length: a number that tells you how many bytes of data you’re going to need to read in order to get the value. Then, of course, come the bytes containing the value itself. As an example, the hex bytes 02 03 01 00 01 would represent an INTEGER (tag 02 corresponds to the INTEGER type), with length 03, and a three-byte value consisting of 01 00 01.

Type-length-value is distinguished from delimited encodings like JSON, CSV, or XML, where instead of knowing the length of a field up front, you read bytes until you hit the expected delimiter (e.g. } in JSON, or </some-tag> in XML).
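Reading one type-length-value triple can be sketched in a few lines of Python. This is a deliberately limited sketch, not a full BER reader: the function name read_tlv is my own, and it assumes a one-byte tag and a single-byte length below 128, which covers the example bytes above.

```python
def read_tlv(data: bytes, offset: int = 0):
    """Read one (tag, value) pair and return it with the next offset.

    Assumes a single-byte tag and a single-byte length under 128.
    """
    if offset + 2 > len(data):
        raise ValueError("truncated tag or length")
    tag = data[offset]
    length = data[offset + 1]
    if length & 0x80:
        raise ValueError("multi-byte lengths are not handled in this sketch")
    start = offset + 2
    end = start + length
    # Never trust the length blindly: make sure the bytes are really there.
    if end > len(data):
        raise ValueError("length runs past the end of the data")
    return tag, data[start:end], end


tag, value, next_offset = read_tlv(bytes.fromhex("0203010001"))
number = int.from_bytes(value, "big")  # fine here since the value is positive
```

Because the length comes first, the reader never has to scan for a delimiter; it knows exactly where the next object starts.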

Tag

The tag is usually one byte. There is a means to encode arbitrarily large tag numbers using multiple bytes (the “high tag number” form), but this is not typically necessary.

Here are some example tags:

Tag (decimal)  Tag (hex)      Type
2              02             INTEGER
3              03             BIT STRING
4              04             OCTET STRING
5              05             NULL
6              06             OBJECT IDENTIFIER
12             0C             UTF8String
16             10 (and 30)*   SEQUENCE and SEQUENCE OF
17             11 (and 31)*   SET and SET OF
19             13             PrintableString
22             16             IA5String
23             17             UTCTime
24             18             GeneralizedTime

These, and a few others I’ve skipped for being boring, are the “universal” tags, because they are specified in the core ASN.1 specification and mean the same thing across all ASN.1 modules.

These tags all happen to be under 31 (0x1F), and that’s for a good reason: Bits 8, 7, and 6 (the high bits of the tag byte) are used to encode extra information, so any universal tag numbers higher than 31 would need to use the “high tag number” form, which takes extra bytes. There are a small handful of universal tags higher than 31, but they’re quite rare.

The two tags marked with a * are always encoded as 0x30 or 0x31, because bit 6 is used to indicate whether a field is Constructed vs Primitive. These tags are always Constructed, so their encoding has bit 6 set to 1. See the Constructed vs Primitive section for details.

Tag Classes

Just because the universal class has used up all the “good” tag numbers, that doesn’t mean we’re out of luck for defining our own tags. There are also the “application,” “private,” and “context-specific” classes. These are distinguished by bits 8 and 7:

Class             Bit 8  Bit 7
Universal         0      0
Application       0      1
Context-specific  1      0
Private           1      1
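Putting the bit layout together, a single tag byte splits into a class (bits 8 and 7), the Constructed bit (bit 6), and a tag number (bits 5 through 1). The following Python sketch pulls those apart; the name describe_tag is hypothetical, and the multi-byte “high tag number” form is rejected rather than handled.

```python
# Indexed by the two high bits of the tag byte (bits 8 and 7).
CLASS_NAMES = ["universal", "application", "context-specific", "private"]


def describe_tag(tag_byte: int):
    """Split a one-byte tag into (class name, constructed flag, number)."""
    if not 0 <= tag_byte <= 0xFF:
        raise ValueError("a tag byte must fit in one byte")
    tag_class = CLASS_NAMES[tag_byte >> 6]   # bits 8 and 7
    constructed = bool(tag_byte & 0x20)      # bit 6
    number = tag_byte & 0x1F                 # bits 5 through 1
    if number == 0x1F:
        raise ValueError("high tag number form is not handled in this sketch")
    return tag_class, constructed, number
```

For instance, describe_tag(0x30) reports a universal, constructed tag number 16 (SEQUENCE), and describe_tag(0x80) reports a context-specific, primitive tag number 0.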

Specifications mostly use tags in the universal class, since they provide the most important building blocks. For instance, the serial number in a certificate is encoded in a plain ol’ INTEGER, tag number 0x02. But sometimes a specification needs to define tags in the context-specific class to disambiguate entries in a SET or SEQUENCE that defines optional entries, or to disambiguate a CHOICE with multiple entries that have the same type. For instance, take this definition:

    Point ::= SEQUENCE {
      x INTEGER OPTIONAL,
      y INTEGER OPTIONAL
    }

Since OPTIONAL fields are omitted entirely from the encoding when they’re not present, it would be impossible to distinguish a Point with only an x coordinate from a Point with only a y coordinate. For instance you’d encode a Point with only an x coordinate of 9 like so (30 means SEQUENCE here):

    30 03 02 01 09

That’s a SEQUENCE of length 3 (bytes), containing an INTEGER of length 1, which has the value 9. But you’d also encode a Point with a y coordinate of 9 exactly the same way, so there is ambiguity.

Encoding Instructions

To resolve this ambiguity, a specification needs to provide encoding instructions that assign a unique tag to each entry. And because we’re not allowed to stomp on the UNIVERSAL tags, we have to use one of the others, for instance APPLICATION:

    Point ::= SEQUENCE {
      x [APPLICATION 0] INTEGER OPTIONAL,
      y [APPLICATION 1] INTEGER OPTIONAL
    }

Though for this use case, it’s actually much more common to use the context-specific class, which is represented by a number in brackets by itself:

    Point ::= SEQUENCE {
      x [0] INTEGER OPTIONAL,
      y [1] INTEGER OPTIONAL
    }

So now, to encode a Point with just an x coordinate of 9, instead of encoding x as a UNIVERSAL INTEGER, you’d set bits 8 and 7 of the encoded tag to (1, 0) to indicate the context-specific class, and set the low bits to 0, giving this encoding:

    30 03 80 01 09

And to represent a Point with just a y coordinate of 9, you’d do the same thing, except you’d set the low bits to 1:

    30 03 81 01 09

Or you could represent a Point with x and y coordinate both equal to 9:

    30 06 80 01 09 81 01 09
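The three encodings above can be reproduced with a short Python sketch. The function names are hypothetical, and to keep things small it only accepts coordinates from 0 to 127, so that each INTEGER value fits in one byte and every length fits in the short form.

```python
from typing import Optional


def small_int_field(tag: int, value: int) -> bytes:
    """Encode a one-byte INTEGER value under the given tag byte."""
    if not 0 <= value <= 127:
        raise ValueError("this sketch only handles values 0..127")
    return bytes([tag, 0x01, value])


def encode_point(x: Optional[int] = None, y: Optional[int] = None) -> bytes:
    """Encode Point, omitting absent OPTIONAL fields entirely."""
    body = b""
    if x is not None:
        body += small_int_field(0x80, x)  # context-specific tag [0]
    if y is not None:
        body += small_int_field(0x81, y)  # context-specific tag [1]
    return bytes([0x30, len(body)]) + body  # 0x30: constructed SEQUENCE


only_x = encode_point(x=9)
only_y = encode_point(y=9)
both = encode_point(x=9, y=9)
```

Now only_x and only_y differ in their third byte, so a decoder can tell which coordinate it is looking at.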

Length

The length in the tag-length-value tuple always represents the total number of bytes in the object including all sub-objects. So a SEQUENCE with one field doesn’t have a length of 1; it has a length of however many bytes the encoded form of that field takes up.

The encoding of length can take two forms: short or long. The short form is a single byte, between 0 and 127.

The long form is at least two bytes long, and has bit 8 of the first byte set to 1. Bits 7-1 of the first byte indicate how many more bytes are in the length field itself. Then the remaining bytes specify the length itself, as a multi-byte integer.

As you can imagine, this allows very long values. The longest possible length would start with the byte 254 (a length byte of 255 is reserved for future extensions), specifying that 126 more bytes would follow in the length field alone. If each of those 126 bytes was 255, that would indicate 2^1008 - 1 bytes to follow in the value field.

The long form allows you to encode the same length multiple ways - for instance by using two bytes to express a length that could fit in one, or by using long form to express a length that could fit in the short form. DER says to always use the smallest possible length representation.

Safety warning: Don’t fully trust the length values that you decode! For instance, check that the encoded length is less than the amount of data available from the stream being decoded.
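Here is a Python sketch of the short and long forms, following the DER rule of always using the smallest representation. The names encode_length and decode_length are my own. The decoder rejects the indefinite form and non-minimal encodings, and applies the safety check just described by comparing the decoded length against the bytes actually available.

```python
def encode_length(n: int) -> bytes:
    """Encode a length using the shortest form, as DER requires."""
    if n < 0:
        raise ValueError("length cannot be negative")
    if n < 0x80:
        return bytes([n])  # short form: one byte, 0..127
    body = n.to_bytes((n.bit_length() + 7) // 8, "big")
    if len(body) > 126:
        raise ValueError("length too large to encode")
    return bytes([0x80 | len(body)]) + body  # long form


def decode_length(data: bytes, offset: int = 0):
    """Return (length, offset of the value), enforcing DER minimality."""
    if offset >= len(data):
        raise ValueError("missing length byte")
    first = data[offset]
    if first < 0x80:
        length, value_start = first, offset + 1
    else:
        count = first & 0x7F
        if count == 0:
            raise ValueError("indefinite length is not allowed in DER")
        if count == 0x7F:
            raise ValueError("length byte 0xFF is reserved")
        raw = data[offset + 1:offset + 1 + count]
        if len(raw) != count:
            raise ValueError("truncated length field")
        length = int.from_bytes(raw, "big")
        if raw[0] == 0 or length < 0x80:
            raise ValueError("length is not in its shortest form")
        value_start = offset + 1 + count
    # Do not trust the claimed length beyond what is actually present.
    if length > len(data) - value_start:
        raise ValueError("length exceeds available data")
    return length, value_start
```

A length of 300, for example, becomes the three bytes 82 01 2c: one byte announcing that two length bytes follow, then 0x012c.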

Indefinite length

It’s also possible, in BER, to encode a string, SEQUENCE, SEQUENCE OF, SET, or SET OF where you don’t know the length in advance (for instance when streaming output). To do this, you encode the length as a single byte with the value 0x80, and encode the value as a series of encoded objects concatenated together, with the end indicated by the two bytes 00 00 (which can be considered as a zero-length object with tag 0). So, for instance, the indefinite length encoding of a UTF8String would be the encoding of one or more UTF8Strings concatenated together, and concatenated finally with 00 00.

Indefinite-ness can be arbitrarily nested! So, for example, the UTF8Strings that you concatenate together to form an indefinite-length UTF8String can themselves be encoded either with definite length or indefinite length.

A length byte of 0x80 is distinguishable because it’s not a valid short form or long form length. Since bit 8 is set to 1, this would normally be interpreted as the long form, but the remaining bits are supposed to indicate the number of additional bytes that make up the length. Since bits 7-1 are all 0, that would indicate a long-form encoding with zero bytes making up the length, which is not allowed.

DER forbids indefinite length encoding. You must use the definite length encoding (that is, with the length specified at the beginning).

Constructed vs Primitive

Bit 6 of the first tag byte is used to indicate whether the value is encoded in primitive form or constructed form. Primitive encoding represents the value directly - for instance, in a UTF8String the value would consist solely of the string itself, in UTF-8 bytes. Constructed encoding represents the value as a concatenation of other encoded values. For instance, as described in the “Indefinite length” section, a UTF8String in constructed encoding would consist of multiple encoded UTF8Strings (each with a tag and length), concatenated together. The length of the overall UTF8String would be the total length, in bytes, of all those concatenated encoded values. Constructed encoding can use either definite or indefinite length. Primitive encoding always uses definite length, because there’s no way to express indefinite length without using constructed encoding.

INTEGER, OBJECT IDENTIFIER, and NULL must use primitive encoding. SEQUENCE, SEQUENCE OF, SET, and SET OF must use constructed encoding (because they are inherently concatenations of multiple values). In BER, BIT STRING, OCTET STRING, UTCTime, GeneralizedTime, and the various string types can use either primitive encoding or constructed encoding, at the sender’s discretion. However, in DER all types that have an encoding choice between primitive and constructed must use the primitive encoding.

EXPLICIT vs IMPLICIT

The encoding instructions described above, e.g. [1], or [APPLICATION 8], can also include the keyword EXPLICIT or IMPLICIT (example from RFC 5280):

    TBSCertificate ::= SEQUENCE {
      version              [0] Version DEFAULT v1,
      serialNumber         CertificateSerialNumber,
      signature            AlgorithmIdentifier,
      issuer               Name,
      validity             Validity,
      subject              Name,
      subjectPublicKeyInfo SubjectPublicKeyInfo,
      issuerUniqueID       [1] IMPLICIT UniqueIdentifier OPTIONAL,
                           -- If present, version MUST be v2 or v3
      subjectUniqueID      [2] IMPLICIT UniqueIdentifier OPTIONAL,
                           -- If present, version MUST be v2 or v3
      extensions           [3] Extensions OPTIONAL
                           -- If present, version MUST be v3 -- }

This defines how the tag should be encoded; it doesn’t have to do with whether the tag number is explicitly assigned or not (since both IMPLICIT and EXPLICIT always go alongside a specific tag number). IMPLICIT encodes the field just like the underlying type, but with the tag number and class provided in the ASN.1 module. EXPLICIT encodes the field as the underlying type, and then wraps that in an outer encoding. The outer encoding has the tag number and class from the ASN.1 module and additionally has the Constructed bit set.

Here’s an example ASN.1 encoding instruction using IMPLICIT:

    [5] IMPLICIT UTF8String

This would encode “hi” as:

    85 02 68 69

Compare to this ASN.1 encoding instruction using EXPLICIT:

    [5] EXPLICIT UTF8String

This would encode “hi” as:

    A5 04 0C 02 68 69
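The difference between the two is easy to see in code. In this Python sketch (function names are my own, context-specific class only, short-form lengths only), implicit tagging swaps the tag byte of the underlying encoding, while explicit tagging wraps the whole underlying encoding in a new constructed object.

```python
def tlv(tag: int, value: bytes) -> bytes:
    """Build a tag-length-value object (short-form length only)."""
    if len(value) > 127:
        raise ValueError("this sketch only handles short-form lengths")
    return bytes([tag, len(value)]) + value


def implicit(tag_number: int, inner: bytes) -> bytes:
    """Replace the inner tag with context-specific [tag_number]."""
    constructed_bit = inner[0] & 0x20  # keep primitive/constructed as-is
    return bytes([0x80 | constructed_bit | tag_number]) + inner[1:]


def explicit(tag_number: int, inner: bytes) -> bytes:
    """Wrap the inner encoding in a constructed [tag_number] object."""
    return tlv(0xA0 | tag_number, inner)  # 0xA0: context-specific + constructed


utf8_hi = tlv(0x0C, "hi".encode("utf-8"))  # plain UTF8String: 0c 02 68 69
implicit_hi = implicit(5, utf8_hi)
explicit_hi = explicit(5, utf8_hi)
```

The implicit form loses the information that the value was a UTF8String (the decoder has to know that from the module), and in exchange it saves two bytes.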

When the IMPLICIT or EXPLICIT keyword is not present, the default is EXPLICIT, unless the module sets a different default at the top with “EXPLICIT TAGS,” “IMPLICIT TAGS,” or “AUTOMATIC TAGS.” For instance, RFC 5280 defines two modules, one where EXPLICIT tags are the default, and a second one that imports the first, and has IMPLICIT tags as the default. Implicit encoding uses fewer bytes than explicit encoding.

AUTOMATIC TAGS is the same as IMPLICIT TAGS, but with the additional property that tag numbers ([0], [1], etc.) are automatically assigned in places that need them, like SEQUENCEs with optional fields.

Encoding of specific types

In this section we’ll talk about how the value of each type is encoded, with examples.

INTEGER encoding

Integers are encoded as one or more bytes, in two’s complement with the high bit (bit 8) of the leftmost byte as the sign bit. As the BER specification says:

The value of a two’s complement binary number is derived by numbering the bits in the contents octets, starting with bit 1 of the last octet as bit zero and ending the numbering with bit 8 of the first octet. Each bit is assigned a numerical value of 2^N, where N is its position in the above numbering sequence. The value of the two’s complement binary number is obtained by summing the numerical values assigned to each bit for those bits which are set to one, excluding bit 8 of the first octet, and then reducing this value by the numerical value assigned to bit 8 of the first octet if that bit is set to one.

So for instance this one-byte value (represented in binary) encodes decimal 50:

00110010 (== decimal 50)

This one-byte value (represented in binary) encodes decimal -100:

10011100 (== decimal -100)

This five-byte value (represented in binary) encodes decimal -549755813887 (i.e. -2^39 + 1):

10000000 00000000 00000000 00000000 00000001 (== decimal -549755813887)

BER and DER both require that integers be represented in the shortest form possible. That is enforced with this rule:

    ... the bits of the first octet and bit 8 of the second octet:
    1. shall not all be ones; and
    2. shall not all be zero.

Rule (2) roughly means: if there are leading zero bytes in the encoding you could just as well leave them off and have the same number. Bit 8 of the second byte is important here too because if you want to represent certain values, you must use a leading zero byte. For instance, decimal 255 is encoded as two bytes:

00000000 11111111

That’s because a single-byte encoding of 11111111 by itself means -1 (bit 8 is treated as the sign bit).

Rule (1) is best explained with an example. Decimal -128 is encoded as:

10000000 (== decimal -128)

However, that could also be encoded as:

11111111 10000000 (== decimal -128, but an invalid encoding)

Expanding that out, it’s -2^15 + 2^14 + 2^13 + 2^12 + 2^11 + 2^10 + 2^9 + 2^8 + 2^7 == -2^7 == -128. Note that the 1 in “10000000” was a sign bit in the single-byte encoding, but means 2^7 in the two-byte encoding.

This is a generic transform: For any negative number encoded as BER (or DER) you could prefix it with 11111111 and get the same number. This is called sign extension. Or equivalently, if there’s a negative number where the encoding of the value begins with 11111111, you could remove that byte and still have the same number. So BER and DER require the shortest encoding.

The two’s complement encoding of INTEGERs has practical impact in certificate issuance: RFC 5280 requires that serial numbers be positive. Since the first bit is always a sign bit, that means serial numbers encoded in DER as 8 bytes can be at most 63 bits long. Encoding a 64-bit positive serial number requires a 9-byte encoded value (with the first byte being zero).

Here’s the encoding of an INTEGER with the value 2^63 + 1 (which happens to be a 64-bit positive number):

    02 09 00 80 00 00 00 00 00 00 01
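The shortest-form rule falls out naturally if you compute the minimum number of bytes whose two’s complement range contains the number. Here is a Python sketch; the names encode_integer and decode_integer are hypothetical, and it only handles values short enough for a single-byte length.

```python
def encode_integer(n: int) -> bytes:
    """Encode n as a DER INTEGER in the shortest two's complement form."""
    # Bits needed for the magnitude; one more bit is needed for the sign.
    magnitude_bits = (n if n >= 0 else ~n).bit_length()
    length = magnitude_bits // 8 + 1
    if length > 127:
        raise ValueError("this sketch only handles short-form lengths")
    return bytes([0x02, length]) + n.to_bytes(length, "big", signed=True)


def decode_integer(encoded: bytes) -> int:
    """Decode a DER INTEGER produced by encode_integer."""
    if len(encoded) < 3 or encoded[0] != 0x02:
        raise ValueError("not an INTEGER")
    if encoded[1] != len(encoded) - 2:
        raise ValueError("length does not match the data")
    return int.from_bytes(encoded[2:], "big", signed=True)


big = encode_integer(2**63 + 1)
```

Running it on the examples in this section gives 02 01 32 for 50, 02 01 9c for -100, 02 02 00 ff for 255, and 02 01 80 for -128.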

String encoding

Strings are encoded as their literal bytes. Since IA5String and PrintableString just define different subsets of acceptable characters, their encodings differ only by tag.

A PrintableString containing “hi”:

    13 02 68 69

An IA5String containing “hi”:

    16 02 68 69

UTF8Strings are the same, but can encode a wider variety of characters. For instance, this is the encoding of a UTF8String containing U+1F60E Smiling Face With Sunglasses (😎):

    0c 04 f0 9f 98 8e

Date and Time encoding

UTCTime and GeneralizedTime are actually encoded like strings, surprisingly! As described above in the “Types” section, UTCTime represents dates in the format YYMMDDhhmmss. GeneralizedTime uses a four-digit year YYYY in place of YY. Both have an optional timezone offset or “Z” (Zulu) to indicate no timezone offset from UTC.

For instance, December 15, 2019 at 19:02:10 in the PST time zone (UTC-8) is represented in a UTCTime as: 191215190210-0800. Encoded in BER, that’s:

    17 11 31 39 31 32 31 35 31 39 30 32 31 30 2d 30 38 30 30

For BER encoding, seconds are optional in both UTCTime and GeneralizedTime, and timezone offsets are allowed. However, DER (along with RFC 5280) specifies that seconds must be present, fractional seconds must not be present, and the time must be expressed as UTC with the “Z” form.

The above date would be encoded in DER as:

    17 0d 31 39 31 32 31 36 30 33 30 32 31 30 5a
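Converting a local time into the DER form is mostly a matter of shifting to UTC before formatting. A minimal Python sketch, assuming a timezone-aware datetime as input (the function name der_utctime is my own):

```python
from datetime import datetime, timedelta, timezone


def der_utctime(moment: datetime) -> bytes:
    """Encode a timezone-aware datetime as a DER UTCTime."""
    if moment.tzinfo is None:
        raise ValueError("a timezone-aware datetime is required")
    utc = moment.astimezone(timezone.utc)
    if not 1950 <= utc.year <= 2049:
        raise ValueError("UTCTime under RFC 5280 only covers 1950..2049")
    # Always YYMMDDhhmmss followed by "Z"; any fractional seconds are dropped.
    text = utc.strftime("%y%m%d%H%M%SZ").encode("ascii")
    return bytes([0x17, len(text)]) + text


pst = timezone(timedelta(hours=-8))
encoded = der_utctime(datetime(2019, 12, 15, 19, 2, 10, tzinfo=pst))
```

Note that the date rolls over to December 16 once the eight-hour offset is applied, which is why the encoded text reads 191216030210Z.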

OBJECT IDENTIFIER encoding

As described above, OIDs are conceptually a series of integers. They are always at least two components long. The first component is always 0, 1, or 2. When the first component is 0 or 1, the second component is always less than 40. Because of this, the first two components are unambiguously represented as 40*X+Y, where X is the first component and Y is the second.

So, for instance, to encode 2.999.3, you would combine the first two components into 1079 decimal (40*2 + 999), which would give you “1079.3”.

After applying that transform, each component is encoded in base 128, with the most significant byte first. Bit 8 is set to “1” in every byte except the last in a component; that’s how you know when one component is done and the next one begins. So the component “3” would be represented simply as the byte 0x03. The component “129” would be represented as the bytes 0x81 0x01. Once encoded, all the components of an OID are concatenated together to form the encoded value of the OID.

OIDs must be represented in the fewest bytes possible, whether in BER or DER. So components cannot begin with the byte 0x80.

As an example, the OID 1.2.840.113549.1.1.11 (representing sha256WithRSAEncryption) is encoded like so:

    06 09 2a 86 48 86 f7 0d 01 01 0b
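The steps above (merge the first two components, then base-128 encode each component with a continuation bit) can be sketched in Python as follows. The function names are hypothetical, and the sketch assumes the encoded value is under 128 bytes so a short-form length suffices.

```python
def encode_base128(n: int) -> bytes:
    """Encode n in base 128, most significant group first.

    Bit 8 is set on every byte except the last one.
    """
    if n < 0:
        raise ValueError("OID components cannot be negative")
    groups = [n & 0x7F]  # the final byte has bit 8 clear
    n >>= 7
    while n:
        groups.append(0x80 | (n & 0x7F))
        n >>= 7
    return bytes(reversed(groups))


def encode_oid(dotted: str) -> bytes:
    """Encode a dotted-decimal OID as a DER OBJECT IDENTIFIER."""
    parts = [int(p) for p in dotted.split(".")]
    if len(parts) < 2:
        raise ValueError("an OID needs at least two components")
    first, second = parts[0], parts[1]
    if first not in (0, 1, 2) or (first < 2 and not 0 <= second < 40):
        raise ValueError("invalid first or second component")
    body = encode_base128(40 * first + second)
    for component in parts[2:]:
        body += encode_base128(component)
    if len(body) > 127:
        raise ValueError("this sketch only handles short-form lengths")
    return bytes([0x06, len(body)]) + body


sha256_with_rsa = encode_oid("1.2.840.113549.1.1.11")
```

Because encode_base128 never emits a leading zero group, the result never begins a component with 0x80, which satisfies the fewest-bytes rule.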

NULL encoding

The value of an object containing NULL is always zero-length, so the encoding of NULL is always just the tag and a length field of zero:

  05 00

SEQUENCE encoding

The first thing to know about SEQUENCE is that it always uses Constructed encoding because it contains other objects. In other words, the value bytes of a SEQUENCE contain the concatenation of the encoded fields of that SEQUENCE (in the order those fields were defined). This also means that bit 6 of a SEQUENCE’s tag (the Constructed vs Primitive bit) is always set to 1. So even though the tag number for SEQUENCE is technically 0x10, its tag byte, once encoded, is always 0x30.

When there are fields in a SEQUENCE with the OPTIONAL annotation, they are simply omitted from the encoding if not present. As a decoder processes elements of the SEQUENCE, it can figure out which type is being decoded based on what’s been decoded so far, and the tag bytes it reads. If there is ambiguity, for instance when elements have the same type, the ASN.1 module must specify encoding instructions that assign distinct tag numbers to the elements.

DEFAULT fields are similar to OPTIONAL ones. If a field’s value is the default, it may be omitted from the BER encoding. In the DER encoding, it MUST be omitted.

As an example, RFC 5280 defines AlgorithmIdentifier as a SEQUENCE:

  AlgorithmIdentifier ::= SEQUENCE {
      algorithm OBJECT IDENTIFIER,
      parameters ANY DEFINED BY algorithm OPTIONAL }

Here’s the encoding of the AlgorithmIdentifier containing 1.2.840.113549.1.1.11. RFC 8017 says “parameters” should have the type NULL for this algorithm.

  30 0d 06 09 2a 86 48 86 f7 0d 01 01 0b 05 00
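To make the “concatenate the encoded fields” idea concrete, here is a small Python sketch that builds the same AlgorithmIdentifier. The helpers der_length and tlv are hypothetical names for this example; the OID value bytes are copied from the encoding shown earlier rather than computed.

```python
def der_length(n: int) -> bytes:
    """Encode a length in DER: short form below 128, long form otherwise."""
    if n < 0x80:
        return bytes([n])
    body = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return bytes([0x80 | len(body)]) + body

def tlv(tag: int, value: bytes) -> bytes:
    """Glue together a single tag byte, a DER length, and the value bytes."""
    return bytes([tag]) + der_length(len(value)) + value

# Value bytes of the OID 1.2.840.113549.1.1.11 (sha256WithRSAEncryption).
SHA256_WITH_RSA = bytes.fromhex("2a864886f70d01010b")

algorithm = tlv(0x06, SHA256_WITH_RSA)   # OBJECT IDENTIFIER
parameters = tlv(0x05, b"")              # NULL: tag plus a zero length
# A SEQUENCE's value is its encoded fields, in definition order.
algorithm_identifier = tlv(0x30, algorithm + parameters)

print(algorithm_identifier.hex(" "))
# 30 0d 06 09 2a 86 48 86 f7 0d 01 01 0b 05 00
```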

SEQUENCE OF encoding

A SEQUENCE OF is encoded in exactly the same way as a SEQUENCE. It even uses the same tag! If you’re decoding, the only way you can tell the difference between a SEQUENCE and a SEQUENCE OF is by reference to the ASN.1 module.

Here is the encoding of a SEQUENCE OF INTEGER containing the numbers 7, 8, and 9:

  30 09 02 01 07 02 01 08 02 01 09
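Going the other direction, a decoder walks the value bytes of the SEQUENCE OF one element at a time. The following Python sketch is deliberately limited: read_tlv and decode_sequence_of_integer are made-up names, only short-form lengths and single-byte tags are handled, and anything unexpected is rejected rather than guessed at.

```python
def read_tlv(data: bytes, offset: int):
    """Return (tag, value, next_offset) for the element starting at offset."""
    if offset + 2 > len(data):
        raise ValueError("truncated header")
    tag, length = data[offset], data[offset + 1]
    if length & 0x80:
        raise ValueError("long-form lengths are not handled in this sketch")
    start, end = offset + 2, offset + 2 + length
    if end > len(data):
        raise ValueError("value runs past the end of the input")
    return tag, data[start:end], end

def decode_sequence_of_integer(data: bytes):
    """Decode a SEQUENCE OF INTEGER into a list of Python ints."""
    tag, body, end = read_tlv(data, 0)
    if tag != 0x30 or end != len(data):
        raise ValueError("expected exactly one SEQUENCE")
    numbers = []
    offset = 0
    while offset < len(body):
        tag, value, offset = read_tlv(body, offset)
        if tag != 0x02 or not value:
            raise ValueError("expected a non-empty INTEGER")
        # INTEGER values are big-endian two's complement.
        numbers.append(int.from_bytes(value, "big", signed=True))
    return numbers

print(decode_sequence_of_integer(bytes.fromhex("3009020107020108020109")))
# [7, 8, 9]
```

Every length is checked against the remaining input before any bytes are sliced, which is the kind of care the Safety section below calls for.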

SET encoding

Like SEQUENCE, a SET is Constructed, meaning that its value bytes are the concatenation of its encoded fields. Its tag number is 0x11. Since the Constructed vs Primitive bit (bit 6) is always set to 1, that means it’s encoded with a tag byte of 0x31.

The encoding of a SET, like a SEQUENCE, omits OPTIONAL and DEFAULT fields if they are absent or have the default value. Any ambiguity that results due to fields with the same type must be resolved by the ASN.1 module, and DEFAULT fields MUST be omitted from DER encoding if they have the default value.

In BER, a SET may be encoded in any order. In DER, a SET must be encoded in ascending order by tag.

SET OF encoding

A SET OF items is encoded the same way as a SET, including the tag byte of 0x31. For DER encoding, there is a similar requirement that the SET OF must be encoded in ascending order. Because all elements in the SET OF have the same type, ordering by tag is not sufficient. So the elements of a SET OF are sorted by their encoded values, with shorter values treated as if they were padded to the right with zeroes.
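The DER ordering rule for SET OF can be sketched in Python as follows. The function name der_set_of and the sample values (9, 300, and 7 as INTEGERs) are chosen for this example only, and the sketch assumes the elements are already DER-encoded and that the total fits in a short-form length.

```python
def der_set_of(encoded_elements):
    """Build a DER SET OF from already-encoded elements (hypothetical helper)."""
    width = max((len(e) for e in encoded_elements), default=0)
    # Compare encodings as if shorter ones were right-padded with zero bytes.
    ordered = sorted(encoded_elements, key=lambda e: e.ljust(width, b"\x00"))
    body = b"".join(ordered)
    if len(body) > 127:
        raise ValueError("long-form lengths are not handled in this sketch")
    return bytes([0x31, len(body)]) + body

elements = [
    bytes.fromhex("020109"),    # INTEGER 9
    bytes.fromhex("0202012c"),  # INTEGER 300
    bytes.fromhex("020107"),    # INTEGER 7
]
print(der_set_of(elements).hex(" "))
# 31 0a 02 01 07 02 01 09 02 02 01 2c
```

The sort is over the full encoded bytes (tag and length included), not over the numeric values; here the two orderings happen to agree.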

BIT STRING encoding

A BIT STRING of N bits is encoded as N/8 bytes (rounded up), with a one-byte prefix that contains the “number of unused bits,” for clarity when the number of bits is not a multiple of 8. For instance, when encoding the bit string 011011100101110111 (18 bits), we need at least three bytes. But that’s somewhat more than we need: it gives us capacity for 24 bits total. Six of those bits will be unused. Those six bits are written at the rightmost end of the bit string, so this is encoded as:

  03 04 06 6e 5d c0

In BER, the unused bits can have any value, so the last byte of that encoding could just as well be c1, c2, c3, and so on. In DER, the unused bits must all be zero.
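As a rough Python sketch of the steps above, the following function pads a string of 0s and 1s out to a whole number of bytes and prepends the unused-bit count. The name der_bit_string is hypothetical, and the sketch only covers values short enough for a short-form length.

```python
def der_bit_string(bits: str) -> bytes:
    """Encode a string such as "0110" as a DER BIT STRING (hypothetical helper)."""
    if any(b not in "01" for b in bits):
        raise ValueError("expected only the characters 0 and 1")
    # Number of padding bits needed to reach a multiple of 8.
    unused = (-len(bits)) % 8
    # DER requires the unused bits to be zero.
    padded = bits + "0" * unused
    payload = bytes(int(padded[i:i + 8], 2) for i in range(0, len(padded), 8))
    value = bytes([unused]) + payload
    if len(value) > 127:
        raise ValueError("long-form lengths are not handled in this sketch")
    return bytes([0x03, len(value)]) + value

print(der_bit_string("011011100101110111").hex(" "))  # 03 04 06 6e 5d c0
```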

OCTET STRING encoding

An OCTET STRING is encoded as the bytes it contains. Here’s an example of an OCTET STRING containing the bytes 03, 02, 06, and A0:

  04 04 03 02 06 A0

CHOICE and ANY encoding

A CHOICE or ANY field is encoded as whatever type it actually holds, unless modified by encoding instructions. So if a CHOICE field in an ASN.1 specification allows an INTEGER or a UTCTime, and the specific object being encoded contains an INTEGER, then it is encoded as an INTEGER.

In practice, CHOICE fields very often have encoding instructions. For instance, consider this example from RFC 5280, where the encoding instructions are necessary to distinguish rfc822Name from dNSName, since they both have the underlying type IA5String:

  GeneralName ::= CHOICE {
      otherName [0] OtherName,
      rfc822Name [1] IA5String,
      dNSName [2] IA5String,
      x400Address [3] ORAddress,
      directoryName [4] Name,
      ediPartyName [5] EDIPartyName,
      uniformResourceIdentifier [6] IA5String,
      iPAddress [7] OCTET STRING,
      registeredID [8] OBJECT IDENTIFIER }

Here’s an example encoding of a GeneralName containing the rfc822Name a@example.com. Recall that [1] means to use tag number 1, in the tag class “context-specific” (bit 8 set to 1), with the IMPLICIT tag encoding method:

  81 0d 61 40 65 78 61 6d 70 6c 65 2e 63 6f 6d

Here’s an example encoding of a GeneralName containing the dNSName “example.com”:

  82 0b 65 78 61 6d 70 6c 65 2e 63 6f 6d
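Both encodings follow the same recipe, which can be sketched in Python like this. The function name implicit_context_tag is invented for this example; it assumes a primitive underlying type (such as IA5String), a tag number of 30 or less, and a value short enough for a short-form length.

```python
def implicit_context_tag(number: int, value: bytes) -> bytes:
    """Encode value under an IMPLICIT context-specific tag (hypothetical helper)."""
    if not 0 <= number <= 30:
        raise ValueError("only single-byte tags are handled in this sketch")
    if len(value) > 127:
        raise ValueError("long-form lengths are not handled in this sketch")
    # 0x80 sets bit 8 (context-specific class); bit 6 stays clear (Primitive).
    # With IMPLICIT tagging, this tag replaces the IA5String tag entirely.
    return bytes([0x80 | number, len(value)]) + value

rfc822_name = implicit_context_tag(1, "a@example.com".encode("ascii"))
dns_name = implicit_context_tag(2, "example.com".encode("ascii"))

print(rfc822_name.hex(" "))  # 81 0d 61 40 65 78 61 6d 70 6c 65 2e 63 6f 6d
print(dns_name.hex(" "))     # 82 0b 65 78 61 6d 70 6c 65 2e 63 6f 6d
```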

Safety

It’s important to be very careful decoding BER and DER, particularly in non-memory-safe languages like C and C++. There’s a long history of vulnerabilities in decoders. Parsing input in general is a common source of vulnerabilities, and the ASN.1 encoding formats in particular seem to be vulnerability magnets. They are complicated formats, with many variable-length fields. Even the lengths have variable lengths! Also, ASN.1 input is often attacker-controlled. If you have to parse a certificate in order to distinguish an authorized user from an unauthorized one, you have to assume that some of the time you will be parsing, not a certificate, but some bizarre input crafted to exploit bugs in your ASN.1 code.

To avoid these problems, it is best to use a memory-safe language whenever possible. And whether you can use a memory-safe language or not, it’s best to use an ASN.1 compiler to generate your parsing code rather than writing it from scratch.

Acknowledgements

I owe a significant debt to A Layman’s Guide to a Subset of ASN.1, DER, and BER, which is a big part of how I learned these topics. I’d also like to thank the authors of A warm welcome to DNS, which is a great read and inspired the tone of this document.

A Little Bonus

Have you ever noticed that a PEM-encoded certificate always starts with “MII”? For instance:

  -----BEGIN CERTIFICATE-----
  MIIFajCCBFKgAwIBAgISA6HJW9qjaoJoMn8iU8vTuiQ2MA0GCSqGSIb3DQEBCwUA
  ...

Now you know enough to explain why! A Certificate is a SEQUENCE, so it will start with the byte 0x30. The next bytes are the length field. Certificates are almost always more than 127 bytes, so the length field has to use the long form of the length. That means the first byte of the length field will be 0x80 + N, where N is the number of length bytes to follow. N is almost always 2, since that’s how many bytes it takes to encode lengths from 256 to 65535, and almost all certificates have lengths in that range.

So now we know that the first two bytes of the DER encoding of a certificate are 0x30 0x82. PEM encoding uses base64, which encodes 3 bytes of binary input into 4 ASCII characters of output. Or, to put it differently: base64 turns 24 bits of binary input into 4 ASCII characters of output, with 6 bits of the input assigned to each character. We know what the first 16 bits of every certificate will be. To prove that the first characters of (almost) every certificate will be “MII”, we need to look at the next 2 bits. Those will be the most significant bits of the most significant byte of the two length bytes. Will those bits ever be set to 1? Not unless the certificate is more than 16,383 bytes long! So we can predict that the first characters of a PEM certificate will almost always be the same. Try it yourself:

  xxd -r -p <<<308200 | base64
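The same experiment can be run in Python across a range of lengths. This is just a sketch: pem_prefix is a made-up helper that builds only the SEQUENCE tag and a two-byte long-form length (no actual certificate contents) and reports the first three base64 characters.

```python
import base64

def pem_prefix(content_length: int) -> str:
    """First three base64 characters for a SEQUENCE header with two length bytes."""
    if not 256 <= content_length <= 65535:
        raise ValueError("this sketch only covers two-byte lengths")
    # Tag 0x30 (SEQUENCE), 0x82 (long form, two length bytes), then the length.
    header = bytes([0x30, 0x82]) + content_length.to_bytes(2, "big")
    # The first three bytes fully determine the first three base64 characters.
    return base64.b64encode(header[:3]).decode("ascii")[:3]

print(pem_prefix(1386))   # MII
print(pem_prefix(16383))  # MII
print(pem_prefix(16384))  # MIJ
```

The prefix only changes once the length reaches 16,384, when the top two bits of the first length byte are no longer both zero.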