Numeric literals

Numeric literals have the form:

  hexdigit = digit | 'A'..'F' | 'a'..'f'
  octdigit = '0'..'7'
  bindigit = '0'..'1'
  unary_minus = '-' # See the section about unary minus
  HEX_LIT = unary_minus? '0' ('x' | 'X' ) hexdigit ( ['_'] hexdigit )*
  DEC_LIT = unary_minus? digit ( ['_'] digit )*
  OCT_LIT = unary_minus? '0' 'o' octdigit ( ['_'] octdigit )*
  BIN_LIT = unary_minus? '0' ('b' | 'B' ) bindigit ( ['_'] bindigit )*
  INT_LIT = HEX_LIT
          | DEC_LIT
          | OCT_LIT
          | BIN_LIT
  INT8_LIT = INT_LIT ['\''] ('i' | 'I') '8'
  INT16_LIT = INT_LIT ['\''] ('i' | 'I') '16'
  INT32_LIT = INT_LIT ['\''] ('i' | 'I') '32'
  INT64_LIT = INT_LIT ['\''] ('i' | 'I') '64'
  UINT_LIT = INT_LIT ['\''] ('u' | 'U')
  UINT8_LIT = INT_LIT ['\''] ('u' | 'U') '8'
  UINT16_LIT = INT_LIT ['\''] ('u' | 'U') '16'
  UINT32_LIT = INT_LIT ['\''] ('u' | 'U') '32'
  UINT64_LIT = INT_LIT ['\''] ('u' | 'U') '64'
  exponent = ('e' | 'E' ) ['+' | '-'] digit ( ['_'] digit )*
  FLOAT_LIT = unary_minus? digit (['_'] digit)* (('.' digit (['_'] digit)* [exponent]) | exponent)
  FLOAT32_SUFFIX = ('f' | 'F') ['32']
  FLOAT32_LIT = HEX_LIT '\'' FLOAT32_SUFFIX
              | (FLOAT_LIT | DEC_LIT | OCT_LIT | BIN_LIT) ['\''] FLOAT32_SUFFIX
  FLOAT64_SUFFIX = ( ('f' | 'F') '64' ) | 'd' | 'D'
  FLOAT64_LIT = HEX_LIT '\'' FLOAT64_SUFFIX
              | (FLOAT_LIT | DEC_LIT | OCT_LIT | BIN_LIT) ['\''] FLOAT64_SUFFIX
  CUSTOM_NUMERIC_LIT = (FLOAT_LIT | INT_LIT) '\'' CUSTOM_NUMERIC_SUFFIX
  # CUSTOM_NUMERIC_SUFFIX is any Nim identifier that is not
  # a pre-defined type suffix.

As can be seen in the productions, numeric literals can contain underscores for readability. Integer and floating-point literals may be given in decimal (no prefix), binary (prefix 0b), octal (prefix 0o), and hexadecimal (prefix 0x) notation.
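As a small illustration of the notations described above, the following sketch writes one value in each base and uses underscores purely for grouping. The variable names are illustrative only.

```nim
# The same value, 255, written in each supported base.
let dec = 255
let hex = 0xFF
let oct = 0o377
let bin = 0b1111_1111
doAssert hex == dec and oct == dec and bin == dec

# Underscores only group digits; they do not change the value.
doAssert 1_000_000 == 1000000
```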

The fact that the unary minus - in a number literal like -1 is considered to be part of the literal is a late addition to the language. The rationale is that an expression -128'i8 should be valid. Without this special case it would be impossible, because 128 is not a valid int8 value; only -128 is.
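A minimal sketch of that rationale, assuming a compiler version that already treats the minus sign as part of the literal (the name lowest is illustrative):

```nim
# -128'i8 is lexed as one int8 literal, so the out-of-range value 128
# is never formed on its own.
let lowest = -128'i8
doAssert lowest == low(int8)
doAssert(lowest is int8)
```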

For the unary_minus rule there are further restrictions that are not covered in the formal grammar. For - to be part of the number literal the immediately preceding character has to be in the set {' ', '\t', '\n', '\r', ',', ';', '(', '[', '{'}. This set was designed to cover most cases in a natural manner.

In the following examples, -1 is a single token:

  echo -1
  echo(-1)
  echo [-1]
  echo 3,-1
  "abc";-1

In the following examples, -1 is parsed as two separate tokens (as - 1):

  echo x-1
  echo (int)-1
  echo [a]-1
  "abc"-1
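The two groups of examples can be combined into a runnable sketch. The names x and items are illustrative; the comments restate which rule applies according to the text above.

```nim
let x = 5
let items = [3,-1]          # '-' follows ',', so -1 is a single literal token
doAssert items[1] == -1
doAssert x-1 == 4           # '-' follows an identifier: binary minus, then 1
doAssert((-1) == items[1])  # '-' follows '(', so -1 is again a single token
```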

The suffix starting with an apostrophe (') is called a type suffix. Literals without a type suffix are of an integer type unless the literal contains a dot or e|E, in which case it is of type float. This integer type is int if the literal is in the range low(int32)..high(int32), otherwise it is int64. For notational convenience, the apostrophe of a type suffix is optional if it is not ambiguous (only hexadecimal floating-point literals with a type suffix can be ambiguous).
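The typing rules in this paragraph can be checked with the is operator. This is only a sketch of the rules as stated here, not an exhaustive test:

```nim
# Literals without a suffix.
doAssert(42 is int)                 # within the int32 range
doAssert(5_000_000_000 is int64)    # outside the int32 range
doAssert(4.2 is float)              # contains a dot
doAssert(1e3 is float)              # contains an exponent

# Literals with a type suffix; the apostrophe is optional here.
doAssert(42'i8 is int8)
doAssert(42i8 is int8)
doAssert(1'f32 is float32)
doAssert(1f32 is float32)
doAssert(1'd is float64)
```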

The pre-defined type suffixes are:

Type Suffix    Resulting type of literal
'i8            int8
'i16           int16
'i32           int32
'i64           int64
'u             uint
'u8            uint8
'u16           uint16
'u32           uint32
'u64           uint64
'f             float32
'd             float64
'f32           float32
'f64           float64

Floating-point literals may also be in binary, octal or hexadecimal notation: 0B0_10001110100_0000101001000111101011101111111011000101001101001001'f64 is approximately 1.72826e35 according to the IEEE floating-point standard.
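In such literals the digits give the bit pattern of the float rather than its numeric value. A minimal sketch using the well-known IEEE 754 encodings of 1.0 (the constant names are illustrative):

```nim
# 0x3FF0000000000000 is the binary64 bit pattern of 1.0;
# 0x3F800000 is the binary32 bit pattern of 1.0.
let one64 = 0x3FF0_0000_0000_0000'f64
let one32 = 0x3F80_0000'f32
doAssert one64 == 1.0
doAssert one32 == 1.0'f32
```

Note that a hexadecimal literal needs the apostrophe before a float suffix, as required by the FLOAT32_LIT and FLOAT64_LIT productions.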

Literals must match the datatype; for example, 333'i8 is an invalid literal. Non-base-10 literals are used mainly for flags and bit pattern representations, therefore the checking is done on bit width and not on value range. Hence 0b10000000'u8 == 0x80'u8 == 128, but 0b10000000'i8 == 0x80'i8 == -128 instead of causing an overflow error.
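The bit-width rule can be sketched as follows. The expected values follow from two's complement: the pattern 1000_0000 has only the sign bit set in an int8, and 1111_1111 has all bits set.

```nim
# Unsigned: the bit pattern is read as a plain magnitude.
doAssert 0b1000_0000'u8 == 128'u8
doAssert 0x80'u8 == 128'u8

# Signed: the same 8-bit pattern is reinterpreted, not range-checked.
doAssert 0x80'i8 == -128'i8
doAssert 0b1000_0000'i8 == low(int8)
doAssert 0xFF'i8 == -1'i8
```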

Custom numeric literals

If the suffix is not predefined, then the suffix is assumed to be a call to a proc, template, macro or other callable identifier that is passed the string containing the literal. The callable identifier needs to be declared with a special ' prefix:

  import strutils
  type u4 = distinct uint8 # a 4-bit unsigned integer aka "nibble"
  proc `'u4`(n: string): u4 =
    # The leading ' is required.
    result = (parseInt(n) and 0x0F).u4
  var x = 5'u4

More formally, a custom numeric literal 123'custom is transformed to r"123".`'custom` in the parsing step. There is no AST node kind that corresponds to this transformation. The transformation naturally handles the case that additional parameters are passed to the callee:

  import strutils
  type u4 = distinct uint8 # a 4-bit unsigned integer aka "nibble"
  proc `'u4`(n: string; moreData: int): u4 =
    result = (parseInt(n) and 0x0F).u4
  var x = 5'u4(123)

Custom numeric literals are covered by the grammar rule named CUSTOM_NUMERIC_LIT. A custom numeric literal is a single token.
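To make the string-passing behaviour visible, the following sketch defines a hypothetical suffix 'txt (not part of Nim) whose callee simply returns the text it receives. It assumes a compiler version that supports custom numeric literals.

```nim
# Hypothetical suffix for illustration: returns the literal's text unchanged.
proc `'txt`(s: string): string =
  s

doAssert 12'txt == "12"
doAssert 3.25'txt == "3.25"

# The literal form is equivalent to calling the proc with the digits as a string.
doAssert(`'txt`("12") == 12'txt)
```

Because the callee receives a string, it decides how to parse, validate, or reject the digits; the compiler does not interpret them beforehand.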