Lexical Structure

Agda code is written in UTF-8 encoded plain text files with the extension .agda. Most unicode characters can be used in identifiers and whitespace is important, see Names and Layout below.

Tokens

Keywords and special symbols

Most non-whitespace unicode can be used as part of an Agda name, but there are two kinds of exceptions:

special symbols

Characters with special meaning that cannot appear at all in a name. These are .;{}()@".

keywords

Reserved words that cannot appear as a name part, but can appear in a name together with other characters.

= | -> : ? \ λ .. ... abstract coinductive constructor data do eta-equality field forall import in inductive infix infixl infixr instance interleaved let macro module mutual no-eta-equality open overlap pattern postulate primitive private quote quoteTerm record rewrite Set syntax tactic unquote unquoteDecl unquoteDef variable where with

The Set keyword can appear with a natural number suffix, optionally subscripted (see Sort System). For instance Set42 and Set₄₂ are both keywords.

keywords in import directives

The following words are only reserved in import directives (in connection with import or open):

public using hiding renaming to

Names

A qualified name is a non-empty sequence of names separated by dots (.). A name is an alternating sequence of name parts and underscores (_), containing at least one name part. A name part is a non-empty sequence of unicode characters, excluding whitespace, _, and special symbols. A name part cannot be one of the keywords above, and cannot start with a single quote, ' (which are used for character literals, see Literals below).

Examples

  • Valid: data?, ::, if_then_else_, 0b, _⊢_∈_, x=y

  • Invalid: data_?, foo__bar, _, a;b, [_.._]

The underscores in a name indicate where the arguments go when the name is used as an operator. For instance, the application _+_ 1 2 can be written as 1 + 2. See Mixfix Operators for more information. Since most sequences of characters are valid names, whitespace is more important than in other languages. In the example above the whitespace around + is required, since 1+2 is a valid name.

Qualified names are used to refer to entities defined in other modules. For instance Prelude.Bool.true refers to the name true defined in the module Prelude.Bool. See Module System for more information.

Literals

There are four types of literal values: integers, floats, characters, and strings. See Built-ins for the corresponding types, and Literal Overloading for how to support literals for user-defined types.

Integers

Integer values in decimal, hexadecimal (prefixed by 0x), or binary (prefixed by 0b) notation. The character _ can be used to separate groups of digits. Non-negative numbers map by default to built-in natural numbers, but can be overloaded. Negative numbers have no default interpretation and can only be used through overloading.

Examples: 123, 0xF0F080, -42, -0xF, 0b11001001, 1_000_000_000, 0b01001000_01001001.

Floats

Floating point numbers in the standard notation (with square brackets denoting optional parts):

  1. float ::= [-] decimal . decimal [exponent]
  2. | [-] decimal exponent
  3. exponent ::= (e | E) [+ | -] decimal

These map to built-in floats and cannot be overloaded.

Examples: 1.0, -5.0e+12, 1.01e-16, 4.2E9, 50e3.

Characters

Character literals are enclosed in single quotes ('). They can be a single (unicode) character, other than ' or \, or an escaped character. Escaped characters start with a backslash \ followed by an escape code. Escape codes are natural numbers in decimal or hexadecimal (prefixed by x) between 0 and 0x10ffff (1114111), or one of the following special escape codes:

Code

ASCII

Code

ASCII

Code

ASCII

Code

ASCII

a

7

b

8

t

9

n

10

v

11

f

12

\

\

NUL

0

SOH

1

STX

2

ETX

3

EOT

4

ENQ

5

ACK

6

BEL

7

BS

8

HT

9

LF

10

VT

11

FF

12

CR

13

SO

14

SI

15

DLE

16

DC1

17

DC2

18

DC3

19

DC4

20

NAK

21

SYN

22

ETB

23

CAN

24

EM

25

SUB

26

ESC

27

FS

28

GS

29

RS

30

US

31

SP

32

DEL

127

Character literals map to the built-in character type and cannot be overloaded.

Examples: 'A', '∀', '\x2200', '\ESC', '\32', '\n', '\'', '"'.

Strings

String literals are sequences of, possibly escaped, characters enclosed in double quotes ". They follow the same rules as character literals except that double quotes " need to be escaped rather than single quotes '. String literals map to the built-in string type by default, but can be overloaded.

Example: "Привет \"мир\"\n".

Holes

Holes are an integral part of the interactive development supported by the Emacs mode. Any text enclosed in {! and !} is a hole and may contain nested holes. A hole with no contents can be written ?. There are a number of Emacs commands that operate on the contents of a hole. The type checker ignores the contents of a hole and treats it as an unknown (see Implicit Arguments).

Example: {! f {!x!} 5 !}

Comments

Single-line comments are written with a double dash -- followed by arbitrary text. Multi-line comments are enclosed in {- and -} and can be nested. Comments cannot appear in string literals.

Example:

  1. {- Here is a {- nested -}
  2. comment -}
  3. s : String --line comment {-
  4. s = "{- not a comment -}"

Pragmas

Pragmas are special comments enclosed in {-# and #-} that have special meaning to the system. See Pragmas for a full list of pragmas.

Layout

Agda is layout sensitive using similar rules as Haskell, with the exception that layout is mandatory: you cannot use explicit {, } and ; to avoid it.

A layout block contains a sequence of statements and is started by one of the layout keywords:

  1. abstract
  2. constructor
  3. do
  4. field
  5. instance
  6. let
  7. macro
  8. mutual
  9. postulate
  10. primitive
  11. private
  12. variable
  13. where

The first token after the layout keyword decides the indentation of the block. Any token indented more than this is part of the previous statement, a token at the same level starts a new statement, and a token indented less lies outside the block.

  1. data Nat : Set where -- starts a layout block
  2. -- comments are not tokens
  3. zero : Nat -- statement 1
  4. suc : Nat -- statement 2
  5. Nat -- also statement 2
  6. one : Nat -- outside the layout block
  7. one = suc zero

Note that the indentation of the layout keyword does not matter.

If several layout blocks are started by layout keywords without line break in between (where line breaks inside block comments do not count), then those blocks indented more than the last block go passive, meaning they cannot be further extended by new statements:

  1. private module M where postulate
  2. A : Set -- module-block goes passive
  3. B : Set -- postulate-block can still be extended
  4. module N where -- private-block can still be extended

An Agda file contains one top-level layout block, with the special rule that the contents of the top-level module need not be indented.

  1. module Example where
  2. NotIndented : Set
  3. NotIndented = Set

Literate Agda

Agda supports literate programming with multiple typesetting tools like LaTeX, Markdown and reStructuredText. For instance, with LaTeX, everything in a file is a comment unless enclosed in \begin{code}, \end{code}. Literate Agda files have special file extensions, like .lagda and .lagda.tex for LaTeX, .lagda.md for Markdown, .lagda.rst for reStructuredText instead of .agda. The main use case for literate Agda is to generate LaTeX documents from Agda code. See Generating HTML and Generating LaTeX for more information.

  1. \documentclass{article}
  2. % some preamble stuff
  3. \begin{document}
  4. Introduction usually goes here
  5. \begin{code}
  6. module MyPaper where
  7. open import Prelude
  8. five : Nat
  9. five = 2 + 3
  10. \end{code}
  11. Now, conclusions!
  12. \end{document}