Assembly

For some use cases, it may be necessary to directly control the machine code generated by Zig programs, rather than relying on Zig’s code generation. For these cases, one can use inline assembly. Here is an example of implementing Hello, World on x86_64 Linux using inline assembly:

inline_assembly.zig

    pub fn main() noreturn {
        const msg = "hello world\n";
        _ = syscall3(SYS_write, STDOUT_FILENO, @intFromPtr(msg), msg.len);
        _ = syscall1(SYS_exit, 0);
        unreachable;
    }

    pub const SYS_write = 1;
    pub const SYS_exit = 60;
    pub const STDOUT_FILENO = 1;

    pub fn syscall1(number: usize, arg1: usize) usize {
        return asm volatile ("syscall"
            : [ret] "={rax}" (-> usize),
            : [number] "{rax}" (number),
              [arg1] "{rdi}" (arg1),
            : "rcx", "r11"
        );
    }

    pub fn syscall3(number: usize, arg1: usize, arg2: usize, arg3: usize) usize {
        return asm volatile ("syscall"
            : [ret] "={rax}" (-> usize),
            : [number] "{rax}" (number),
              [arg1] "{rdi}" (arg1),
              [arg2] "{rsi}" (arg2),
              [arg3] "{rdx}" (arg3),
            : "rcx", "r11"
        );
    }

Shell

    $ zig build-exe inline_assembly.zig -target x86_64-linux
    $ ./inline_assembly
    hello world

Dissecting the syntax:

Assembly Syntax Explained.zig

    pub fn syscall1(number: usize, arg1: usize) usize {
        // Inline assembly is an expression which returns a value.
        // the `asm` keyword begins the expression.
        return asm
        // `volatile` is an optional modifier that tells Zig this
        // inline assembly expression has side-effects. Without
        // `volatile`, Zig is allowed to delete the inline assembly
        // code if the result is unused.
        volatile (
        // Next is a comptime string which is the assembly code.
        // Inside this string one may use `%[ret]`, `%[number]`,
        // or `%[arg1]` where a register is expected, to specify
        // the register that Zig uses for the argument or return value,
        // if the register constraint strings are used. However in
        // the below code, this is not used. A literal `%` can be
        // obtained by escaping it with a double percent: `%%`.
        // Often multiline string syntax comes in handy here.
            \\syscall
            // Next is the output. It is possible in the future Zig will
            // support multiple outputs, depending on how
            // https://github.com/ziglang/zig/issues/215 is resolved.
            // It is allowed for there to be no outputs, in which case
            // this colon would be directly followed by the colon for the inputs.
            :
            // This specifies the name to be used in `%[ret]` syntax in
            // the above assembly string. This example does not use it,
            // but the syntax is mandatory.
              [ret]
              // Next is the output constraint string. This feature is still
              // considered unstable in Zig, and so LLVM/GCC documentation
              // must be used to understand the semantics.
              // http://releases.llvm.org/10.0.0/docs/LangRef.html#inline-asm-constraint-string
              // https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html
              // In this example, the constraint string means "the result value of
              // this inline assembly instruction is whatever is in $rax".
              "={rax}"
              // Next is either a value binding, or `->` and then a type. The
              // type is the result type of the inline assembly expression.
              // If it is a value binding, then `%[ret]` syntax would be used
              // to refer to the register bound to the value.
              (-> usize),
            // Next is the list of inputs.
            // The constraint for these inputs means, "when the assembly code is
            // executed, $rax shall have the value of `number` and $rdi shall have
            // the value of `arg1`". Any number of input parameters is allowed,
            // including none.
            : [number] "{rax}" (number),
              [arg1] "{rdi}" (arg1),
            // Next is the list of clobbers. These declare a set of registers whose
            // values will not be preserved by the execution of this assembly code.
            // These do not include output or input registers. The special clobber
            // value of "memory" means that the assembly writes to arbitrary undeclared
            // memory locations - not only the memory pointed to by a declared indirect
            // output. In this example we list $rcx and $r11 because it is known the
            // kernel syscall does not preserve these registers.
            : "rcx", "r11"
        );
    }

For x86 and x86_64 targets, the syntax is AT&T syntax, rather than the more popular Intel syntax. This is due to technical constraints; assembly parsing is provided by LLVM and its support for Intel syntax is buggy and not well tested.
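
The `%[name]` template substitution described in the annotated example can be sketched as follows. This is a minimal illustration, not taken from the original examples: the function name `addViaLea` is hypothetical, and it assumes an x86_64 target and the same inline assembly syntax used above. It uses the generic `r` constraint so that the compiler picks the registers, and it shows the AT&T operand order, in which the destination comes last.

```zig
const std = @import("std");

// Hypothetical helper (x86_64 only): adds two integers with a single `lea`.
fn addViaLea(a: u64, b: u64) u64 {
    // No `volatile` here: the assembly has no side effects beyond its
    // output, so Zig is allowed to delete it if the result is unused.
    return asm (
        // AT&T syntax: source operands first, destination last.
        // `%[a]`, `%[b]`, and `%[ret]` are replaced by the registers
        // chosen for the operands with those names.
        \\lea (%[a],%[b],1), %[ret]
        : [ret] "=r" (-> u64),
        : [a] "r" (a),
          [b] "r" (b),
    );
}

test "addViaLea" {
    try std.testing.expect(addViaLea(12, 34) == 46);
}
```

Because the operands are referenced by name, the assembly string does not need to mention any specific register, and no clobbers are required.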

Some day Zig may have its own assembler. This would allow it to integrate more seamlessly into the language, as well as be compatible with the popular NASM syntax. This documentation section will be updated before 1.0.0 is released, with a conclusive statement about the status of AT&T vs Intel/NASM syntax.

Output Constraints

Output constraints are still considered to be unstable in Zig, and so LLVM documentation and GCC documentation must be used to understand the semantics.

Note that some breaking changes to output constraints are planned with issue #215.
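
As a hedged sketch of the value-binding form of an output (as opposed to `->` followed by a type), the following writes the result into a local variable. The function name `triple` is hypothetical, the code assumes an x86_64 target, and, since output constraints are unstable, the exact behavior may differ between Zig versions.

```zig
const std = @import("std");

// Hypothetical helper (x86_64 only): computes x * 3 with a single `lea`.
fn triple(x: u64) u64 {
    var result: u64 = undefined;
    // `volatile` is not strictly required for a pure computation like
    // this one; it is used only to keep the sketch conservative.
    asm volatile (
        // x + x * 2, written to the register bound to `result`.
        \\lea (%[x],%[x],2), %[out]
        : [out] "=r" (result),
        : [x] "r" (x),
    );
    return result;
}

test "triple" {
    try std.testing.expect(triple(14) == 42);
}
```

With a value binding, the assembly expression itself has no result; the value is delivered through the bound variable instead.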

Input Constraints

Input constraints are still considered to be unstable in Zig, and so LLVM documentation and GCC documentation must be used to understand the semantics.

Note that some breaking changes to input constraints are planned with issue #215.

Clobbers

Clobbers are the set of registers whose values will not be preserved by the execution of the assembly code. These do not include output or input registers. The special clobber value of "memory" means that the assembly causes writes to arbitrary undeclared memory locations - not only the memory pointed to by a declared indirect output.

Failure to declare the full set of clobbers for a given inline assembly expression is unchecked Undefined Behavior.
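
The following is a minimal sketch of the "memory" clobber, assuming an x86_64 target. The function name `storeViaAsm` is hypothetical. The assembly writes through a pointer that is passed only as a register input, so the write is not visible to the compiler as an output; the "memory" clobber declares it.

```zig
const std = @import("std");

// Hypothetical helper (x86_64 only): stores `val` at the address `ptr`.
fn storeViaAsm(ptr: *u64, val: u64) void {
    // `volatile` is required: there are no outputs, and the store is a
    // side effect that must not be deleted.
    asm volatile (
        \\movq %[val], (%[ptr])
        :
        : [val] "r" (val),
          [ptr] "r" (ptr),
        // The store modifies memory that is not a declared output.
        : "memory"
    );
}

test "storeViaAsm" {
    var x: u64 = 0;
    storeViaAsm(&x, 42);
    try std.testing.expect(x == 42);
}
```

No register clobbers are listed because the instruction modifies no registers; the two registers it reads are inputs, which are not part of the clobber list.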

Global Assembly

When an assembly expression occurs in a container-level comptime block, it is global assembly.

This kind of assembly has different rules than inline assembly. First, `volatile` is not valid because all global assembly is unconditionally included. Second, there are no inputs, outputs, or clobbers. All global assembly is concatenated verbatim into one long string and assembled together. There are no template substitution rules regarding `%` as there are in inline assembly expressions.

test_global_assembly.zig

    const std = @import("std");
    const expect = std.testing.expect;

    comptime {
        asm (
            \\.global my_func;
            \\.type my_func, @function;
            \\my_func:
            \\ lea (%rdi,%rsi,1),%eax
            \\ retq
        );
    }

    extern fn my_func(a: i32, b: i32) i32;

    test "global assembly" {
        try expect(my_func(12, 34) == 46);
    }

Shell

    $ zig test test_global_assembly.zig -target x86_64-linux
    1/1 test.global assembly... OK
    All 1 tests passed.