Floats

Zig has the following floating point types:

f16 - IEEE-754-2008 binary16
f32 - IEEE-754-2008 binary32
f64 - IEEE-754-2008 binary64
f128 - IEEE-754-2008 binary128
c_longdouble - matches long double for the target C ABI
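
As a quick check of the widths implied by the type names above, the following sketch asserts the storage size of each fixed-width type. It is an illustration rather than part of the original text, and it assumes `std.debug.assert` and `@sizeOf` behave as in the Zig version you are using. `c_longdouble` is left out because its size depends on the target C ABI.

```zig
const std = @import("std");
const assert = std.debug.assert;

test "float type sizes" {
    // binary16, binary32, binary64 and binary128 occupy 2, 4, 8 and 16 bytes.
    assert(@sizeOf(f16) == 2);
    assert(@sizeOf(f32) == 4);
    assert(@sizeOf(f64) == 8);
    assert(@sizeOf(f128) == 16);
}
```

Running it with `zig test` should report the test as passing.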

Float Literals

Float literals have type comptime_float, which is guaranteed to have the same precision and operations as the largest other floating point type, f128.

Float literals implicitly cast to any floating point type, and to any integer type when there is no fractional component.

const floating_point = 123.0E+77;
const another_float = 123.0;
const yet_another = 123.0e+77;
const hex_floating_point = 0x103.70p-5;
const another_hex_float = 0x103.70;
const yet_another_hex_float = 0x103.70P-5;
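
The coercion rule for float literals can be sketched as a small test. This is a minimal illustration, not taken from the original text: the helper names `literalAsF32` and `literalAsI32` are hypothetical, and the sketch assumes `std.debug.assert` is available in your Zig version.

```zig
const std = @import("std");
const assert = std.debug.assert;

// Hypothetical helper: a float literal coerces to a concrete float type.
fn literalAsF32() f32 {
    return 123.0;
}

// Hypothetical helper: a float literal with no fractional component
// also coerces to an integer type.
fn literalAsI32() i32 {
    return 123.0;
}

test "float literal coercion" {
    assert(literalAsF32() == 123.0);
    assert(literalAsI32() == 123);
    // Returning a literal such as 123.5 from literalAsI32 would be a
    // compile error, because it has a fractional component.
}
```

Running it with `zig test` should report the test as passing.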

There is no syntax for NaN, infinity, or negative infinity. For these special values, one must use the standard library:

const std = @import("std");
const inf = std.math.inf(f32);
const negative_inf = -std.math.inf(f64);
const nan = std.math.nan(f128);
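
To show how these values are typically checked, here is a short sketch using `std.math.isInf` and `std.math.isNan`. It is an illustration under the assumption that these helpers exist in your version of the standard library; the local names are made up for the example.

```zig
const std = @import("std");
const assert = std.debug.assert;

test "special float values" {
    const pos_inf = std.math.inf(f32);
    const neg_inf = -std.math.inf(f32);
    const not_a_number = std.math.nan(f64);

    // isInf reports true for both positive and negative infinity.
    assert(std.math.isInf(pos_inf));
    assert(std.math.isInf(neg_inf));
    assert(neg_inf < pos_inf);

    // NaN has to be detected with isNan rather than an equality comparison.
    assert(std.math.isNan(not_a_number));
}
```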

Floating Point Operations

By default, floating point operations use Strict mode, but you can switch to Optimized mode on a per-block basis:

foo.zig

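const builtin = @import("builtin");
const big = f64(1 << 40);
// Strict mode (the default): the arithmetic is evaluated exactly as written.
export fn foo_strict(x: f64) f64 {
    return x + big - big;
}
// Optimized mode: the optimizer may simplify `x + big - big` to `x`.
export fn foo_optimized(x: f64) f64 {
    @setFloatMode(builtin.FloatMode.Optimized);
    return x + big - big;
}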

$ zig build-obj foo.zig --release-fast

For this test we have to separate the code into two object files; otherwise the optimizer would figure out all the values at compile time, and compile-time evaluation operates in strict mode.

float_mode.zig

const warn = @import("std").debug.warn;
extern fn foo_strict(x: f64) f64;
extern fn foo_optimized(x: f64) f64;
pub fn main() void {
    const x = 0.001;
    warn("optimized = {}\n", foo_optimized(x));
    warn("strict = {}\n", foo_strict(x));
}

$ zig build-exe float_mode.zig --object foo.o
$ ./float_mode
optimized = 1.0e-03
strict = 9.765625e-04

In strict mode, x + big is rounded to the nearest multiple of 2^-12 (the spacing of f64 values near 2^40), so subtracting big again yields 9.765625e-04 rather than 0.001. In optimized mode the compiler is allowed to cancel big algebraically and return x unchanged.