Type inference

Crystal’s philosophy is to require as few type restrictions as possible. However, some restrictions are required.

Consider a class definition like this:

    class Person
      def initialize(@name)
        @age = 0
      end
    end

We can quickly see that @age is an integer, but we don’t know the type of @name. The compiler could infer its type from all uses of the Person class. However, doing so has a few issues:

  • The type is not obvious for a human reading the code: they would also have to check all uses of Person to find this out.
  • Some compiler optimizations, such as analyzing each method only once and compiling incrementally, become nearly impossible.

As a code base grows, these issues gain more relevance: understanding a project becomes harder, and compile times become unbearable.

For this reason, Crystal needs to know the types of instance and class variables in an obvious way (as obvious as it would be to a human reader).

There are several ways to let Crystal know this.

With type restrictions

The easiest, but probably most tedious, way is to use explicit type restrictions.

    class Person
      @name : String
      @age : Int32

      def initialize(@name)
        @age = 0
      end
    end

Without type restrictions

If you omit an explicit type restriction, the compiler will try to infer the type of instance and class variables using a set of syntactic rules.

For a given instance/class variable, when a rule can be applied and a type can be guessed, the type is added to a set. When no more rules can be applied, the inferred type will be the union of those types. Additionally, if the compiler infers that an instance variable isn’t always initialized, it will also include the Nil type.
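As a minimal sketch of how the set grows, the following hypothetical Setting class assigns two different literals to the same instance variable (the literal rule is rule 1 below). Under the rules described here, the inferred type of @value should be the union of Int32 and String. The class and method names are illustrative only.

```crystal
# Hypothetical class, used only to illustrate the union of guessed types.
class Setting
  def initialize
    @value = 0 # integer literal: Int32 is added to the set
  end

  def reset_to_text
    @value = "default" # string literal: String is added to the set
  end

  def value
    @value
  end
end

setting = Setting.new
setting.value # holds the Int32 assigned in initialize
setting.reset_to_text
setting.value # now holds a String
```

Because @value is assigned in initialize, Nil is not part of the union here.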

There are many rules, but the first three are the ones most commonly used. There’s no need to remember them all: if the compiler reports that the type of an instance variable can’t be inferred, you can always add an explicit type restriction.

The following rules only mention instance variables, but they apply to class variables as well. They are:

1. Assigning a literal value

When a literal is assigned to an instance variable, the literal’s type is added to the set. All literals have an associated type.

In the following example, @name is inferred to be String and @age to be Int32.

    class Person
      def initialize
        @name = "John Doe"
        @age = 0
      end
    end

This rule, and every following rule, will also be applied in methods other than initialize. For example:

    class SomeObject
      def lucky_number
        @lucky_number = 42
      end
    end

In the above case, @lucky_number will be inferred to be Int32 | Nil: Int32 because 42 was assigned to it, and Nil because it wasn’t assigned in all of the class’ initialize methods.
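As noted before the list of rules, the same inference applies to class variables. The sketch below uses a hypothetical Registry class whose class variables are initialized with a literal and with a Type.new expression (the rule described next), so neither needs an explicit type restriction.

```crystal
# Hypothetical class, used only to illustrate inference for class variables.
class Registry
  @@count = 0                 # literal: inferred as Int32
  @@names = Array(String).new # Type.new: inferred as Array(String)

  def self.register(name : String)
    @@names << name
    @@count += 1
  end

  def self.count
    @@count
  end

  def self.names
    @@names
  end
end

Registry.register("alice")
```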

2. Assigning the result of invoking the class method new

When an expression like Type.new(...) is assigned to an instance variable, the type Type is added to the set.

In the following example, @address is inferred to be Address.

    class Person
      def initialize
        @address = Address.new("somewhere")
      end
    end

This also applies to generic types. Here @values is inferred to be Array(Int32).

    class Something
      def initialize
        @values = Array(Int32).new
      end
    end

Note: a new method might be redefined by a type. In that case the inferred type will be the one returned by new, if it can be inferred using some of the next rules.
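A minimal sketch of that note, assuming a hypothetical AddressParser type that redefines new to return an Address. The return type restriction on new is what is expected to let the compiler infer @address as Address rather than AddressParser. All names here are illustrative.

```crystal
class Address
  def initialize(@street : String)
  end

  def street
    @street
  end
end

# Hypothetical type whose `new` returns a different type than itself.
class AddressParser
  def self.new(raw : String) : Address
    Address.new(raw.strip)
  end
end

class Person
  def initialize
    # Guessed from the return type restriction on AddressParser.new.
    @address = AddressParser.new("  somewhere  ")
  end

  def address
    @address
  end
end
```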

3. Assigning a variable that is a method argument with a type restriction

In the following example @name is inferred to be String because the method argument name has a type restriction of type String, and that argument is assigned to @name.

    class Person
      def initialize(name : String)
        @name = name
      end
    end

Note that the name of the method argument is not important; this works as well:

    class Person
      def initialize(obj : String)
        @name = obj
      end
    end

Using the shorter syntax to assign an instance variable from a method argument has the same effect:

    class Person
      def initialize(@name : String)
      end
    end

Also note that the compiler doesn’t check whether a method argument is reassigned a different value:

    class Person
      def initialize(name : String)
        name = 1
        @name = name
      end
    end

In the above case, the compiler will still infer @name to be String, and later will give a compile time error, when fully typing that method, saying that Int32 can’t be assigned to a variable of type String. Use an explicit type restriction if @name isn’t supposed to be a String.
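One way to apply that advice is sketched below, assuming @name is meant to hold an integer (here, the length of the given string, which is an illustrative choice and not part of the original example). The explicit restriction takes the place of the guess made from the argument's restriction.

```crystal
class Person
  # Explicit restriction: the compiler uses this instead of guessing
  # String from the `name : String` argument.
  @name : Int32

  def initialize(name : String)
    name = name.size # the local variable now holds an Int32
    @name = name
  end

  def name
    @name
  end
end

Person.new("John").name
```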

4. Assigning the result of a class method that has a return type restriction

In the following example, @address is inferred to be Address, because the class method Address.unknown has a return type restriction of Address.

    class Person
      def initialize
        @address = Address.unknown
      end
    end

    class Address
      def self.unknown : Address
        new("unknown")
      end

      def initialize(@name : String)
      end
    end

In fact, the above code doesn’t need the return type restriction in self.unknown. The reason is that the compiler will also look at a class method’s body and if it can apply one of the previous rules (it’s a new method, or it’s a literal, etc.) it will infer the type from that expression. So, the above can be simply written like this:

    class Person
      def initialize
        @address = Address.unknown
      end
    end

    class Address
      # No need for a return type restriction here
      def self.unknown
        new("unknown")
      end

      def initialize(@name : String)
      end
    end

This extra rule is very convenient because it’s very common to have “constructor-like” class methods in addition to new.

5. Assigning a variable that is a method argument with a default value

In the following example, because the default value of name is a string literal, and it’s later assigned to @name, String will be added to the set of inferred types.

    class Person
      def initialize(name = "John Doe")
        @name = name
      end
    end

This of course also works with the short syntax:

    class Person
      def initialize(@name = "John Doe")
      end
    end

The default value can also be a Type.new(...) method or a class method with a return type restriction.
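Both of those forms are sketched below. The Address class mirrors the one used for rule 4; the argument names home and work are hypothetical and chosen only for illustration.

```crystal
class Address
  def self.unknown : Address
    new("unknown")
  end

  def initialize(@street : String)
  end

  def street
    @street
  end
end

class Person
  # `home` defaults to a Type.new(...) expression, `work` to a class method
  # with a return type restriction. Both are guessed as Address.
  def initialize(home = Address.new("somewhere"), work = Address.unknown)
    @home = home
    @work = work
  end

  def home
    @home
  end

  def work
    @work
  end
end
```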

6. Assigning the result of invoking a lib function

Because a lib function must have explicit types, the compiler can use the return type when assigning it to an instance variable.

In the following example @age is inferred to be Int32.

    class Person
      def initialize
        @age = LibPerson.compute_default_age
      end
    end

    lib LibPerson
      fun compute_default_age : Int32
    end

7. Using an out lib expression

Because a lib function must have explicit types, the compiler can use the out argument’s type, which should be a pointer type, and use the dereferenced type as a guess.

In the following example @age is inferred to be Int32.

    class Person
      def initialize
        LibPerson.compute_default_age(out @age)
      end
    end

    lib LibPerson
      fun compute_default_age(age_ptr : Int32*)
    end

Other rules

The compiler will try to be as smart as possible so that fewer explicit type restrictions are needed. For example, when an if expression is assigned, the type will be inferred from the then and else branches:

    class Person
      def initialize
        @age = some_condition ? 1 : 2
      end
    end

Because the if above (well, technically a ternary operator, but it’s similar to an if) has integer literals, @age is successfully inferred to be Int32 without requiring a redundant type restriction.
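The same idea with a full if expression whose branches have different types is sketched below. The anonymous argument is hypothetical. Since one branch is nil and the other a string literal, @name is expected to be inferred as String | Nil.

```crystal
class Person
  def initialize(anonymous : Bool)
    # Both branches are literals, so each contributes a type to the guess.
    @name = if anonymous
              nil
            else
              "John Doe"
            end
  end

  def name
    @name
  end
end
```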

Another case is || and ||=:

    class SomeObject
      def lucky_number
        @lucky_number ||= 42
      end
    end

In the above example @lucky_number will be inferred to be Int32 | Nil. This is very useful for lazily initialized variables.

Constants will also be followed, as it’s pretty simple for the compiler (and a human) to do so.

    class SomeObject
      DEFAULT_LUCKY_NUMBER = 42

      def initialize(@lucky_number = DEFAULT_LUCKY_NUMBER)
      end
    end

Here rule 5 (argument’s default value) is used, and because the constant resolves to an integer literal, @lucky_number is inferred to be Int32.