Lucene expressions language

Lucene expressions language

Lucene’s expressions compile a javascript expression to bytecode. They are designed for high-performance custom ranking and sorting functions and are enabled for inline and stored scripting by default.

Performance

Expressions were designed to have competitive performance with custom Lucene code. This performance is due to having low per-document overhead as opposed to other scripting engines: expressions do more “up-front”.

This allows for very fast execution, even faster than if you had written a native script.

Syntax

Expressions support a subset of javascript syntax: a single expression.

See the expressions module documentation for details on what operators and functions are available.

Variables in expression scripts are available to access:

  • document fields, e.g. doc['myfield'].value
  • variables and methods that the field supports, e.g. doc['myfield'].empty
  • Parameters passed into the script, e.g. mymodifier
  • The current document’s score, _score (only available when used in a script_score)

You can use Expressions scripts for script_score, script_fields, sort scripts, and numeric aggregation scripts, simply set the lang parameter to expression.

Numeric field API

ExpressionDescription

doc[‘field_name’].value

The value of the field, as a double

doc[‘field_name’].empty

A boolean indicating if the field has no values within the doc.

doc[‘field_name’].length

The number of values in this document.

doc[‘field_name’].min()

The minimum value of the field in this document.

doc[‘field_name’].max()

The maximum value of the field in this document.

doc[‘field_name’].median()

The median value of the field in this document.

doc[‘field_name’].avg()

The average of the values in this document.

doc[‘field_name’].sum()

The sum of the values in this document.

When a document is missing the field completely, by default the value will be treated as 0. You can treat it as another value instead, e.g. doc['myfield'].empty ? 100 : doc['myfield'].value

When a document has multiple values for the field, by default the minimum value is returned. You can choose a different value instead, e.g. doc['myfield'].sum().

When a document is missing the field completely, by default the value will be treated as 0.

Boolean fields are exposed as numerics, with true mapped to 1 and false mapped to 0. For example: doc['on_sale'].value ? doc['price'].value * 0.5 : doc['price'].value

Date field API

Date fields are treated as the number of milliseconds since January 1, 1970 and support the Numeric Fields API above, plus access to some date-specific fields:

ExpressionDescription

doc[‘field_name’].date.centuryOfEra

Century (1-2920000)

doc[‘field_name’].date.dayOfMonth

Day (1-31), e.g. 1 for the first of the month.

doc[‘field_name’].date.dayOfWeek

Day of the week (1-7), e.g. 1 for Monday.

doc[‘field_name’].date.dayOfYear

Day of the year, e.g. 1 for January 1.

doc[‘field_name’].date.era

Era: 0 for BC, 1 for AD.

doc[‘field_name’].date.hourOfDay

Hour (0-23).

doc[‘field_name’].date.millisOfDay

Milliseconds within the day (0-86399999).

doc[‘field_name’].date.millisOfSecond

Milliseconds within the second (0-999).

doc[‘field_name’].date.minuteOfDay

Minute within the day (0-1439).

doc[‘field_name’].date.minuteOfHour

Minute within the hour (0-59).

doc[‘field_name’].date.monthOfYear

Month within the year (1-12), e.g. 1 for January.

doc[‘field_name’].date.secondOfDay

Second within the day (0-86399).

doc[‘field_name’].date.secondOfMinute

Second within the minute (0-59).

doc[‘field_name’].date.year

Year (-292000000 - 292000000).

doc[‘field_name’].date.yearOfCentury

Year within the century (1-100).

doc[‘field_name’].date.yearOfEra

Year within the era (1-292000000).

The following example shows the difference in years between the date fields date0 and date1:

doc['date1'].date.year - doc['date0'].date.year

geo_point field API

ExpressionDescription

doc[‘field_name’].empty

A boolean indicating if the field has no values within the doc.

doc[‘field_name’].lat

The latitude of the geo point.

doc[‘field_name’].lon

The longitude of the geo point.

The following example computes distance in kilometers from Washington, DC:

haversin(38.9072, 77.0369, doc['field_name'].lat, doc['field_name'].lon)

In this example the coordinates could have been passed as parameters to the script, e.g. based on geolocation of the user.

Limitations

There are a few limitations relative to other script languages:

  • Only numeric, boolean, date, and geo_point fields may be accessed
  • Stored fields are not available