Lucene expressions language
Lucene expressions language
Lucene’s expressions compile a javascript
expression to bytecode. They are designed for high-performance custom ranking and sorting functions and are enabled for inline
and stored
scripting by default.
Performance
Expressions were designed to have competitive performance with custom Lucene code. This performance is due to having low per-document overhead as opposed to other scripting engines: expressions do more “up-front”.
This allows for very fast execution, even faster than if you had written a native
script.
Syntax
Expressions support a subset of javascript syntax: a single expression.
See the expressions module documentation for details on what operators and functions are available.
Variables in expression
scripts are available to access:
- document fields, e.g.
doc['myfield'].value
- variables and methods that the field supports, e.g.
doc['myfield'].empty
- Parameters passed into the script, e.g.
mymodifier
- The current document’s score,
_score
(only available when used in ascript_score
)
You can use Expressions scripts for script_score
, script_fields
, sort scripts, and numeric aggregation scripts, simply set the lang
parameter to expression
.
Numeric field API
Expression | Description |
---|---|
| The value of the field, as a |
| A boolean indicating if the field has no values within the doc. |
| The number of values in this document. |
| The minimum value of the field in this document. |
| The maximum value of the field in this document. |
| The median value of the field in this document. |
| The average of the values in this document. |
| The sum of the values in this document. |
When a document is missing the field completely, by default the value will be treated as 0
. You can treat it as another value instead, e.g. doc['myfield'].empty ? 100 : doc['myfield'].value
When a document has multiple values for the field, by default the minimum value is returned. You can choose a different value instead, e.g. doc['myfield'].sum()
.
When a document is missing the field completely, by default the value will be treated as 0
.
Boolean fields are exposed as numerics, with true
mapped to 1
and false
mapped to 0
. For example: doc['on_sale'].value ? doc['price'].value * 0.5 : doc['price'].value
Date field API
Date fields are treated as the number of milliseconds since January 1, 1970 and support the Numeric Fields API above, plus access to some date-specific fields:
Expression | Description |
---|---|
| Century (1-2920000) |
| Day (1-31), e.g. |
| Day of the week (1-7), e.g. |
| Day of the year, e.g. |
| Era: |
| Hour (0-23). |
| Milliseconds within the day (0-86399999). |
| Milliseconds within the second (0-999). |
| Minute within the day (0-1439). |
| Minute within the hour (0-59). |
| Month within the year (1-12), e.g. |
| Second within the day (0-86399). |
| Second within the minute (0-59). |
| Year (-292000000 - 292000000). |
| Year within the century (1-100). |
| Year within the era (1-292000000). |
The following example shows the difference in years between the date
fields date0 and date1:
doc['date1'].date.year - doc['date0'].date.year
geo_point
field API
Expression | Description |
---|---|
| A boolean indicating if the field has no values within the doc. |
| The latitude of the geo point. |
| The longitude of the geo point. |
The following example computes distance in kilometers from Washington, DC:
haversin(38.9072, 77.0369, doc['field_name'].lat, doc['field_name'].lon)
In this example the coordinates could have been passed as parameters to the script, e.g. based on geolocation of the user.
Limitations
There are a few limitations relative to other script languages:
- Only numeric,
boolean
,date
, andgeo_point
fields may be accessed - Stored fields are not available