Range Queries with ArangoSearch
Match values that are above, below or between a minimum and a maximum value
The primary use case for range queries is to search numeric values in documents that are
- greater than (exclusive),
- greater than or equal (inclusive),
- less than (exclusive),
- less than or equal (inclusive) to a reference number, or
- between two numbers (inclusive or exclusive)
Range queries are also possible for string values.
Comparing to a Number
Numbers are not processed by Analyzers. They even bypass the identity
Analyzer. Therefore, you do not need to specify Analyzers in View definitions and no Analyzer context in queries with the ANALYZER()
function for numeric fields.
Dataset: IMDB movie dataset
View definition:
{
"links": {
"imdb_vertices": {
"fields": {
"runtime": { }
}
}
}
}
You can add a field with an empty object {}
as shown above, or explicitly set no Analyzer using an empty array []
as shown below.
{
"links": {
"imdb_vertices": {
"fields": {
"runtime": {
"analyzers": [ ]
}
}
}
}
}
AQL queries
Match movies with a runtime of exactly 5
minutes:
FOR doc IN imdb
SEARCH doc.runtime == 5
RETURN {
title: doc.title,
runtime: doc.runtime
}
title | runtime |
---|---|
The Spirit of Christmas | 5 |
Super Rhino | 5 |
Gone Nutty | 5 |
She and Her Cat | 5 |
… | … |
Note that no Analyzer context is set as numeric values are not processed by Analyzers at all.
Match movies with a runtime of 12
, 24
or 77
minutes:
FOR doc IN imdb
SEARCH doc.runtime IN [12, 24, 77]
RETURN {
title: doc.title,
runtime: doc.runtime
}
title | runtime |
---|---|
Frankenstein Punk | 12 |
The Vagabond | 24 |
The Rescuers Down Under | 77 |
The Olsen Gang | 77 |
Match movies with a runtime over 300
minutes and sort them from longest to shortest runtime:
FOR doc IN imdb
SEARCH doc.runtime > 300
SORT doc.runtime DESC
RETURN {
title: doc.title,
runtime: doc.runtime
}
title | runtime |
---|---|
Planet of the Apes Series | 605 |
North and South | 561 |
Into the West | 552 |
… | … |
The Manns - Novel of a Century | 312 |
Legenda o Tile | 311 |
Nárcisz és Psyché | 302 |
Any of the following comparison operators can be used: >
, >=
, <
, <=
, ==
, !=
, IN
, NOT IN
.
Comparing to a Numeric Range
While you can combine a greater than (or equal) and a less than (or equal) condition with a logical AND
, it is more efficient to perform such range queries using a single condition. You can use the range operator ..
to query for an inclusive range of values. You can also use the IN_RANGE()
function that allows you to specify individually whether the minimum and maximum value shall be included or excluded in the range.
AQL queries
Match movies with a runtime of 4
to 6
minutes with the range operator:
FOR doc IN imdb
SEARCH doc.runtime IN 4..6
RETURN {
title: doc.title,
runtime: doc.runtime
}
title | runtime |
---|---|
Katedra | 6 |
The Spirit of Christmas | 4 |
The Spirit of Christmas | 5 |
Hell’s Bells | 6 |
… | … |
While the dataset only has whole numbers, this query would also find decimal fractions within the defined range, e.g. 5.5
. This is different to how FILTER doc.runtime IN 4..6
works, which defines an integer range and only matches 4
, 5
and 6
.
Match movies with a runtime of 4
to 6
minutes with the IN_RANGE()
function (inclusive on both ends):
FOR doc IN imdb
SEARCH IN_RANGE(doc.runtime, 4, 6, true, true)
RETURN {
title: doc.title,
runtime: doc.runtime
}
title | runtime |
---|---|
Katedra | 6 |
The Spirit of Christmas | 4 |
The Spirit of Christmas | 5 |
Hell’s Bells | 6 |
… | … |
The standard comparison operators are still necessary for other kinds of range queries, such as for searching for values below and above a range. It is possible to express this with IN_RANGE()
and a negation (NOT
), but it is less efficient than using two comparisons and the negation makes it include documents that do not have a runtime attribute (implicitly null
).
Match movies with a runtime of 5
minutes or less, as well as 500
minutes or more, but not with a runtime of 0
minutes. Sort the matches by runtime in ascending order:
FOR doc IN imdb
SEARCH (doc.runtime <= 5 OR doc.runtime >= 500) AND doc.runtime != 0
SORT doc.runtime
RETURN {
title: doc.title,
runtime: doc.runtime
}
title | runtime |
---|---|
La Sortie des usines Lumière | 1 |
L’Arrivée d’un train à la Ciotat | 1 |
… | … |
Zippeldy & Fetterig | 5 |
Coda | 5 |
WWE: Greatest Wrestling Stars of the ’90s | 540 |
Shogun | 547 |
Into the West | 552 |
North and South | 561 |
Planet of the Apes Series | 605 |
The search expression could alternative be written in the following ways with the same result:
doc.runtime > 0 AND doc.runtime <= 5 OR doc.runtime >= 500
IN_RANGE(doc.runtime, 0, 5, false, true) OR doc.runtime >= 500
NOT(IN_RANGE(doc.runtime, 5, 500, false, false)) AND doc.runtime > 0
Comparing Strings
Comparison operators and the IN_RANGE()
function can also be used with strings. Beware that the results can be unexpected in some cases, because the internal sort order of ArangoSearch is different to how FILTER
, SORT
and other operations work.
The alphabetical order of characters is not taken into account by ArangoSearch, i.e. range queries in SEARCH operations against Views will not follow the language rules as per the defined Analyzer locale nor the server language (startup option --default-language
)! Also see Known Issues.
View definition:
{
"links": {
"imdb_vertices": {
"fields": {
"name": {
"analyzers": [ "identity" ]
}
}
}
}
}
AQL queries
Match movies where the name is >= Wu
and < Y
:
FOR doc IN imdb
SEARCH ANALYZER(IN_RANGE(doc.name, "Wu", "Y", true, false), "identity")
RETURN doc.name
Result |
---|
… |
Wu Zun |
Xiong Xin-Xin |
Wyatt |
Xander Berkeley |
Xzibit |
Xian Gao |
Wylie Watson |
Wyclef Jean |
… |