$graphLookup (aggregation)
Changed in version 3.4.
Definition
$graphLookup
- Performs a recursive search on a collection, with options for restricting the search by recursion depth and query filter.
The $graphLookup search process is summarized below:
- Input documents flow into the $graphLookup stage of an aggregation operation.
- $graphLookup targets the search to the collection designated by the from parameter (see below for the full list of search parameters).
- For each input document, the search begins with the value designated by startWith.
- $graphLookup matches the startWith value against the field designated by connectToField in other documents in the from collection.
- For each matching document, $graphLookup takes the value of the connectFromField and checks every document in the from collection for a matching connectToField value. For each match, $graphLookup adds the matching document in the from collection to an array field named by the as parameter.
This step continues recursively until no more matching documents are found, or until the operation reaches a recursion depth specified by the maxDepth parameter. $graphLookup then appends the array field to the input document. $graphLookup returns results after completing its search on all input documents.
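The search process summarized above can be imitated in plain JavaScript to make the recursion concrete. The following is a minimal in-memory sketch, not MongoDB's implementation: the function name graphLookupSketch, the three sample documents, and the simplification that startWith is a plain field name rather than an expression are all assumptions made for illustration.

```javascript
// Illustrative sketch only: imitates the search steps in memory.
// graphLookupSketch is a hypothetical helper, not a MongoDB API.
function graphLookupSketch(inputDoc, fromDocs, opts) {
  const { startWith, connectFromField, connectToField, as } = opts;
  // Treat a missing value as "nothing to follow" and a scalar as a one-element list.
  const toArray = (v) =>
    v === undefined || v === null ? [] : Array.isArray(v) ? v : [v];

  const found = new Map();                      // _id -> matched document
  let frontier = toArray(inputDoc[startWith]);  // simplification: startWith is a field name

  while (frontier.length > 0) {
    const next = [];
    for (const doc of fromDocs) {
      if (found.has(doc._id)) continue;         // each document is added only once
      const matches = toArray(doc[connectToField]).some((v) => frontier.includes(v));
      if (matches) {
        found.set(doc._id, doc);
        next.push(...toArray(doc[connectFromField])); // values to follow at the next depth
      }
    }
    frontier = next;                            // go one level deeper
  }
  // Append the collected documents to a copy of the input document.
  return { ...inputDoc, [as]: [...found.values()] };
}

// Hypothetical sample data: c links to b, and b links to a.
const nodes = [
  { _id: 1, name: "a" },
  { _id: 2, name: "b", parent: "a" },
  { _id: 3, name: "c", parent: "b" }
];

const result = graphLookupSketch(nodes[2], nodes, {
  startWith: "parent",
  connectFromField: "parent",
  connectToField: "name",
  as: "ancestors"
});
console.log(result.ancestors.map((d) => d.name));
```

Because already-collected documents are skipped, the loop in this sketch also terminates on data that contains cycles.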
$graphLookup has the following prototype form:
- {
- $graphLookup: {
- from: <collection>,
- startWith: <expression>,
- connectFromField: <string>,
- connectToField: <string>,
- as: <string>,
- maxDepth: <number>,
- depthField: <string>,
- restrictSearchWithMatch: <document>
- }
- }
$graphLookup takes a document with the following fields:
Field | Description |
---|---|
from | Target collection for the $graphLookup operation to search, recursively matching the connectFromField to the connectToField. The from collection cannot be sharded and must be in the same database as any other collections used in the operation. For information, see Sharded Collections. |
startWith | Expression that specifies the value of the connectFromField with which to start the recursive search. Optionally, startWith may be an array of values, each of which is individually followed through the traversal process. |
connectFromField | Field name whose value $graphLookup uses to recursively match against the connectToField of other documents in the collection. If the value is an array, each element is individually followed through the traversal process. |
connectToField | Field name in other documents against which to match the value of the field specified by the connectFromField parameter. |
as | Name of the array field added to each output document. Contains the documents traversed in the $graphLookup stage to reach the document. Note: Documents returned in the as field are not guaranteed to be in any order. |
maxDepth | Optional. Non-negative integral number specifying the maximum recursion depth. |
depthField | Optional. Name of the field to add to each traversed document in the search path. The value of this field is the recursion depth for the document, represented as a NumberLong. Recursion depth value starts at zero, so the first lookup corresponds to zero depth. |
restrictSearchWithMatch | Optional. A document specifying additional conditions for the recursive search. The syntax is identical to query filter syntax. Note: You cannot use any aggregation expression in this filter. For example, a query document such as { lastName: { $ne: "$lastName" } } will not work in this context to find documents in which the lastName value is different from the lastName value of the input document, because "$lastName" will act as a string literal, not a field path. |
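To see why the restrictSearchWithMatch filter in the note above does not do what it appears to, the sketch below evaluates a $ne condition against a literal operand, which is how the note says the filter treats "$lastName". The helper matchesNe and the sample documents are hypothetical and cover only this one operator.

```javascript
// Hypothetical one-operator matcher: handles only { field: { $ne: operand } }.
// The operand is compared as a literal value, never resolved as a field path.
function matchesNe(doc, field, operand) {
  return doc[field] !== operand;
}

// Hypothetical input document and candidate documents.
const inputDoc = { _id: 1, lastName: "Smith" };
const candidates = [
  { _id: 2, lastName: "Smith" },
  { _id: 3, lastName: "Jones" }
];

// What the filter actually tests: lastName differs from the string "$lastName".
const kept = candidates.filter((d) => matchesNe(d, "lastName", "$lastName"));

// What was intended: lastName differs from the input document's lastName.
const intended = candidates.filter((d) => matchesNe(d, "lastName", inputDoc.lastName));

console.log(kept.map((d) => d._id), intended.map((d) => d._id));
```

In this sketch both candidates pass the literal comparison, including the other "Smith", whereas the intended comparison would keep only the "Jones" document.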
Considerations
Sharded Collections
The collection specified in from cannot be sharded. However, the collection on which you run the aggregate() method can be sharded. That is, in the following:
- db.collection.aggregate([
- { $graphLookup: { from: "fromCollection", ... } }
- ])
- The collection can be sharded.
- The fromCollection cannot be sharded.
To join multiple sharded collections, consider:
- Modifying client applications to perform manual lookups instead of using the $graphLookup aggregation stage.
- If possible, using an embedded data model that removes the need to join collections.
Max Depth
Setting the maxDepth field to 0 is equivalent to a non-recursive $graphLookup search stage.
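As a rough illustration of the difference, the sketch below follows a single-valued reportsTo chain in memory, once with a depth limit of 0 and once without a limit. It is not how MongoDB implements the stage: the helper chainFrom is hypothetical, the data is a subset of the employees documents used in the examples on this page, and the helper assumes unique names and no cycles.

```javascript
// A subset of the employees documents used in the examples on this page.
const staff = [
  { _id: 1, name: "Dev" },
  { _id: 2, name: "Eliot", reportsTo: "Dev" },
  { _id: 3, name: "Ron", reportsTo: "Eliot" },
  { _id: 5, name: "Asya", reportsTo: "Ron" }
];

// Hypothetical helper: follows reportsTo -> name links, performing lookups
// at depths 0..maxDepth. Assumes names are unique and the chain has no cycle.
function chainFrom(startValue, maxDepth = Infinity) {
  const names = [];
  let current = startValue;
  for (let depth = 0; current !== undefined && depth <= maxDepth; depth++) {
    const match = staff.find((e) => e.name === current);
    if (!match) break;             // nothing matches: the search ends
    names.push(match.name);
    current = match.reportsTo;     // value to look up at the next depth
  }
  return names;
}

const direct = chainFrom("Ron", 0); // depth limit 0: only the first lookup runs
const full = chainFrom("Ron");      // no limit: follows the chain to the top
console.log(direct, full);
```

With a limit of 0 only the document matched directly by the start value is collected, which is the non-recursive behavior described above.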
Memory
The $graphLookup stage must stay within the 100 megabyte memory limit. If allowDiskUse: true is specified for the aggregate() operation, the $graphLookup stage ignores the option. If there are other stages in the aggregate() operation, the allowDiskUse: true option is in effect for these other stages.
See aggregation pipeline limitations for more information.
Views and Collation
If performing an aggregation that involves multiple views, such as with $lookup or $graphLookup, the views must have the same collation.
Examples
Within a Single Collection
A collection named employees has the following documents:
- { "_id" : 1, "name" : "Dev" }
- { "_id" : 2, "name" : "Eliot", "reportsTo" : "Dev" }
- { "_id" : 3, "name" : "Ron", "reportsTo" : "Eliot" }
- { "_id" : 4, "name" : "Andrew", "reportsTo" : "Eliot" }
- { "_id" : 5, "name" : "Asya", "reportsTo" : "Ron" }
- { "_id" : 6, "name" : "Dan", "reportsTo" : "Andrew" }
The following $graphLookup operation recursively matches on the reportsTo and name fields in the employees collection, returning the reporting hierarchy for each person:
- db.employees.aggregate( [
- {
- $graphLookup: {
- from: "employees",
- startWith: "$reportsTo",
- connectFromField: "reportsTo",
- connectToField: "name",
- as: "reportingHierarchy"
- }
- }
- ] )
The operation returns the following:
- {
- "_id" : 1,
- "name" : "Dev",
- "reportingHierarchy" : [ ]
- }
- {
- "_id" : 2,
- "name" : "Eliot",
- "reportsTo" : "Dev",
- "reportingHierarchy" : [
- { "_id" : 1, "name" : "Dev" }
- ]
- }
- {
- "_id" : 3,
- "name" : "Ron",
- "reportsTo" : "Eliot",
- "reportingHierarchy" : [
- { "_id" : 1, "name" : "Dev" },
- { "_id" : 2, "name" : "Eliot", "reportsTo" : "Dev" }
- ]
- }
- {
- "_id" : 4,
- "name" : "Andrew",
- "reportsTo" : "Eliot",
- "reportingHierarchy" : [
- { "_id" : 1, "name" : "Dev" },
- { "_id" : 2, "name" : "Eliot", "reportsTo" : "Dev" }
- ]
- }
- {
- "_id" : 5,
- "name" : "Asya",
- "reportsTo" : "Ron",
- "reportingHierarchy" : [
- { "_id" : 1, "name" : "Dev" },
- { "_id" : 2, "name" : "Eliot", "reportsTo" : "Dev" },
- { "_id" : 3, "name" : "Ron", "reportsTo" : "Eliot" }
- ]
- }
- {
- "_id" : 6,
- "name" : "Dan",
- "reportsTo" : "Andrew",
- "reportingHierarchy" : [
- { "_id" : 1, "name" : "Dev" },
- { "_id" : 2, "name" : "Eliot", "reportsTo" : "Dev" },
- { "_id" : 4, "name" : "Andrew", "reportsTo" : "Eliot" }
- ]
- }
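The following table provides a traversal path for the document { "_id" : 5, "name" : "Asya", "reportsTo" : "Ron" }:
Start value | The reportsTo value of the document: "Ron" |
---|---|
Depth 0 | { "_id" : 3, "name" : "Ron", "reportsTo" : "Eliot" } |
Depth 1 | { "_id" : 2, "name" : "Eliot", "reportsTo" : "Dev" } |
Depth 2 | { "_id" : 1, "name" : "Dev" } |
The output generates the hierarchy Asya -> Ron -> Eliot -> Dev.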
Across Multiple Collections
Like $lookup, $graphLookup can access another collection in the same database.
In the following example, a database contains two collections:
- A collection airports with the following documents:
- { "_id" : 0, "airport" : "JFK", "connects" : [ "BOS", "ORD" ] }
- { "_id" : 1, "airport" : "BOS", "connects" : [ "JFK", "PWM" ] }
- { "_id" : 2, "airport" : "ORD", "connects" : [ "JFK" ] }
- { "_id" : 3, "airport" : "PWM", "connects" : [ "BOS", "LHR" ] }
- { "_id" : 4, "airport" : "LHR", "connects" : [ "PWM" ] }
- A collection travelers with the following documents:
- { "_id" : 1, "name" : "Dev", "nearestAirport" : "JFK" }
- { "_id" : 2, "name" : "Eliot", "nearestAirport" : "JFK" }
- { "_id" : 3, "name" : "Jeff", "nearestAirport" : "BOS" }
For each document in the travelers collection, the following aggregation operation looks up the nearestAirport value in the airports collection and recursively matches the connects field to the airport field. The operation specifies a maximum recursion depth of 2.
- db.travelers.aggregate( [
- {
- $graphLookup: {
- from: "airports",
- startWith: "$nearestAirport",
- connectFromField: "connects",
- connectToField: "airport",
- maxDepth: 2,
- depthField: "numConnections",
- as: "destinations"
- }
- }
- ] )
The operation returns the following results:
- {
- "_id" : 1,
- "name" : "Dev",
- "nearestAirport" : "JFK",
- "destinations" : [
- { "_id" : 3,
- "airport" : "PWM",
- "connects" : [ "BOS", "LHR" ],
- "numConnections" : NumberLong(2) },
- { "_id" : 2,
- "airport" : "ORD",
- "connects" : [ "JFK" ],
- "numConnections" : NumberLong(1) },
- { "_id" : 1,
- "airport" : "BOS",
- "connects" : [ "JFK", "PWM" ],
- "numConnections" : NumberLong(1) },
- { "_id" : 0,
- "airport" : "JFK",
- "connects" : [ "BOS", "ORD" ],
- "numConnections" : NumberLong(0) }
- ]
- }
- {
- "_id" : 2,
- "name" : "Eliot",
- "nearestAirport" : "JFK",
- "destinations" : [
- { "_id" : 3,
- "airport" : "PWM",
- "connects" : [ "BOS", "LHR" ],
- "numConnections" : NumberLong(2) },
- { "_id" : 2,
- "airport" : "ORD",
- "connects" : [ "JFK" ],
- "numConnections" : NumberLong(1) },
- { "_id" : 1,
- "airport" : "BOS",
- "connects" : [ "JFK", "PWM" ],
- "numConnections" : NumberLong(1) },
- { "_id" : 0,
- "airport" : "JFK",
- "connects" : [ "BOS", "ORD" ],
- "numConnections" : NumberLong(0) } ]
- }
- {
- "_id" : 3,
- "name" : "Jeff",
- "nearestAirport" : "BOS",
- "destinations" : [
- { "_id" : 2,
- "airport" : "ORD",
- "connects" : [ "JFK" ],
- "numConnections" : NumberLong(2) },
- { "_id" : 3,
- "airport" : "PWM",
- "connects" : [ "BOS", "LHR" ],
- "numConnections" : NumberLong(1) },
- { "_id" : 4,
- "airport" : "LHR",
- "connects" : [ "PWM" ],
- "numConnections" : NumberLong(2) },
- { "_id" : 0,
- "airport" : "JFK",
- "connects" : [ "BOS", "ORD" ],
- "numConnections" : NumberLong(1) },
- { "_id" : 1,
- "airport" : "BOS",
- "connects" : [ "JFK", "PWM" ],
- "numConnections" : NumberLong(0) }
- ]
- }
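The following table provides a traversal path for the recursive search, up to depth 2, where the starting airport is JFK:
Start value | The nearestAirport value from the travelers collection: "JFK" |
---|---|
Depth 0 | { "_id" : 0, "airport" : "JFK", "connects" : [ "BOS", "ORD" ] } |
Depth 1 | { "_id" : 1, "airport" : "BOS", "connects" : [ "JFK", "PWM" ] }, { "_id" : 2, "airport" : "ORD", "connects" : [ "JFK" ] } |
Depth 2 | { "_id" : 3, "airport" : "PWM", "connects" : [ "BOS", "LHR" ] } |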
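The depth values in the results above can be reproduced with a small in-memory sketch that records the depth at which each airport document is first reached. This is an illustration under stated assumptions, not MongoDB's implementation: the helper destinationsFrom is hypothetical, and it stores depths as plain JavaScript numbers rather than NumberLong values.

```javascript
// The airports documents from the example above.
const airports = [
  { _id: 0, airport: "JFK", connects: ["BOS", "ORD"] },
  { _id: 1, airport: "BOS", connects: ["JFK", "PWM"] },
  { _id: 2, airport: "ORD", connects: ["JFK"] },
  { _id: 3, airport: "PWM", connects: ["BOS", "LHR"] },
  { _id: 4, airport: "LHR", connects: ["PWM"] }
];

// Hypothetical helper imitating maxDepth and depthField in memory.
function destinationsFrom(startAirport, maxDepth) {
  const depthById = new Map();   // _id -> depth at which the document was first reached
  let frontier = [startAirport]; // airport codes to match at the current depth

  for (let depth = 0; depth <= maxDepth && frontier.length > 0; depth++) {
    const next = [];
    for (const doc of airports) {
      if (!depthById.has(doc._id) && frontier.includes(doc.airport)) {
        depthById.set(doc._id, depth);
        next.push(...doc.connects); // each array element is followed individually
      }
    }
    frontier = next;
  }

  // Attach the recorded depth under the field name used in the example.
  return airports
    .filter((doc) => depthById.has(doc._id))
    .map((doc) => ({ ...doc, numConnections: depthById.get(doc._id) }));
}

const fromJFK = destinationsFrom("JFK", 2);
const fromBOS = destinationsFrom("BOS", 2);
console.log(fromJFK.map((d) => `${d.airport}:${d.numConnections}`));
console.log(fromBOS.map((d) => `${d.airport}:${d.numConnections}`));
```

Starting from JFK, LHR is not collected because it would first be reached at depth 3, which exceeds the limit of 2. Starting from BOS, it is reached at depth 2 and is included.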
With a Query Filter
The following example uses a collection with a set of documents containing names of people along with arrays of their friends and their hobbies. An aggregation operation finds one particular person and traverses her network of connections to find people who list golf among their hobbies.
A collection named people contains the following documents:
- {
- "_id" : 1,
- "name" : "Tanya Jordan",
- "friends" : [ "Shirley Soto", "Terry Hawkins", "Carole Hale" ],
- "hobbies" : [ "tennis", "unicycling", "golf" ]
- }
- {
- "_id" : 2,
- "name" : "Carole Hale",
- "friends" : [ "Joseph Dennis", "Tanya Jordan", "Terry Hawkins" ],
- "hobbies" : [ "archery", "golf", "woodworking" ]
- }
- {
- "_id" : 3,
- "name" : "Terry Hawkins",
- "friends" : [ "Tanya Jordan", "Carole Hale", "Angelo Ward" ],
- "hobbies" : [ "knitting", "frisbee" ]
- }
- {
- "_id" : 4,
- "name" : "Joseph Dennis",
- "friends" : [ "Angelo Ward", "Carole Hale" ],
- "hobbies" : [ "tennis", "golf", "topiary" ]
- }
- {
- "_id" : 5,
- "name" : "Angelo Ward",
- "friends" : [ "Terry Hawkins", "Shirley Soto", "Joseph Dennis" ],
- "hobbies" : [ "travel", "ceramics", "golf" ]
- }
- {
- "_id" : 6,
- "name" : "Shirley Soto",
- "friends" : [ "Angelo Ward", "Tanya Jordan", "Carole Hale" ],
- "hobbies" : [ "frisbee", "set theory" ]
- }
The following aggregation operation uses three stages:
- $match matches on documents with a name field containing the string "Tanya Jordan". Returns one output document.
- $graphLookup connects the output document's friends field with the name field of other documents in the collection to traverse Tanya Jordan's network of connections. This stage uses the restrictSearchWithMatch parameter to find only documents in which the hobbies array contains golf. Returns one output document.
- $project shapes the output document. The names listed in connections who play golf are taken from the name field of the documents listed in the input document's golfers array.
- db.people.aggregate( [
- { $match: { "name": "Tanya Jordan" } },
- { $graphLookup: {
- from: "people",
- startWith: "$friends",
- connectFromField: "friends",
- connectToField: "name",
- as: "golfers",
- restrictSearchWithMatch: { "hobbies" : "golf" }
- }
- },
- { $project: {
- "name": 1,
- "friends": 1,
- "connections who play golf": "$golfers.name"
- }
- }
- ] )
The operation returns the following document:
- {
- "_id" : 1,
- "name" : "Tanya Jordan",
- "friends" : [
- "Shirley Soto",
- "Terry Hawkins",
- "Carole Hale"
- ],
- "connections who play golf" : [
- "Joseph Dennis",
- "Tanya Jordan",
- "Angelo Ward",
- "Carole Hale"
- ]
- }
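The result above can be checked with an in-memory sketch in which the golf condition is applied to every document considered during the traversal. The helper golfersReachableFrom is hypothetical, and the sketch assumes that documents failing the filter are neither collected nor followed, which is consistent with the output shown but is not a statement about MongoDB's internals.

```javascript
// The people documents from the example above.
const people = [
  { _id: 1, name: "Tanya Jordan", friends: ["Shirley Soto", "Terry Hawkins", "Carole Hale"], hobbies: ["tennis", "unicycling", "golf"] },
  { _id: 2, name: "Carole Hale", friends: ["Joseph Dennis", "Tanya Jordan", "Terry Hawkins"], hobbies: ["archery", "golf", "woodworking"] },
  { _id: 3, name: "Terry Hawkins", friends: ["Tanya Jordan", "Carole Hale", "Angelo Ward"], hobbies: ["knitting", "frisbee"] },
  { _id: 4, name: "Joseph Dennis", friends: ["Angelo Ward", "Carole Hale"], hobbies: ["tennis", "golf", "topiary"] },
  { _id: 5, name: "Angelo Ward", friends: ["Terry Hawkins", "Shirley Soto", "Joseph Dennis"], hobbies: ["travel", "ceramics", "golf"] },
  { _id: 6, name: "Shirley Soto", friends: ["Angelo Ward", "Tanya Jordan", "Carole Hale"], hobbies: ["frisbee", "set theory"] }
];

// Hypothetical helper: traverses friends -> name links, keeping only documents
// whose hobbies array contains "golf".
function golfersReachableFrom(person) {
  const found = new Map();        // _id -> matched document
  let frontier = person.friends;  // names to match at the current depth

  while (frontier.length > 0) {
    const next = [];
    for (const doc of people) {
      const passesFilter = doc.hobbies.includes("golf");
      if (!found.has(doc._id) && passesFilter && frontier.includes(doc.name)) {
        found.set(doc._id, doc);
        next.push(...doc.friends); // only filtered-in documents are followed
      }
    }
    frontier = next;
  }
  return [...found.values()].map((d) => d.name);
}

const golfers = golfersReachableFrom(people[0]);
console.log(golfers);
```

Tanya Jordan appears in her own list because she is reachable through Carole Hale and passes the filter herself. The order of the names may differ from the output above, since documents in the as field are not guaranteed to be in any order.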