$lookup (aggregation)
Definition
New in version 3.2.
Performs a left outer join to an unsharded collection in the _same_database to filter in documents from the “joined” collection forprocessing. To each input document, the $lookup
stageadds a new array field whose elements are the matching documentsfrom the “joined” collection. The $lookup
stage passesthese reshaped documents to the next stage.
Syntax
The $lookup
stage has the following syntaxes:
Equality Match
To perform an equality match between a field from the input documentswith a field from the documents of the “joined” collection, the$lookup
stage has the following syntax:
- {
- $lookup:
- {
- from: <collection to join>,
- localField: <field from the input documents>,
- foreignField: <field from the documents of the "from" collection>,
- as: <output array field>
- }
- }
The $lookup
takes a document with the following fields:
Field | Description |
---|---|
from | Specifies the collection in the same database to performthe join with. The from collection cannot be sharded. Fordetails, see Sharded Collection Restrictions. |
localField | Specifies the field from the documents input to the$lookup stage. $lookup performs anequality match on the localField to theforeignField from the documents of the from collection. If an input document does not contain thelocalField , the $lookup treats thefield as having a value of null for matching purposes. |
foreignField | Specifies the field from the documents in the from collection. $lookup performs an equality match onthe foreignField to the localField from the inputdocuments. If a document in the from collection does notcontain the foreignField , the $lookup treatsthe value as null for matching purposes. |
as | Specifies the name of the new array field to add to the inputdocuments. The new array field contains the matchingdocuments from the from collection. If the specified namealready exists in the input document, the existing field isoverwritten. |
The operation would correspond to the following pseudo-SQL statement:
- SELECT *, <output array field>
- FROM collection
- WHERE <output array field> IN (SELECT *
- FROM <collection to join>
- WHERE <foreignField>= <collection.localField>);
See the following examples:
Join Conditions and Uncorrelated Sub-queries
New in version 3.6.
To perform uncorrelated subqueries between two collections as well asallow other join conditions besides a single equality match, the$lookup
stage has the following syntax:
- {
- $lookup:
- {
- from: <collection to join>,
- let: { <var_1>: <expression>, …, <var_n>: <expression> },
- pipeline: [ <pipeline to execute on the collection to join> ],
- as: <output array field>
- }
- }
The $lookup
takes a document with the following fields:
Field | Description |
---|---|
from | Specifies the collection in the same database to perform thejoin with. The from collection cannot be sharded. Fordetails, see Sharded Collection Restrictions. |
let | Optional. Specifies variables to use in the pipeline field stages. Usethe variable expressions to access the fields from the documentsinput to the $lookup stage.The pipeline cannot directly access the input documentfields. Instead, first define the variables for the inputdocument fields, and then reference the variables in the stagesin the pipeline .To access the let variables in the pipeline , use the$expr operator.NoteThe let variables are accessible by the stages in thepipeline , including additional $lookup stagesnested in the pipeline . |
pipeline | Specifies the pipeline to run on the joined collection. Thepipeline determines the resulting documents from the joinedcollection. To return all documents, specify an empty pipeline[] .The pipeline cannot include the $out stage orthe $merge stage.The pipeline cannot directly access the input documentfields. Instead, first define the variables for the inputdocument fields, and then reference the variables in the stagesin the pipeline .To access the let variables in the pipeline , use the$expr operator.NoteThe let variables are accessible by the stages in thepipeline , including additional $lookup stagesnested in the pipeline . |
as | Specifies the name of the new array field to add to the inputdocuments. The new array field contains the matchingdocuments from the from collection. If the specified namealready exists in the input document, the existing field isoverwritten. |
The operation would correspond to the following pseudo-SQL statement:
- SELECT *, <output array field>
- FROM collection
- WHERE <output array field> IN (SELECT <documents as determined from the pipeline>
- FROM <collection to join>
- WHERE <pipeline> );
See the following examples:
Consideration
Views and Collation
If performing an aggregation that involves multiple views, such aswith $lookup
or $graphLookup
, the views musthave the same collation.
Restrictions
Changed in version 4.2: You cannot include the
$out
or the$merge
stage in the$lookup
stage. That is, when specifying apipeline for the joined collection, you cannot include either stage inthepipeline
field.
- {
- $lookup:
- {
- from: <collection to join>,
- let: { <var_1>: <expression>, …, <var_n>: <expression> },
- pipeline: [ <pipeline to execute on the joined collection> ], // Cannot include $out or $merge
- as: <output array field>
- }
- }
Sharded Collection Restrictions
In the $lookup
stage, the from
collection cannot besharded. However, the collection on which you run theaggregate()
method can be sharded. That is, inthe following:
- db.collection.aggregate([
- { $lookup: { from: "fromCollection", ... } }
- ])
- The
collection
can be sharded. - The
fromCollection
cannot be sharded.
As such, to join a sharded collection with an unsharded collection, youcan run the aggregation on the sharded collection and lookup theunsharded collection; e.g.:
- db.shardedCollection.aggregate([
- { $lookup: { from: "unshardedCollection", ... } }
- ])
Alternatively, or to join multiple sharded collections, consider:
- Modifying client applications to perform manual lookups instead ofusing the
$lookup
aggregation stage. - If possible, using an embedded data model that removes the need to join collections.
Examples
Perform a Single Equality Join with $lookup
Create a collection orders
with the following documents:
- db.orders.insert([
- { "_id" : 1, "item" : "almonds", "price" : 12, "quantity" : 2 },
- { "_id" : 2, "item" : "pecans", "price" : 20, "quantity" : 1 },
- { "_id" : 3 }
- ])
Create another collection inventory
with the following documents:
- db.inventory.insert([
- { "_id" : 1, "sku" : "almonds", description: "product 1", "instock" : 120 },
- { "_id" : 2, "sku" : "bread", description: "product 2", "instock" : 80 },
- { "_id" : 3, "sku" : "cashews", description: "product 3", "instock" : 60 },
- { "_id" : 4, "sku" : "pecans", description: "product 4", "instock" : 70 },
- { "_id" : 5, "sku": null, description: "Incomplete" },
- { "_id" : 6 }
- ])
The following aggregation operation on the orders
collectionjoins the documents from orders
with the documents from theinventory
collection using the fields item
from theorders
collection and the sku
field from the inventory
collection:
- db.orders.aggregate([
- {
- $lookup:
- {
- from: "inventory",
- localField: "item",
- foreignField: "sku",
- as: "inventory_docs"
- }
- }
- ])
The operation returns the following documents:
- {
- "_id" : 1,
- "item" : "almonds",
- "price" : 12,
- "quantity" : 2,
- "inventory_docs" : [
- { "_id" : 1, "sku" : "almonds", "description" : "product 1", "instock" : 120 }
- ]
- }
- {
- "_id" : 2,
- "item" : "pecans",
- "price" : 20,
- "quantity" : 1,
- "inventory_docs" : [
- { "_id" : 4, "sku" : "pecans", "description" : "product 4", "instock" : 70 }
- ]
- }
- {
- "_id" : 3,
- "inventory_docs" : [
- { "_id" : 5, "sku" : null, "description" : "Incomplete" },
- { "_id" : 6 }
- ]
- }
The operation would correspond to the following pseudo-SQL statement:
- SELECT *, inventory_docs
- FROM orders
- WHERE inventory_docs IN (SELECT *
- FROM inventory
- WHERE sku= orders.item);
Use $lookup with an Array
Starting MongoDB 3.4, if the localField
is an array, you can matchthe array elements against a scalar foreignField
without needing an$unwind
stage.
For example, create an example collection classes
with thefollowing document:
- db.classes.insert( [
- { _id: 1, title: "Reading is ...", enrollmentlist: [ "giraffe2", "pandabear", "artie" ], days: ["M", "W", "F"] },
- { _id: 2, title: "But Writing ...", enrollmentlist: [ "giraffe1", "artie" ], days: ["T", "F"] }
- ])
Create another collection members
with the following documents:
- db.members.insert( [
- { _id: 1, name: "artie", joined: new Date("2016-05-01"), status: "A" },
- { _id: 2, name: "giraffe", joined: new Date("2017-05-01"), status: "D" },
- { _id: 3, name: "giraffe1", joined: new Date("2017-10-01"), status: "A" },
- { _id: 4, name: "panda", joined: new Date("2018-10-11"), status: "A" },
- { _id: 5, name: "pandabear", joined: new Date("2018-12-01"), status: "A" },
- { _id: 6, name: "giraffe2", joined: new Date("2018-12-01"), status: "D" }
- ])
The following aggregation operation joins documents in the classes
collection with the members
collection, matching on the members
field to the name
field:
- db.classes.aggregate([
- {
- $lookup:
- {
- from: "members",
- localField: "enrollmentlist",
- foreignField: "name",
- as: "enrollee_info"
- }
- }
- ])
The operation returns the following:
- {
- "_id" : 1,
- "title" : "Reading is ...",
- "enrollmentlist" : [ "giraffe2", "pandabear", "artie" ],
- "days" : [ "M", "W", "F" ],
- "enrollee_info" : [
- { "_id" : 1, "name" : "artie", "joined" : ISODate("2016-05-01T00:00:00Z"), "status" : "A" },
- { "_id" : 5, "name" : "pandabear", "joined" : ISODate("2018-12-01T00:00:00Z"), "status" : "A" },
- { "_id" : 6, "name" : "giraffe2", "joined" : ISODate("2018-12-01T00:00:00Z"), "status" : "D" }
- ]
- }
- {
- "_id" : 2,
- "title" : "But Writing ...",
- "enrollmentlist" : [ "giraffe1", "artie" ],
- "days" : [ "T", "F" ],
- "enrollee_info" : [
- { "_id" : 1, "name" : "artie", "joined" : ISODate("2016-05-01T00:00:00Z"), "status" : "A" },
- { "_id" : 3, "name" : "giraffe1", "joined" : ISODate("2017-10-01T00:00:00Z"), "status" : "A" }
- ]
- }
Use $lookup with $mergeObjects
Changed in version 3.6: MongoDB 3.6 adds the $mergeObjects
operator to combinemultiple documents into a single document
Create a collection orders
with the following documents:
- db.orders.insert([
- { "_id" : 1, "item" : "almonds", "price" : 12, "quantity" : 2 },
- { "_id" : 2, "item" : "pecans", "price" : 20, "quantity" : 1 }
- ])
Create another collection items
with the following documents:
- db.items.insert([
- { "_id" : 1, "item" : "almonds", description: "almond clusters", "instock" : 120 },
- { "_id" : 2, "item" : "bread", description: "raisin and nut bread", "instock" : 80 },
- { "_id" : 3, "item" : "pecans", description: "candied pecans", "instock" : 60 }
- ])
The following operation first uses the $lookup
stage tojoin the two collections by the item
fields and then uses$mergeObjects
in the $replaceRoot
to mergethe joined documents from items
and orders
:
- db.orders.aggregate([
- {
- $lookup: {
- from: "items",
- localField: "item", // field in the orders collection
- foreignField: "item", // field in the items collection
- as: "fromItems"
- }
- },
- {
- $replaceRoot: { newRoot: { $mergeObjects: [ { $arrayElemAt: [ "$fromItems", 0 ] }, "$$ROOT" ] } }
- },
- { $project: { fromItems: 0 } }
- ])
The operation returns the following documents:
- { "_id" : 1, "item" : "almonds", "description" : "almond clusters", "instock" : 120, "price" : 12, "quantity" : 2 }
- { "_id" : 2, "item" : "pecans", "description" : "candied pecans", "instock" : 60, "price" : 20, "quantity" : 1 }
Specify Multiple Join Conditions with $lookup
Changed in version 3.6: MongoDB 3.6 adds support for executing a pipeline on the joinedcollection, which allows for specifying multiple join conditions aswell as uncorrelated sub-queries.
Create a collection orders
with the following documents:
- db.orders.insert([
- { "_id" : 1, "item" : "almonds", "price" : 12, "ordered" : 2 },
- { "_id" : 2, "item" : "pecans", "price" : 20, "ordered" : 1 },
- { "_id" : 3, "item" : "cookies", "price" : 10, "ordered" : 60 }
- ])
Create another collection warehouses
with the following documents:
- db.warehouses.insert([
- { "_id" : 1, "stock_item" : "almonds", warehouse: "A", "instock" : 120 },
- { "_id" : 2, "stock_item" : "pecans", warehouse: "A", "instock" : 80 },
- { "_id" : 3, "stock_item" : "almonds", warehouse: "B", "instock" : 60 },
- { "_id" : 4, "stock_item" : "cookies", warehouse: "B", "instock" : 40 },
- { "_id" : 5, "stock_item" : "cookies", warehouse: "A", "instock" : 80 }
- ])
The following operation joins the orders
collection with thewarehouse
collection by the item and whether the quantity in stockis sufficient to cover the ordered quantity:
- db.orders.aggregate([
- {
- $lookup:
- {
- from: "warehouses",
- let: { order_item: "$item", order_qty: "$ordered" },
- pipeline: [
- { $match:
- { $expr:
- { $and:
- [
- { $eq: [ "$stock_item", "$$order_item" ] },
- { $gte: [ "$instock", "$$order_qty" ] }
- ]
- }
- }
- },
- { $project: { stock_item: 0, _id: 0 } }
- ],
- as: "stockdata"
- }
- }
- ])
The operation returns the following documents:
- { "_id" : 1, "item" : "almonds", "price" : 12, "ordered" : 2,
- "stockdata" : [ { "warehouse" : "A", "instock" : 120 }, { "warehouse" : "B", "instock" : 60 } ] }
- { "_id" : 2, "item" : "pecans", "price" : 20, "ordered" : 1,
- "stockdata" : [ { "warehouse" : "A", "instock" : 80 } ] }
- { "_id" : 3, "item" : "cookies", "price" : 10, "ordered" : 60,
- "stockdata" : [ { "warehouse" : "A", "instock" : 80 } ] }
The operation would correspond to the following pseudo-SQL statement:
- SELECT *, stockdata
- FROM orders
- WHERE stockdata IN (SELECT warehouse, instock
- FROM warehouses
- WHERE stock_item= orders.item
- AND instock >= orders.ordered );
See also
Uncorrelated Subquery
Changed in version 3.6: MongoDB 3.6 adds support for executing a pipeline on the joinedcollection, which allows for specifying multiple join conditions aswell as uncorrelated sub-queries.
Create a collection absences
with the following documents:
- db.absences.insert([
- { "_id" : 1, "student" : "Ann Aardvark", sickdays: [ new Date ("2018-05-01"),new Date ("2018-08-23") ] },
- { "_id" : 2, "student" : "Zoe Zebra", sickdays: [ new Date ("2018-02-01"),new Date ("2018-05-23") ] },
- ])
Create another collection holidays
with the following documents:
- db.holidays.insert([
- { "_id" : 1, year: 2018, name: "New Years", date: new Date("2018-01-01") },
- { "_id" : 2, year: 2018, name: "Pi Day", date: new Date("2018-03-14") },
- { "_id" : 3, year: 2018, name: "Ice Cream Day", date: new Date("2018-07-15") },
- { "_id" : 4, year: 2017, name: "New Years", date: new Date("2017-01-01") },
- { "_id" : 5, year: 2017, name: "Ice Cream Day", date: new Date("2017-07-16") }
- ])
The following operation joins the absences
collection with 2018holiday information from the holidays
collection:
- db.absences.aggregate([
- {
- $lookup:
- {
- from: "holidays",
- pipeline: [
- { $match: { year: 2018 } },
- { $project: { _id: 0, date: { name: "$name", date: "$date" } } },
- { $replaceRoot: { newRoot: "$date" } }
- ],
- as: "holidays"
- }
- }
- ])
The operation returns the following:
- { "_id" : 1, "student" : "Ann Aardvark", "sickdays" : [ ISODate("2018-05-01T00:00:00Z"), ISODate("2018-08-23T00:00:00Z") ],
- "holidays" : [ { "name" : "New Years", "date" : ISODate("2018-01-01T00:00:00Z") }, { "name" : "Pi Day", "date" : ISODate("2018-03-14T00:00:00Z") }, { "name" : "Ice Cream Day", "date" : ISODate("2018-07-15T00:00:00Z") } ] }
- { "_id" : 2, "student" : "Zoe Zebra", "sickdays" : [ ISODate("2018-02-01T00:00:00Z"), ISODate("2018-05-23T00:00:00Z") ],
- "holidays" : [ { "name" : "New Years", "date" : ISODate("2018-01-01T00:00:00Z") }, { "name" : "Pi Day", "date" : ISODate("2018-03-14T00:00:00Z") }, { "name" : "Ice Cream Day", "date" : ISODate("2018-07-15T00:00:00Z") } ] }
The operation would correspond to the following pseudo-SQL statement:
- SELECT *, holidays
- FROM absences
- WHERE holidays IN (SELECT name, date
- FROM holidays
- WHERE year = 2018);