Model Tree Structures with Materialized Paths
Overview
This page describes a data model that describes a tree-likestructure in MongoDB documents by storing full relationship pathsbetween documents.
Pattern
The Materialized Paths pattern stores each tree node in a document;in addition to the tree node, document stores as a string the id(s) ofthe node’s ancestors or path. Although the Materialized Paths patternrequires additional steps of working with strings and regularexpressions, the pattern also provides more flexibility in working withthe path, such as finding nodes by partial paths.
Consider the following hierarchy of categories:
The following example models the tree using Materialized Paths,storing the path in the field path
; the path string uses the comma,
as a delimiter:
- db.categories.insert( { _id: "Books", path: null } )
- db.categories.insert( { _id: "Programming", path: ",Books," } )
- db.categories.insert( { _id: "Databases", path: ",Books,Programming," } )
- db.categories.insert( { _id: "Languages", path: ",Books,Programming," } )
- db.categories.insert( { _id: "MongoDB", path: ",Books,Programming,Databases," } )
- db.categories.insert( { _id: "dbm", path: ",Books,Programming,Databases," } )
- You can query to retrieve the whole tree, sorting by the field
path
:
- db.categories.find().sort( { path: 1 } )
- You can use regular expressions on the
path
field to find thedescendants ofProgramming
:
- db.categories.find( { path: /,Programming,/ } )
- You can also retrieve the descendants of
Books
where theBooks
is also at the topmost level of the hierarchy:
- db.categories.find( { path: /^,Books,/ } )
- To create an index on the field
path
use the followinginvocation:
- db.categories.createIndex( { path: 1 } )
This index may improve performance depending on the query:
For queries from the root
Books
sub-tree (e.g./^,Books,/
or/^,Books,Programming,/
), an index on thepath
fieldimproves the query performance significantly.For queries of sub-trees where the path from the root is notprovided in the query (e.g.
/,Databases,/
), or similar queriesof sub-trees, where the node might be in the middle of the indexedstring, the query must inspect the entire index.
For these queries an index may provide some performanceimprovement if the index is significantly smaller than theentire collection.