Aggregation

Aggregation operations process data records and return computed results. Aggregation operations group values from multiple documents together, and can perform a variety of operations on the grouped data to return a single result. MongoDB provides three ways to perform aggregation: the aggregation pipeline, the map-reduce function, and single purpose aggregation methods.

Aggregation Pipeline

MongoDB’s aggregation framework is modeled on the concept of data processing pipelines. Documents enter a multi-stage pipeline that transforms the documents into an aggregated result.

The most basic pipeline stages provide filters that operate like queries and document transformations that modify the form of the output document.

Other pipeline operations provide tools for grouping and sorting documents by a specific field or fields, as well as tools for aggregating the contents of arrays, including arrays of documents. In addition, pipeline stages can use operators for tasks such as calculating an average or concatenating strings.
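The flow of documents through filter, grouping, and sorting stages can be sketched in plain Python. This is only an illustration of the pipeline idea, not MongoDB's implementation or API: the sample documents, the field names (`cust_id`, `amount`, `status`), and the helper functions are all hypothetical. In MongoDB itself the filter and grouping steps would be expressed as `$match` and `$group` stages.

```python
from collections import defaultdict

# Hypothetical sample documents; the field names are illustrative only.
orders = [
    {"cust_id": "A123", "amount": 500, "status": "A"},
    {"cust_id": "A123", "amount": 250, "status": "A"},
    {"cust_id": "B212", "amount": 200, "status": "A"},
    {"cust_id": "A123", "amount": 300, "status": "D"},
]

def match_stage(docs, predicate):
    """Filter stage: pass through only documents that satisfy the predicate."""
    return (doc for doc in docs if predicate(doc))

def group_sum_stage(docs, key_field, value_field):
    """Grouping stage: sum value_field for each distinct key_field value."""
    totals = defaultdict(int)
    for doc in docs:
        totals[doc[key_field]] += doc[value_field]
    return ({"_id": key, "total": total} for key, total in totals.items())

def sort_stage(docs, field):
    """Sorting stage: order documents by the given field."""
    return sorted(docs, key=lambda doc: doc[field])

def run_pipeline(docs, stages):
    """Feed the output of each stage into the next one."""
    for stage in stages:
        docs = stage(docs)
    return list(docs)

pipeline_result = run_pipeline(orders, [
    lambda docs: match_stage(docs, lambda doc: doc["status"] == "A"),
    lambda docs: group_sum_stage(docs, "cust_id", "amount"),
    lambda docs: sort_stage(docs, "_id"),
])
print(pipeline_result)
# [{'_id': 'A123', 'total': 750}, {'_id': 'B212', 'total': 200}]
```

Each stage only sees the output of the previous one, which is why placing the filter first reduces the work done by the later stages.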

The pipeline provides efficient data aggregation using native operations within MongoDB, and is the preferred method for data aggregation in MongoDB.

The aggregation pipeline can operate on a sharded collection.

The aggregation pipeline can use indexes to improve its performance during some of its stages. In addition, the aggregation pipeline has an internal optimization phase. See Pipeline Operators and Indexes and Aggregation Pipeline Optimization for details.

Aggregation - Figure 1

Map-Reduce

MongoDB also provides map-reduce operations to perform aggregation. In general, map-reduce operations have two phases: a map stage that processes each document and emits one or more objects for each input document, and a reduce phase that combines the output of the map operation. Optionally, map-reduce can have a finalize stage to make final modifications to the result. Like other aggregation operations, map-reduce can specify a query condition to select the input documents as well as sort and limit the results.

Map-reduce uses custom JavaScript functions to perform the map and reduce operations, as well as the optional finalize operation. While the custom JavaScript provides great flexibility compared to the aggregation pipeline, in general, map-reduce is less efficient and more complex than the aggregation pipeline.
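The map, reduce, and optional finalize phases can be sketched as follows. This is a plain-Python emulation of the general map-reduce idea, assuming hypothetical sample documents and helper names; MongoDB's own map-reduce takes JavaScript functions and differs in detail, so treat this as a conceptual sketch rather than the server's behavior.

```python
from collections import defaultdict

# Hypothetical sample documents; the field names are illustrative only.
orders = [
    {"cust_id": "A123", "amount": 500, "status": "A"},
    {"cust_id": "A123", "amount": 250, "status": "A"},
    {"cust_id": "B212", "amount": 200, "status": "A"},
    {"cust_id": "A123", "amount": 300, "status": "D"},
]

def map_order(doc):
    """Map: emit one (key, value) pair for each input document."""
    yield doc["cust_id"], doc["amount"]

def reduce_amounts(key, values):
    """Reduce: combine all values emitted for one key."""
    return sum(values)

def finalize_total(key, reduced):
    """Finalize: make a last modification to each reduced value."""
    return {"total": reduced}

def map_reduce(docs, map_fn, reduce_fn, finalize_fn=None, query=None):
    """Run query -> map -> reduce -> optional finalize over docs."""
    emitted = defaultdict(list)
    for doc in docs:
        if query is not None and not query(doc):
            continue  # the query condition selects the input documents
        for key, value in map_fn(doc):
            emitted[key].append(value)
    results = {}
    for key, values in emitted.items():
        reduced = reduce_fn(key, values)
        results[key] = finalize_fn(key, reduced) if finalize_fn else reduced
    return results

mr_result = map_reduce(
    orders,
    map_order,
    reduce_amounts,
    finalize_fn=finalize_total,
    query=lambda doc: doc["status"] == "A",
)
print(mr_result)
# {'A123': {'total': 750}, 'B212': {'total': 200}}
```

Because the map and reduce steps are arbitrary functions, this style is more flexible than fixed pipeline stages, at the cost of more code to write and reason about.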

Map-reduce can operate on a sharded collection. Map-reduce operations can also output to a sharded collection. See Aggregation Pipeline and Sharded Collections and Map-Reduce and Sharded Collections for details.

NOTE

Starting in MongoDB 2.4, certain mongo shell functions and properties are inaccessible in map-reduce operations. MongoDB 2.4 also provides support for multiple JavaScript operations to run at the same time. Before MongoDB 2.4, JavaScript code executed in a single thread, raising concurrency issues for map-reduce.

Aggregation - Figure 2

Single Purpose Aggregation Operations

MongoDB also provides db.collection.count() and db.collection.distinct().

All of these operations aggregate documents from a single collection. While these operations provide simple access to common aggregation processes, they lack the flexibility and capabilities of the aggregation pipeline and map-reduce.
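What these two operations compute can be illustrated with a short plain-Python sketch. The `count` and `distinct` functions below are hypothetical stand-ins written for this example, operating on an in-memory list; they are not the `db.collection.count()` and `db.collection.distinct()` shell methods and do not reproduce their options.

```python
# Hypothetical sample documents; the field names are illustrative only.
orders = [
    {"cust_id": "A123", "amount": 500, "status": "A"},
    {"cust_id": "A123", "amount": 250, "status": "A"},
    {"cust_id": "B212", "amount": 200, "status": "A"},
    {"cust_id": "A123", "amount": 300, "status": "D"},
]

def count(docs, predicate=None):
    """Count the documents, optionally only those matching a predicate."""
    return sum(1 for doc in docs if predicate is None or predicate(doc))

def distinct(docs, field):
    """Return the distinct values of one field, in first-seen order."""
    seen = []
    for doc in docs:
        if field in doc and doc[field] not in seen:
            seen.append(doc[field])
    return seen

total_orders = count(orders)
active_orders = count(orders, lambda doc: doc["status"] == "A")
customers = distinct(orders, "cust_id")
print(total_orders, active_orders, customers)
# 4 3 ['A123', 'B212']
```

Each function answers exactly one question about a single collection, which is what makes these operations simple to use but less flexible than a pipeline.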

Aggregation - Figure 3

Additional Features and Behaviors

For a feature comparison of the aggregation pipeline, map-reduce, and the special group functionality, see Aggregation Commands Comparison.

Additional Resources