MongoDB Wire Protocol
Introduction
The MongoDB Wire Protocol is a simple socket-based, request-response style protocol. Clients communicate with the database server through a regular TCP/IP socket.
TCP/IP Socket
Clients should connect to the database with a regular TCP/IP socket. There is no connection handshake.
Port
The default port number for mongod and mongos instances is 27017. The port number for mongod and mongos is configurable and may vary.
Byte Ordering
All integers in the MongoDB wire protocol use little-endian byte order: that is, least-significant byte first.
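As a small illustration of the byte order described above, the following Python sketch packs one 32-bit integer with the standard library's struct module. The value 27017 is used only as a convenient sample number; nothing here is specific to any driver.

```python
import struct

# "<" selects little-endian byte order with no padding;
# "i" is a signed 32-bit integer (the int32 type used on this page).
encoded = struct.pack("<i", 27017)

# 27017 is 0x00006989, so the least-significant byte 0x89 is written first.
print(encoded.hex())

# Decoding reverses the process.
(decoded,) = struct.unpack("<i", encoded)
```

The same "<" prefix applies to int64 fields (format character "q").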
Message Types and Formats
There are two types of messages: client requests and database responses.
Note
- This page uses a C-like struct to describe the message structure.
- The types used in this document (cstring, int32, etc.) are the same as those defined in the BSON specification.
- To denote repetition, the document uses the asterisk notation from the BSON specification. For example, int64* indicates that one or more of the specified type can be written to the socket, one after another.
- The standard message header is typed as MsgHeader. Integer constants are in capitals (e.g. ZERO for the integer value of 0).
Standard Message Header
In general, each message consists of a standard message header followed by request-specific data. The standard message header is structured as follows:
    struct MsgHeader {
        int32 messageLength;  // total message size, including this
        int32 requestID;      // identifier for this message
        int32 responseTo;     // requestID from the original request
                              //   (used in responses from db)
        int32 opCode;         // request type - see table below for details
    }
Field | Description |
---|---|
messageLength | The total size of the message in bytes. This total includes the 4 bytes that hold the message length. |
requestID | A client or database-generated identifier that uniquely identifies this message. For the case of client-generated messages (e.g. OP_QUERY and OP_GET_MORE), it will be returned in the responseTo field of the OP_REPLY message. Clients can use the requestID and the responseTo fields to associate query responses with the originating query. |
responseTo | In the case of a message from the database, this will be the requestID taken from the OP_QUERY or OP_GET_MORE messages from the client. Clients can use the requestID and the responseTo fields to associate query responses with the originating query. |
opCode | Type of message. See Request Opcodes for details. |
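The header layout above can be sketched in Python as four little-endian int32 values. The names MsgHeader, pack_header and unpack_header are illustrative helpers invented for this example, not part of any driver API.

```python
import struct
from typing import NamedTuple

# Four little-endian int32 fields: 16 bytes in total.
HEADER = struct.Struct("<iiii")


class MsgHeader(NamedTuple):
    message_length: int
    request_id: int
    response_to: int
    op_code: int


def pack_header(body_length, request_id, response_to, op_code):
    """Build a header for a message whose body is body_length bytes long.

    messageLength counts the whole message, including the header itself.
    """
    return HEADER.pack(HEADER.size + body_length, request_id, response_to, op_code)


def unpack_header(data):
    """Decode the first 16 bytes of a message into a MsgHeader."""
    return MsgHeader(*HEADER.unpack_from(data, 0))
```

A client would typically increment request_id for every message it sends, then match the responseTo value of each reply against the requestID it used.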
Request Opcodes
Note
Starting with MongoDB 2.6 and maxWireVersion 3, MongoDB drivers use the database commands insert, update, and delete instead of OP_INSERT, OP_UPDATE, and OP_DELETE for acknowledged writes. Most drivers continue to use opcodes for unacknowledged writes.
In version 4.2, MongoDB removes the deprecated internal OP_COMMAND and OP_COMMANDREPLY protocol.
The following are the supported opCode values:
Opcode Name | Value | Comment |
---|---|---|
OP_REPLY | 1 | Reply to a client request. responseTo is set. |
OP_UPDATE | 2001 | Update document. |
OP_INSERT | 2002 | Insert new document. |
RESERVED | 2003 | Formerly used for OP_GET_BY_OID. |
OP_QUERY | 2004 | Query a collection. |
OP_GET_MORE | 2005 | Get more data from a query. See Cursors. |
OP_DELETE | 2006 | Delete documents. |
OP_KILL_CURSORS | 2007 | Notify database that the client has finished with the cursor. |
OP_MSG | 2013 | Send a message using the format introduced in MongoDB 3.6. |
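For reference, the opcode table above could be mirrored in Python as an enumeration. The class name OpCode and the helper parse_opcode are hypothetical names chosen for this sketch; the numeric values are taken directly from the table.

```python
from enum import IntEnum


class OpCode(IntEnum):
    OP_REPLY = 1
    OP_UPDATE = 2001
    OP_INSERT = 2002
    # 2003 is reserved (formerly OP_GET_BY_OID) and deliberately omitted.
    OP_QUERY = 2004
    OP_GET_MORE = 2005
    OP_DELETE = 2006
    OP_KILL_CURSORS = 2007
    OP_MSG = 2013


def parse_opcode(value):
    """Return the matching OpCode, or None for a value not in the table."""
    try:
        return OpCode(value)
    except ValueError:
        return None
```

Because IntEnum members compare equal to plain integers, they can be passed straight to struct.pack when building a header.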
Client Request Messages
Clients can send request messages that specify all but the OP_REPLY opCode. OP_REPLY is reserved for use by the database.
Only the OP_QUERY and OP_GET_MORE messages result in a response from the database. There will be no response sent for any other message.
You can determine if a message was successful with a getLastError command.
OP_UPDATE
The OP_UPDATE message is used to update a document in a collection. The format of an OP_UPDATE message is the following:
    struct OP_UPDATE {
        MsgHeader header;             // standard message header
        int32     ZERO;               // 0 - reserved for future use
        cstring   fullCollectionName; // "dbname.collectionname"
        int32     flags;              // bit vector. see below
        document  selector;           // the query to select the document
        document  update;             // specification of the update to perform
    }
Field | Description |
---|---|
header | Message header, as described in Standard Message Header. |
ZERO | Integer value of 0. Reserved for future use. |
fullCollectionName | The full collection name; i.e. namespace. The full collection name is the concatenation of the database name with the collection name, using a . for the concatenation. For example, for the database foo and the collection bar, the full collection name is foo.bar. |
flags | Bit vector to specify flags for the operation. The bit values correspond to the following: - 0 corresponds to Upsert. If set, the database will insert the supplied object into the collection if no matching document is found. - 1 corresponds to MultiUpdate. If set, the database will update all matching objects in the collection. Otherwise it updates only the first matching document. - 2-31 are reserved. Must be set to 0. |
selector | BSON document that specifies the query for selection of the document to update. |
update | BSON document that specifies the update to be performed. For information on specifying updates see the Update Operations documentation from the MongoDB Manual. |
There is no response to an OP_UPDATE message.
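The layout above can be assembled byte by byte. This is a minimal sketch that assumes the selector and update documents are already BSON-encoded by some other component; the function name build_op_update and the constant names are invented for illustration. The only BSON value used is the empty document, whose encoding is a 4-byte length of 5 followed by a terminating zero byte.

```python
import struct

OP_UPDATE = 2001
UPSERT = 1 << 0        # flag bit 0
MULTI_UPDATE = 1 << 1  # flag bit 1
EMPTY_BSON = b"\x05\x00\x00\x00\x00"  # BSON encoding of {}


def build_op_update(request_id, namespace, selector, update, flags=0):
    """Return the bytes of an OP_UPDATE message.

    selector and update must already be BSON-encoded bytes.
    """
    body = (
        struct.pack("<i", 0)                  # ZERO, reserved
        + namespace.encode("utf-8") + b"\x00"  # cstring fullCollectionName
        + struct.pack("<i", flags)
        + selector
        + update
    )
    # responseTo is 0 because this is a request, not a reply.
    header = struct.pack("<iiii", 16 + len(body), request_id, 0, OP_UPDATE)
    return header + body


message = build_op_update(1, "foo.bar", EMPTY_BSON, EMPTY_BSON, UPSERT | MULTI_UPDATE)
```

In a real client the two documents would come from a BSON encoder rather than the empty placeholder used here.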
OP_INSERT
The OP_INSERT message is used to insert one or more documents into a collection. The format of the OP_INSERT message is:
    struct {
        MsgHeader header;             // standard message header
        int32     flags;              // bit vector - see below
        cstring   fullCollectionName; // "dbname.collectionname"
        document* documents;          // one or more documents to insert into the collection
    }
Field | Description |
---|---|
header | Message header, as described in Standard Message Header. |
flags | Bit vector to specify flags for the operation. The bit values correspond to the following: - 0 corresponds to ContinueOnError. If set, the database will not stop processing a bulk insert if one fails (e.g. due to duplicate IDs). This makes bulk insert behave similarly to a series of single inserts, except lastError will be set if any insert fails, not just the last one. If multiple errors occur, only the most recent will be reported by getLastError. (New in 1.9.1.) - 1-31 are reserved. Must be set to 0. |
fullCollectionName | The full collection name; i.e. namespace. The full collection name is the concatenation of the database name with the collection name, using a . for the concatenation. For example, for the database foo and the collection bar, the full collection name is foo.bar. |
documents | One or more documents to insert into the collection. If there is more than one, they are written to the socket in sequence, one after another. |
There is no response to an OP_INSERT message.
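A hedged sketch of the OP_INSERT layout follows. As in the struct above, the flags field comes before the collection name, and the documents are simply concatenated. The helper name build_op_insert is hypothetical, and the documents are assumed to be BSON-encoded already.

```python
import struct

OP_INSERT = 2002
CONTINUE_ON_ERROR = 1 << 0  # flag bit 0
EMPTY_BSON = b"\x05\x00\x00\x00\x00"  # BSON encoding of {}


def build_op_insert(request_id, namespace, documents, flags=0):
    """Return the bytes of an OP_INSERT message.

    documents is a non-empty list of BSON-encoded byte strings.
    """
    if not documents:
        raise ValueError("OP_INSERT needs one or more documents")
    body = (
        struct.pack("<i", flags)
        + namespace.encode("utf-8") + b"\x00"  # cstring fullCollectionName
        + b"".join(documents)                  # back to back, no separators
    )
    header = struct.pack("<iiii", 16 + len(body), request_id, 0, OP_INSERT)
    return header + body


message = build_op_insert(2, "foo.bar", [EMPTY_BSON, EMPTY_BSON], CONTINUE_ON_ERROR)
```

The receiver can find the boundary between documents because each BSON document starts with its own int32 length.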
OP_QUERY
The OP_QUERY message is used to query the database for documents in a collection. The format of the OP_QUERY message is:
    struct OP_QUERY {
        MsgHeader header;                  // standard message header
        int32     flags;                   // bit vector of query options. See below for details.
        cstring   fullCollectionName;      // "dbname.collectionname"
        int32     numberToSkip;            // number of documents to skip
        int32     numberToReturn;          // number of documents to return
                                           //   in the first OP_REPLY batch
        document  query;                   // query object. See below for details.
      [ document  returnFieldsSelector; ]  // Optional. Selector indicating the fields
                                           //   to return. See below for details.
    }
Field | Description |
---|---|
header | Message header, as described in Standard Message Header. |
flags | Bit vector to specify flags for the operation. The bit values correspond to the following: - 0 is reserved. Must be set to 0. - 1 corresponds to TailableCursor. Tailable means cursor is not closed when the last data is retrieved. Rather, the cursor marks the final object’s position. You can resume using the cursor later, from where it was located, if more data were received. Like any “latent cursor”, the cursor may become invalid at some point (CursorNotFound), for example if the final object it references were deleted. - 2 corresponds to SlaveOk. Allow query of replica slave. Normally these return an error except for namespace “local”. - 3 corresponds to OplogReplay. Internal replication use only; drivers should not set it. - 4 corresponds to NoCursorTimeout. The server normally times out idle cursors after an inactivity period (10 minutes) to prevent excess memory use. Set this option to prevent that. - 5 corresponds to AwaitData. Use with TailableCursor. If we are at the end of the data, block for a while rather than returning no data. After a timeout period, we do return as normal. - 6 corresponds to Exhaust. Stream the data down full blast in multiple “more” packages, on the assumption that the client will fully read all data queried. Faster when you are pulling a lot of data and know you want to pull it all down. Note: the client must read all the data unless it closes the connection. - 7 corresponds to Partial. Get partial results from a mongos if some shards are down (instead of throwing an error). - 8-31 are reserved. Must be set to 0. |
fullCollectionName | The full collection name; i.e. namespace. The full collection name is the concatenation of the database name with the collection name, using a . for the concatenation. For example, for the database foo and the collection bar, the full collection name is foo.bar. |
numberToSkip | Sets the number of documents to omit, starting from the first document in the resulting dataset, when returning the result of the query. |
numberToReturn | Limits the number of documents in the first OP_REPLY message to the query. However, the database will still establish a cursor and return the cursorID to the client if there are more results than numberToReturn. If the client driver offers ‘limit’ functionality (like the SQL LIMIT keyword), then it is up to the client driver to ensure that no more than the specified number of documents are returned to the calling application. If numberToReturn is 0, the db will use the default return size. If the number is negative, then the database will return that number and close the cursor. No further results for that query can be fetched. If numberToReturn is 1 the server will treat it as -1 (closing the cursor automatically). |
query | BSON document that represents the query. The query will contain one or more elements, all of which must match for a document to be included in the result set. Possible elements include $query, $orderby, $hint, and $explain. |
returnFieldsSelector | Optional. BSON document that limits the fields in the returned documents. The returnFieldsSelector contains one or more elements, each of which is the name of a field that should be returned, and the integer value 1. In JSON notation, a returnFieldsSelector to limit to the fields a, b and c would be: { a : 1, b : 1, c : 1 } |
The database will respond to an OP_QUERY message with an OP_REPLY message.
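The OP_QUERY layout, including the optional trailing selector, might be built as in this sketch. The function name build_op_query and the flag constant names are assumptions made for the example; the bit positions come from the flags row above. The query document is assumed to be BSON-encoded already, and only the empty document is used here.

```python
import struct

OP_QUERY = 2004
# Flag bit positions from the flags table (bit 0 is reserved).
TAILABLE_CURSOR = 1 << 1
SLAVE_OK = 1 << 2
NO_CURSOR_TIMEOUT = 1 << 4
AWAIT_DATA = 1 << 5
EXHAUST = 1 << 6
PARTIAL = 1 << 7
EMPTY_BSON = b"\x05\x00\x00\x00\x00"  # BSON encoding of {}


def build_op_query(request_id, namespace, query, flags=0,
                   number_to_skip=0, number_to_return=0,
                   return_fields_selector=None):
    """Return the bytes of an OP_QUERY message.

    query and return_fields_selector must already be BSON-encoded bytes.
    """
    body = (
        struct.pack("<i", flags)
        + namespace.encode("utf-8") + b"\x00"  # cstring fullCollectionName
        + struct.pack("<ii", number_to_skip, number_to_return)
        + query
    )
    if return_fields_selector is not None:
        body += return_fields_selector  # optional field, simply appended
    header = struct.pack("<iiii", 16 + len(body), request_id, 0, OP_QUERY)
    return header + body


# A negative numberToReturn asks the server to close the cursor after replying.
message = build_op_query(5, "foo.bar", EMPTY_BSON, flags=SLAVE_OK, number_to_return=-1)
```

The requestID passed here (5) is the value the client would later look for in the responseTo field of the OP_REPLY.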
OP_GET_MORE
The OP_GET_MORE message is used to query the database for documents in a collection. The format of the OP_GET_MORE message is:
    struct {
        MsgHeader header;             // standard message header
        int32     ZERO;               // 0 - reserved for future use
        cstring   fullCollectionName; // "dbname.collectionname"
        int32     numberToReturn;     // number of documents to return
        int64     cursorID;           // cursorID from the OP_REPLY
    }
Field | Description |
---|---|
header | Message header, as described in Standard Message Header. |
ZERO | Integer value of 0. Reserved for future use. |
fullCollectionName | The full collection name; i.e. namespace. The full collection name is the concatenation of the database name with the collection name, using a . for the concatenation. For example, for the database foo and the collection bar, the full collection name is foo.bar. |
numberToReturn | Limits the number of documents in the first OP_REPLY message to the query. However, the database will still establish a cursor and return the cursorID to the client if there are more results than numberToReturn. If the client driver offers ‘limit’ functionality (like the SQL LIMIT keyword), then it is up to the client driver to ensure that no more than the specified number of documents are returned to the calling application. If numberToReturn is 0, the db will use the default return size. |
cursorID | Cursor identifier that came in the OP_REPLY. This must be the value that came from the database. |
The database will respond to an OP_GET_MORE message with an OP_REPLY message.
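A sketch of the OP_GET_MORE layout is given here. The only new element compared with the other requests is the 64-bit cursorID, packed with the struct format character "q". The function name build_op_get_more and the sample cursor value are made up for the example.

```python
import struct

OP_GET_MORE = 2005


def build_op_get_more(request_id, namespace, cursor_id, number_to_return=0):
    """Return the bytes of an OP_GET_MORE message for an existing cursor."""
    body = (
        struct.pack("<i", 0)                  # ZERO, reserved
        + namespace.encode("utf-8") + b"\x00"  # cstring fullCollectionName
        + struct.pack("<iq", number_to_return, cursor_id)  # int32 then int64
    )
    header = struct.pack("<iiii", 16 + len(body), request_id, 0, OP_GET_MORE)
    return header + body


# The cursor id would normally be copied from an earlier OP_REPLY.
SAMPLE_CURSOR_ID = 0x1122334455667788
message = build_op_get_more(8, "foo.bar", SAMPLE_CURSOR_ID, number_to_return=100)
```

Because the "<" prefix disables alignment padding, the int64 follows the int32 immediately, matching the struct above.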
OP_DELETE
The OP_DELETE message is used to remove one or more documents from a collection. The format of the OP_DELETE message is:
    struct {
        MsgHeader header;             // standard message header
        int32     ZERO;               // 0 - reserved for future use
        cstring   fullCollectionName; // "dbname.collectionname"
        int32     flags;              // bit vector - see below for details.
        document  selector;           // query object. See below for details.
    }
Field | Description |
---|---|
header | Message header, as described in Standard Message Header. |
ZERO | Integer value of 0. Reserved for future use. |
fullCollectionName | The full collection name; i.e. namespace. The full collection name is the concatenation of the database name with the collection name, using a . for the concatenation. For example, for the database foo and the collection bar, the full collection name is foo.bar. |
flags | Bit vector to specify flags for the operation. The bit values correspond to the following: - 0 corresponds to SingleRemove. If set, the database will remove only the first matching document in the collection. Otherwise all matching documents will be removed. - 1-31 are reserved. Must be set to 0. |
selector | BSON document that represents the query used to select the documents to be removed. The selector will contain one or more elements, all of which must match for a document to be removed from the collection. |
There is no response to an OP_DELETE message.
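The OP_DELETE layout can be sketched the same way. The helper name build_op_delete is hypothetical, and the selector is assumed to be BSON-encoded by the caller; the empty document is used only as a placeholder.

```python
import struct

OP_DELETE = 2006
SINGLE_REMOVE = 1 << 0  # flag bit 0
EMPTY_BSON = b"\x05\x00\x00\x00\x00"  # BSON encoding of {}


def build_op_delete(request_id, namespace, selector, flags=0):
    """Return the bytes of an OP_DELETE message.

    selector must already be BSON-encoded bytes.
    """
    body = (
        struct.pack("<i", 0)                  # ZERO, reserved
        + namespace.encode("utf-8") + b"\x00"  # cstring fullCollectionName
        + struct.pack("<i", flags)
        + selector
    )
    header = struct.pack("<iiii", 16 + len(body), request_id, 0, OP_DELETE)
    return header + body


message = build_op_delete(9, "foo.bar", EMPTY_BSON, SINGLE_REMOVE)
```

Leaving flags at 0 removes every matching document, so a caller that wants to delete one document has to set SINGLE_REMOVE explicitly.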
OP_KILL_CURSORS
The OP_KILL_CURSORS message is used to close an active cursor in the database. This is necessary to ensure that database resources are reclaimed at the end of the query. The format of the OP_KILL_CURSORS message is:
    struct {
        MsgHeader header;            // standard message header
        int32     ZERO;              // 0 - reserved for future use
        int32     numberOfCursorIDs; // number of cursorIDs in message
        int64*    cursorIDs;         // sequence of cursorIDs to close
    }
Field | Description |
---|---|
header | Message header, as described in Standard Message Header. |
ZERO | Integer value of 0. Reserved for future use. |
numberOfCursorIDs | The number of cursor IDs that are in the message. |
cursorIDs | “Array” of cursor IDs to be closed. If there is more than one, they are written to the socket in sequence, one after another. |
If a cursor is read until exhausted (read until OP_QUERY or OP_GET_MORE returns zero for the cursor id), there is no need to kill the cursor.
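The repeated int64 field can be written with a single struct format whose repeat count is the number of cursor IDs. This is a minimal sketch; build_op_kill_cursors is an invented name and the cursor values are arbitrary samples.

```python
import struct

OP_KILL_CURSORS = 2007


def build_op_kill_cursors(request_id, cursor_ids):
    """Return the bytes of an OP_KILL_CURSORS message for the given cursor ids."""
    cursor_ids = list(cursor_ids)
    body = (
        struct.pack("<ii", 0, len(cursor_ids))  # ZERO, numberOfCursorIDs
        # "<Nq" packs N little-endian int64 values back to back.
        + struct.pack("<%dq" % len(cursor_ids), *cursor_ids)
    )
    header = struct.pack("<iiii", 16 + len(body), request_id, 0, OP_KILL_CURSORS)
    return header + body


message = build_op_kill_cursors(10, [11, 22])
```

A cursor id of zero in a reply means the cursor is already exhausted, so a client following the note above would leave such ids out of the list.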
OP_MSG
New in MongoDB version 3.6.
OP_MSG is an extensible message format designed to subsume the functionality of other opcodes. This opcode has the following format:
    OP_MSG {
        MsgHeader        header;   // standard message header
        uint32           flagBits; // message flags
        Sections[]       sections; // data sections
        optional<uint32> checksum; // optional CRC-32C checksum
    }
Field | Description |
---|---|
header | Standard message header, as described in Standard Message Header. |
flagBits | An integer bitmask containing message flags, as described in Flag Bits. |
sections | Message body sections, as described in Sections. |
checksum | An optional CRC-32C checksum, as described in Checksum. |
Flag Bits
The flagBits integer is a bitmask encoding flags that modify the format and behavior of OP_MSG.
The first 16 bits (0-15) are required and parsers MUST error if an unknown bit is set.
The last 16 bits (16-31) are optional, and parsers MUST ignore any unknown set bits. Proxies and other message forwarders MUST clear any unknown optional bits before forwarding messages.
Bit | Name | Request | Response | Description |
---|---|---|---|---|
0 | checksumPresent | ✓ | ✓ | The message ends with 4 bytes containing a CRC-32C [1] checksum. See Checksum for details. |
1 | moreToCome | ✓ | ✓ | Another message will follow this one without further action from the receiver. The receiver MUST NOT send another message until receiving one with moreToCome set to 0 as sends may block, causing deadlock. Requests with the moreToCome bit set will not receive a reply. Replies will only have this set in response to requests with the exhaustAllowed bit set. |
16 | exhaustAllowed | ✓ | | The client is prepared for multiple replies to this request using the moreToCome bit. The server will never produce replies with the moreToCome bit set unless the request has this bit set. This ensures that multiple replies are only sent when the network layer of the requester is prepared for them. Important: MongoDB 3.6 ignores this flag, and will respond with a single message. |
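The rules for required and optional bits can be expressed as two masks. The sketch below is one possible reading of those rules, with invented names (check_flag_bits and the mask constants): it rejects unknown bits in positions 0-15 and silently clears unknown bits in positions 16-31, which is also what a forwarder is asked to do.

```python
CHECKSUM_PRESENT = 1 << 0
MORE_TO_COME = 1 << 1
EXHAUST_ALLOWED = 1 << 16

KNOWN_REQUIRED = CHECKSUM_PRESENT | MORE_TO_COME  # known bits in 0-15
KNOWN_OPTIONAL = EXHAUST_ALLOWED                  # known bits in 16-31
REQUIRED_MASK = 0x0000FFFF


def check_flag_bits(flag_bits):
    """Validate flagBits and return them with unknown optional bits cleared.

    Raises ValueError if an unknown required bit (0-15) is set.
    """
    unknown_required = flag_bits & REQUIRED_MASK & ~KNOWN_REQUIRED
    if unknown_required:
        raise ValueError("unknown required flag bits: 0x%04x" % unknown_required)
    # Keep only the bits this parser understands; unknown optional bits drop out.
    return flag_bits & (KNOWN_REQUIRED | KNOWN_OPTIONAL)
```

For example, check_flag_bits(MORE_TO_COME | (1 << 20)) keeps moreToCome and drops the unrecognised bit 20, while check_flag_bits(1 << 5) raises an error.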
Sections
An OP_MSG message contains one or more sections. Each section starts with a kind byte indicating its type. Everything after the kind byte constitutes the section’s payload.
The available kinds of sections follow.
Kind 0: Body
A body section is encoded as a single BSON object. The size in the BSON object also serves as the size of the section. This section kind is the standard command request and reply body.
All top-level fields MUST have a unique name.
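Putting the header, flagBits and a single kind 0 section together gives the smallest possible OP_MSG. This sketch assumes the body document has been BSON-encoded elsewhere and omits the optional checksum; build_op_msg is a hypothetical helper name.

```python
import struct

OP_MSG = 2013
EMPTY_BSON = b"\x05\x00\x00\x00\x00"  # BSON encoding of {}


def build_op_msg(request_id, body_document, flag_bits=0):
    """Return the bytes of an OP_MSG carrying one kind 0 (body) section.

    body_document must already be BSON-encoded bytes.
    """
    payload = (
        struct.pack("<I", flag_bits)  # uint32 flagBits
        + b"\x00"                     # kind byte 0: body section
        + body_document               # its own length prefix sizes the section
    )
    header = struct.pack("<iiii", 16 + len(payload), request_id, 0, OP_MSG)
    return header + payload


message = build_op_msg(12, EMPTY_BSON)
```

An empty body is not a meaningful command; it is used here only to keep the byte layout easy to check.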
Kind 1: Document Sequence
Type | Description |
---|---|
int32 | Size of the section in bytes. |
C String | Document sequence identifier. In all current commands this field is the (possibly nested) field that it is replacing from the body section. This field MUST NOT also exist in the body section. |
Zero or more BSON objects | - Objects are sequenced back to back with no separators. - Each object is limited to the maxBSONObjectSize of the server. The combination of all objects is not limited to maxBSONObjSize. - The document sequence ends once size bytes have been consumed. - Parsers MAY choose to merge these objects into the body as an array at the path specified by the sequence identifier when converting to language-level objects. |
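The two section kinds can be walked with a small loop. This sketch makes one assumption worth stating loudly: the int32 size of a kind 1 section is taken to count itself, the identifier and the documents, but not the kind byte. The function names parse_sections and split_documents are invented, and documents are returned as raw BSON bytes rather than decoded.

```python
import struct


def split_documents(data):
    """Split back-to-back BSON documents using each one's int32 length prefix."""
    documents, pos = [], 0
    while pos < len(data):
        (size,) = struct.unpack_from("<i", data, pos)
        if size < 5 or pos + size > len(data):
            raise ValueError("malformed BSON length prefix")
        documents.append(data[pos:pos + size])
        pos += size
    return documents


def parse_sections(data):
    """Parse the section bytes of an OP_MSG (after flagBits, before any checksum).

    Returns a list of (kind, identifier, documents) tuples.
    """
    sections, pos = [], 0
    while pos < len(data):
        kind = data[pos]
        pos += 1
        (size,) = struct.unpack_from("<i", data, pos)
        if size < 5 or pos + size > len(data):
            raise ValueError("section size out of range")
        end = pos + size
        if kind == 0:
            # The BSON document's own length is the section size.
            sections.append((0, None, [data[pos:end]]))
        elif kind == 1:
            # Assumption: size covers the size field, identifier and documents.
            name_end = data.index(b"\x00", pos + 4, end)
            identifier = data[pos + 4:name_end].decode("utf-8")
            sections.append((1, identifier, split_documents(data[name_end + 1:end])))
        else:
            raise ValueError("unknown section kind %d" % kind)
        pos = end
    return sections
```

A parser that wants language-level objects could then decode each document and, as the table allows, attach a kind 1 sequence to the body as an array under its identifier.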
Checksum
Each message MAY end with a CRC-32C [1] checksum that covers all bytes in the message except for the checksum itself.
Starting in MongoDB 4.2:
- mongod instances, mongos instances, and mongo shell instances will exchange messages with checksums if not using TLS/SSL connection.
- mongod instances, mongos instances, and mongo shell instances will skip the checksum if using TLS/SSL connection.
Drivers and older binaries will ignore the checksum if presented with messages with checksum.
The presence of a checksum is indicated by the checksumPresent flag bit.
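To make the checksum rule concrete, here is a slow but dependency-free CRC-32C (Castagnoli) routine together with helpers that append and verify the trailing 4 bytes. The helper names add_checksum and verify_checksum are invented for this sketch, and storing the checksum as a little-endian uint32 is an assumption based on the byte-ordering rule for all integers in the protocol. Production code would normally use a table-driven or hardware-accelerated CRC.

```python
import struct

CHECKSUM_PRESENT = 1 << 0
CRC32C_POLY_REFLECTED = 0x82F63B78  # Castagnoli polynomial, bit-reflected


def crc32c(data):
    """Bitwise CRC-32C over a bytes-like object."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ CRC32C_POLY_REFLECTED if crc & 1 else crc >> 1
    return crc ^ 0xFFFFFFFF


def add_checksum(message):
    """Return a copy of a checksum-less OP_MSG with a CRC-32C appended.

    messageLength grows by 4 and the checksumPresent bit is set first,
    so that both changes are covered by the checksum.
    """
    buf = bytearray(message)
    struct.pack_into("<i", buf, 0, len(buf) + 4)       # patch messageLength
    (flag_bits,) = struct.unpack_from("<I", buf, 16)   # flagBits follow the header
    struct.pack_into("<I", buf, 16, flag_bits | CHECKSUM_PRESENT)
    return bytes(buf) + struct.pack("<I", crc32c(buf))


def verify_checksum(message):
    """Check the trailing CRC-32C of a message that has checksumPresent set."""
    (expected,) = struct.unpack_from("<I", message, len(message) - 4)
    return crc32c(message[:-4]) == expected
```

The standard check value for CRC-32C is 0xE3069283 for the ASCII input "123456789", which is a quick way to confirm that an implementation uses the right polynomial.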
Database Response Messages
OP_REPLY
The OP_REPLY message is sent by the database in response to an OP_QUERY or OP_GET_MORE message. The format of an OP_REPLY message is:
    struct {
        MsgHeader header;         // standard message header
        int32     responseFlags;  // bit vector - see details below
        int64     cursorID;       // cursor id if client needs to do get more's
        int32     startingFrom;   // where in the cursor this reply is starting
        int32     numberReturned; // number of documents in the reply
        document* documents;      // documents
    }
Field | Description |
---|---|
header | Message header, as described in Standard Message Header. |
responseFlags | Bit vector to specify flags. The bit values correspond to the following: - 0 corresponds to CursorNotFound. Is set when getMore is called but the cursor id is not valid at the server. Returned with zero results. - 1 corresponds to QueryFailure. Is set when query failed. Results consist of one document containing an “$err” field describing the failure. - 2 corresponds to ShardConfigStale. Drivers should ignore this. Only mongos will ever see this set, in which case, it needs to update config from the server. - 3 corresponds to AwaitCapable. Is set when the server supports the AwaitData Query option. If it doesn’t, a client should sleep a little between getMore’s of a Tailable cursor. Mongod version 1.6 supports AwaitData and thus always sets AwaitCapable. - 4-31 are reserved. Ignore. |
cursorID | The cursorID that this OP_REPLY is a part of. In the event that the result set of the query fits into one OP_REPLY message, cursorID will be 0. This cursorID must be used in any OP_GET_MORE messages used to get more data, and also must be closed by the client when no longer needed via an OP_KILL_CURSORS message. |
startingFrom | Starting position in the cursor. |
numberReturned | Number of documents in the reply. |
documents | Returned documents. |
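Reading a reply is the mirror image of building a request. The sketch below decodes the fixed fields and splits the returned documents by their BSON length prefixes, leaving them as raw bytes. The names Reply and parse_op_reply are illustrative only, and the input is assumed to be one complete message already read from the socket.

```python
import struct
from typing import List, NamedTuple

OP_REPLY = 1
CURSOR_NOT_FOUND = 1 << 0
QUERY_FAILURE = 1 << 1
SHARD_CONFIG_STALE = 1 << 2
AWAIT_CAPABLE = 1 << 3

HEADER = struct.Struct("<iiii")       # messageLength, requestID, responseTo, opCode
REPLY_FIELDS = struct.Struct("<iqii")  # responseFlags, cursorID, startingFrom, numberReturned


class Reply(NamedTuple):
    response_to: int
    response_flags: int
    cursor_id: int
    starting_from: int
    documents: List[bytes]


def parse_op_reply(message):
    """Decode one complete OP_REPLY message into a Reply tuple."""
    length, _request_id, response_to, op_code = HEADER.unpack_from(message, 0)
    if op_code != OP_REPLY or length != len(message):
        raise ValueError("not a complete OP_REPLY message")
    flags, cursor_id, starting_from, number_returned = REPLY_FIELDS.unpack_from(
        message, HEADER.size)
    documents, pos = [], HEADER.size + REPLY_FIELDS.size
    while pos < length:
        (size,) = struct.unpack_from("<i", message, pos)
        if size < 5 or pos + size > length:
            raise ValueError("malformed BSON length prefix")
        documents.append(message[pos:pos + size])
        pos += size
    if len(documents) != number_returned:
        raise ValueError("numberReturned does not match the documents present")
    return Reply(response_to, flags, cursor_id, starting_from, documents)
```

A caller would match response_to against the requestID of its outstanding query, test response_flags against QUERY_FAILURE, and issue an OP_GET_MORE only when cursor_id is non-zero.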
Footnotes
[1] | 32-bit CRC computed with the Castagnoli polynomial as described by https://tools.ietf.org/html/rfc4960#page-140. |