MongoDB protocol analysis

Mongo Wire Protocol

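A while ago I dug into the MySQL protocol and MySQL UDFs. I figured out the MySQL protocol and even implemented it with Twisted, but the result annoyed me: the MySQL protocol is not simple enough, and the side project I was tinkering with stalled halfway. Then I came across the MongoDB protocol. A quick look showed it to be clean and concise, exactly what I wanted, so I started studying it.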

Introduction

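The Mongo protocol is a simple, socket-based, request-response protocol used to exchange data between a Mongo client and a Mongo server.
A client connects to the server over an ordinary TCP/IP socket. By default there is no handshake between client and server.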
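
As a minimal sketch of what "an ordinary TCP/IP socket" means in practice, the snippet below opens a connection with Python's standard library. The helper name `open_connection` and the timeout are my own choices for illustration, and port 27017 is MongoDB's usual default rather than something stated above.

```python
import socket

DEFAULT_PORT = 27017  # MongoDB's usual default port (an assumption, not from the text)


def open_connection(host="localhost", port=DEFAULT_PORT, timeout=5.0):
    """Open a plain TCP connection to a Mongo server.

    No handshake bytes are exchanged: the socket can be used for the
    first protocol message as soon as the connection is established.
    """
    return socket.create_connection((host, port), timeout=timeout)
```

Calling `open_connection()` needs a server listening on that host and port; the returned socket is then ready for `sendall()` and `recv()`.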

Message Types and Formats

Below I only cover the few message types and formats that I am likely to use.

Standard Message Header

This is usually just called the message header. Every MongoDB protocol message contains one. Its structure is as follows:

struct MsgHeader {
    int32   messageLength; // total message size, including this
    int32   requestID;     // identifier for this message
    int32   responseTo;    // requestID from the original request
                           //   (used in responses from db)
    int32   opCode;        // request type - see table below
}

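messageLength: the length of the whole message in bytes, including this field itself.
requestID: an identifier that the client generates for this message. The server copies this requestID into the responseTo field of its reply, so the client can match a reply to the request that caused it.
responseTo: as described above, its value equals the requestID of the client's original request.
opCode: the message type; see the table below.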

Opcode Name      opCode value  Comment
OP_REPLY         1             Reply to a client request. responseTo is set
OP_MSG           1000          Generic msg command followed by a string
OP_UPDATE        2001          Update document
OP_INSERT        2002          Insert new document
RESERVED         2003          Formerly used for OP_GET_BY_OID
OP_QUERY         2004          Query a collection
OP_GET_MORE      2005          Get more data from a query. See Cursors
OP_DELETE        2006          Delete documents
OP_KILL_CURSORS  2007          Tell database client is done with a cursor

Each MsgHeader field is four bytes, so the header is 16 bytes in total.
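
The 16-byte layout can be sketched with Python's `struct` module. This is only an illustration: the helper names `pack_header` and `unpack_header` are mine, and the little-endian byte order comes from the wire protocol specification rather than from the text above.

```python
import struct

# Four little-endian int32 fields: messageLength, requestID, responseTo, opCode.
HEADER = struct.Struct("<iiii")

# A few opCode values from the table above.
OP_REPLY = 1
OP_QUERY = 2004
OP_GET_MORE = 2005


def pack_header(body_length, request_id, response_to, op_code):
    """Build the header for a message whose body is body_length bytes long."""
    # messageLength counts the header itself as well as the body.
    return HEADER.pack(HEADER.size + body_length, request_id, response_to, op_code)


def unpack_header(data):
    """Return (messageLength, requestID, responseTo, opCode) from the first 16 bytes."""
    return HEADER.unpack_from(data)


header = pack_header(body_length=0, request_id=1, response_to=0, op_code=OP_QUERY)
print(len(header))            # 16
print(unpack_header(header))  # (16, 1, 0, 2004)
```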

OP_QUERY

The OP_QUERY message is used to query the database for documents in a collection. Its format is as follows:

struct OP_QUERY {
    MsgHeader header;                 // standard message header
    int32     flags;                  // bit vector of query options.  See below for details.
    cstring   fullCollectionName;     // "dbname.collectionname"
    int32     numberToSkip;           // number of documents to skip
    int32     numberToReturn;         // number of documents to return
                                      //  in the first OP_REPLY batch
    document  query;                  // query object.  See below for details.
  [ document  returnFieldsSelector; ] // Optional. Selector indicating the fields
                                      //  to return.  See below for details.
}

flags: the bits are defined as follows

bit num  name             description
0        Reserved         Must be set to 0.
1        TailableCursor   Tailable means cursor is not closed when the last data is retrieved. Rather, the cursor marks the final object’s position. You can resume using the cursor later, from where it was located, if more data were received. Like any “latent cursor”, the cursor may become invalid at some point (CursorNotFound), for example if the final object it references were deleted.
2        SlaveOk          Allow query of replica slave. Normally these return an error except for namespace “local”.
3        OplogReplay      Internal replication use only; drivers should not set this.
4        NoCursorTimeout  The server normally times out idle cursors after an inactivity period (10 minutes) to prevent excess memory use. Set this option to prevent that.
5        AwaitData        Use with TailableCursor. If we are at the end of the data, block for a while rather than returning no data. After a timeout period, we do return as normal.
6        Exhaust          Stream the data down full blast in multiple “more” packages, on the assumption that the client will fully read all data queried. Faster when you are pulling a lot of data and know you want to pull it all down. Note: the client is not allowed to not read all the data unless it closes the connection.
7        Partial          Get partial results from a mongos if some shards are down (instead of throwing an error).
8-31     Reserved         Must be set to 0.
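
Since flags is a bit vector, each option is simply 1 shifted left by its bit number. Here is a small sketch using `enum.IntFlag`; the class name `QueryFlags` and the member spellings are mine, only the bit positions come from the table.

```python
from enum import IntFlag


class QueryFlags(IntFlag):
    """OP_QUERY flag bits; bit 0 and bits 8-31 are reserved and stay 0."""
    TAILABLE_CURSOR = 1 << 1
    SLAVE_OK = 1 << 2
    OPLOG_REPLAY = 1 << 3
    NO_CURSOR_TIMEOUT = 1 << 4
    AWAIT_DATA = 1 << 5
    EXHAUST = 1 << 6
    PARTIAL = 1 << 7


# AwaitData is meant to be used together with TailableCursor.
flags = QueryFlags.TAILABLE_CURSOR | QueryFlags.AWAIT_DATA
print(int(flags))                    # 34
print(QueryFlags.SLAVE_OK in flags)  # False
```

The integer value is what goes into the int32 flags field of the message.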

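fullCollectionName: the full collection name, made up of the database name and the collection name joined by a dot. For example, with a database named foo and a collection named bar, the full collection name is “foo.bar”.
numberToSkip: the number of documents to skip at the start of the result set, much like OFFSET in SQL.
numberToReturn: limits the number of documents returned, much like LIMIT in SQL. If the query matches more documents than numberToReturn, the server creates a cursor and returns its cursorID.
query: a BSON document describing the query. It contains one or more elements; possible elements include $query, $orderby, $hint, $explain and $snapshot.
returnFieldsSelector: an optional BSON document that limits which fields are included in the returned documents.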

The database responds to an OP_QUERY message with an OP_REPLY message.
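
Putting the fields together, here is a rough sketch of how an OP_QUERY message could be assembled by hand. The function name `build_op_query` and its parameter names are hypothetical. To stay free of a BSON library, the query defaults to the empty BSON document (an int32 length of 5 followed by a zero byte), which matches every document; sending 0 as responseTo in a request is a choice made for this sketch.

```python
import struct

OP_QUERY = 2004
EMPTY_BSON = b"\x05\x00\x00\x00\x00"  # int32 length (5) + terminating zero byte


def cstring(text):
    """Encode a string as UTF-8 followed by a single NUL byte."""
    return text.encode("utf-8") + b"\x00"


def build_op_query(request_id, full_collection_name, query=EMPTY_BSON,
                   flags=0, number_to_skip=0, number_to_return=0,
                   return_fields_selector=b""):
    """Assemble an OP_QUERY message; query must already be BSON-encoded bytes."""
    body = (
        struct.pack("<i", flags)
        + cstring(full_collection_name)
        + struct.pack("<ii", number_to_skip, number_to_return)
        + query
        + return_fields_selector  # optional, empty when not used
    )
    # responseTo only matters in replies, so this sketch sends 0 for a request.
    header = struct.pack("<iiii", 16 + len(body), request_id, 0, OP_QUERY)
    return header + body


message = build_op_query(request_id=1, full_collection_name="foo.bar",
                         number_to_return=10)
print(len(message))  # 41
```

The 41 bytes break down as 16 (header) + 4 (flags) + 8 ("foo.bar" plus NUL) + 8 (skip and return counts) + 5 (empty query document).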

OP_GET_MORE

The OP_GET_MORE message is used to fetch more documents from a cursor opened by an earlier query. Its format is as follows:

struct {
    MsgHeader header;             // standard message header
    int32     ZERO;               // 0 - reserved for future use
    cstring   fullCollectionName; // "dbname.collectionname"
    int32     numberToReturn;     // number of documents to return
    int64     cursorID;           // cursorID from the OP_REPLY
}

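fullCollectionName: the full collection name, made up of the database name and the collection name joined by a dot. For example, with a database named foo and a collection named bar, the full collection name is “foo.bar”.
numberToReturn: limits the number of documents returned in this reply, much like LIMIT in SQL. If more documents remain afterwards, the cursor stays open and the same cursorID can be used again.
cursorID: the cursorID taken from the OP_REPLY message that the database returned for the OP_QUERY.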

The database responds to an OP_GET_MORE message with an OP_REPLY message.
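
The same approach works for OP_GET_MORE. The sketch below uses a hypothetical helper, `build_op_get_more`, and a made-up cursor id; in real use the cursor id would come from a previous OP_REPLY.

```python
import struct

OP_GET_MORE = 2005


def build_op_get_more(request_id, full_collection_name, number_to_return, cursor_id):
    """Assemble an OP_GET_MORE message for an already opened cursor."""
    body = (
        struct.pack("<i", 0)                              # ZERO, reserved
        + full_collection_name.encode("utf-8") + b"\x00"  # cstring
        + struct.pack("<iq", number_to_return, cursor_id) # int32 + int64
    )
    header = struct.pack("<iiii", 16 + len(body), request_id, 0, OP_GET_MORE)
    return header + body


# 123456789 is a placeholder cursor id, used only for illustration.
message = build_op_get_more(request_id=2, full_collection_name="foo.bar",
                            number_to_return=10, cursor_id=123456789)
print(len(message))  # 40
```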

OP_REPLY

The OP_REPLY message is sent by the database in response to OP_QUERY and OP_GET_MORE. Its format is as follows:

struct {
    MsgHeader header;         // standard message header
    int32     responseFlags;  // bit vector - see details below
    int64     cursorID;       // cursor id if client needs to do get more's
    int32     startingFrom;   // where in the cursor this reply is starting
    int32     numberReturned; // number of documents in the reply
    document* documents;      // documents
}

responseFlags: the bits are defined as follows

bit num  name              description
0        CursorNotFound    Set when getMore is called but the cursor id is not valid at the server. Returned with zero results.
1        QueryFailure      Set when query failed. Results consist of one document containing an “$err” field describing the failure.
2        ShardConfigStale  Drivers should ignore this. Only mongos will ever see this set, in which case, it needs to update config from the server.
3        AwaitCapable      Set when the server supports the AwaitData Query option. If it doesn’t, a client should sleep a little between getMore’s of a Tailable cursor. Mongod version 1.6 supports AwaitData and thus always sets AwaitCapable.
4-31     Reserved          Ignore

cursorID: if the whole query result fits in a single OP_REPLY message, cursorID is 0. Otherwise the cursorID is used in OP_GET_MORE to fetch more data.
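
To round this off, here is a sketch of reading an OP_REPLY. It assumes the complete message is already in memory and only splits out the raw BSON documents, relying on each BSON document starting with its own int32 length; decoding them is left to a BSON library. The function name `parse_op_reply` and the dictionary keys are mine, and the reply in the example is built by hand rather than captured from a server.

```python
import struct

OP_REPLY = 1
# responseFlags (int32), cursorID (int64), startingFrom (int32), numberReturned (int32)
REPLY_FIELDS = struct.Struct("<iqii")


def parse_op_reply(message):
    """Split a complete OP_REPLY message into its fields and raw BSON documents."""
    length, request_id, response_to, op_code = struct.unpack_from("<iiii", message, 0)
    if op_code != OP_REPLY:
        raise ValueError("not an OP_REPLY message")
    flags, cursor_id, starting_from, number_returned = REPLY_FIELDS.unpack_from(message, 16)
    documents = []
    offset = 16 + REPLY_FIELDS.size
    for _ in range(number_returned):
        # Every BSON document begins with its total length as an int32.
        (doc_length,) = struct.unpack_from("<i", message, offset)
        documents.append(message[offset:offset + doc_length])
        offset += doc_length
    return {
        "response_to": response_to,
        "cursor_not_found": bool(flags & 1),  # bit 0
        "query_failure": bool(flags & 2),     # bit 1
        "cursor_id": cursor_id,
        "starting_from": starting_from,
        "documents": documents,
    }


# Hand-built reply to request 1 carrying a single empty BSON document.
fake_reply = (
    struct.pack("<iiii", 16 + 20 + 5, 7, 1, OP_REPLY)
    + REPLY_FIELDS.pack(0, 0, 0, 1)
    + b"\x05\x00\x00\x00\x00"
)
reply = parse_op_reply(fake_reply)
print(reply["response_to"], reply["cursor_id"], reply["documents"])
# 1 0 [b'\x05\x00\x00\x00\x00']
```

A cursor_id of 0 here means there is nothing left to fetch, so no OP_GET_MORE is needed.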
