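Differences between Motor and PyMongo

Major differences

Creating a connection

PyMongo's MongoClient and MongoReplicaSetClient constructors block until they have established a connection to MongoDB. A MotorClient or MotorReplicaSetClient, however, is created unconnected. In a Tornado web application, call open_sync() at the start of the application, before accepting requests: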

import motor
client = motor.MotorClient().open_sync()

To make a connection asynchronously once the application is running, call open():

def opened(client, error):
    if error:
        print 'Error connecting!', error
    else:
        # Use the client
        pass

motor.MotorClient().open(opened)

Callbacks and Futures

Motor supports nearly every method PyMongo does, but Motor methods that perform network I/O take an optional callback function. The callback must accept two parameters:

def callback(result, error):
    pass

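Motor's asynchronous methods return immediately, and execute the callback with either a result or an error once the operation has completed. For example, find_one() is used in PyMongo like this: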

from pymongo import MongoClient

db = MongoClient().test
user = db.users.find_one({'name': 'Jesse'})
print user

But Motor's find_one() method is asynchronous:

from motor import MotorClient

db = MotorClient().open_sync().test

def got_user(user, error):
    if error:
        print 'error getting user!', error
    else:
        print user

db.users.find_one({'name': 'Jesse'}, callback=got_user)

The callback must be passed as a keyword argument, not as a positional argument.

To find multiple documents, Motor provides to_list():

def got_users(users, error):
    if error:
        print 'error getting users!', error
    else:
        for user in users:
            print user

db.users.find().to_list(length=10, callback=got_users)

See also

MotorCursor’s fetch_next()

If you pass no callback to an asynchronous method, it returns a Future for use in a coroutine:

from tornado import gen

@gen.coroutine
def f():
    yield motor_db.collection.insert({'name': 'Randall'})
    doc = yield motor_db.collection.find_one()

See With coroutines.
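
The "callback or Future" convention described above can be pictured with a small standard-library sketch. This is not Motor's implementation: fake_find_one is a hypothetical stand-in for a driver method, the documents live in a plain list, and asyncio (Python 3) takes the place of Tornado's IOLoop and tornado.gen.

```python
import asyncio


def fake_find_one(documents, query, callback=None):
    """Hypothetical stand-in for an asynchronous driver method (not Motor's code)."""
    loop = asyncio.get_running_loop()
    future = loop.create_future()

    def finish():
        # Pretend the network round trip completes here.
        match = next((d for d in documents
                      if all(d.get(k) == v for k, v in query.items())), None)
        future.set_result(match)

    loop.call_soon(finish)

    if callback is None:
        # No callback: hand back a Future for use in a coroutine.
        return future

    def on_done(f):
        # Callback style: translate the outcome into (result, error).
        error = f.exception()
        callback(None if error else f.result(), error)

    future.add_done_callback(on_done)
    return None


async def demo():
    docs = [{'name': 'Jesse'}, {'name': 'Randall'}]
    seen = []

    # Callback style: returns immediately, callback runs later.
    fake_find_one(docs, {'name': 'Jesse'},
                  callback=lambda result, error: seen.append((result, error)))

    # Future style: no callback, so the call can be awaited.
    doc = await fake_find_one(docs, {'name': 'Randall'})
    await asyncio.sleep(0)  # give any pending callback a chance to run
    return seen, doc
```

Running asyncio.run(demo()) exercises both styles: the callback receives its document with error set to None, and the awaited call returns its document directly.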

max_concurrent and max_wait_time

PyMongo allows the number of connections to MongoDB to grow to match the number of threads performing concurrent operations. (PyMongo’s max_pool_size merely caps the number of idle sockets kept open. [1]) MotorClient and MotorReplicaSetClient provide an additional option, max_concurrent, which caps the total number of sockets per host, per client. The default is 100. Once the cap is reached, operations yield to the IOLoop while waiting for a free socket. The optional max_wait_time allows operations to raise a MotorPoolTimeout if they can’t acquire a socket before the deadline.
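
The capping behavior can be pictured with a short standard-library sketch. It is not Motor's pool: CappedPool, PoolTimeout (a stand-in for MotorPoolTimeout), get_socket and return_socket are hypothetical names, and asyncio (Python 3) plays the role of the IOLoop.

```python
import asyncio


class PoolTimeout(Exception):
    """Hypothetical stand-in for MotorPoolTimeout."""


class CappedPool(object):
    """Toy pool: at most max_concurrent sockets checked out at once."""

    def __init__(self, max_concurrent=100, max_wait_time=None):
        self._slots = asyncio.Semaphore(max_concurrent)
        self._max_wait_time = max_wait_time  # seconds; None waits forever

    async def get_socket(self):
        try:
            # Yields to the event loop until a slot frees up or the
            # deadline passes.
            await asyncio.wait_for(self._slots.acquire(), self._max_wait_time)
        except asyncio.TimeoutError:
            raise PoolTimeout('no free socket within max_wait_time')
        return object()  # placeholder for a real socket

    def return_socket(self, sock):
        self._slots.release()


async def demo():
    pool = CappedPool(max_concurrent=1, max_wait_time=0.05)
    first = await pool.get_socket()
    try:
        await pool.get_socket()  # cap reached: waits, then times out
        timed_out = False
    except PoolTimeout:
        timed_out = True
    pool.return_socket(first)
    second = await pool.get_socket()  # succeeds once a slot is free
    pool.return_socket(second)
    return timed_out
```

With a cap of one, the second get_socket() call in demo() cannot be satisfied before the deadline, so it raises the timeout error instead of opening another socket.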

Timeouts

In PyMongo, you can set a network timeout which causes an AutoReconnect exception if an operation does not complete in time:

from pymongo import MongoClient
from pymongo.errors import AutoReconnect

db = MongoClient(socketTimeoutMS=500).test
try:
    user = db.users.find_one({'name': 'Jesse'})
    print user
except AutoReconnect:
    print 'timed out'

MotorClient and MotorReplicaSetClient support the same options:

db = MotorClient(socketTimeoutMS=500).open_sync().test

@gen.coroutine
def f():
    try:
        user = yield db.users.find_one({'name': 'Jesse'})
        print user
    except AutoReconnect:
        print 'timed out'

As in PyMongo, the default connectTimeoutMS is 20 seconds, and the default socketTimeoutMS is no timeout.

Requests

PyMongo provides “requests” to ensure that a series of operations are performed in order by the MongoDB server, even with unacknowledged writes (writes with w=0). Motor does not support requests, so the only way to guarantee order is by doing acknowledged writes. Register a callback for each operation and perform the next operation in the callback:

def inserted(result, error):
    if error:
        raise error

    db.users.find_one({'name': 'Ben'}, callback=found_one)

def found_one(result, error):
    if error:
        raise error

    print result

# Acknowledged insert:
db.users.insert({'name': 'Ben', 'maintains': 'Tornado'}, callback=inserted)

This ensures find_one isn’t run until insert has been acknowledged by the server. Obviously, this code is improved by tornado.gen:

@gen.coroutine
def f():
    yield db.users.insert({'name': 'Ben', 'maintains': 'Tornado'})
    result = yield db.users.find_one({'name': 'Ben'})
    print result

Motor ignores the auto_start_request parameter to MotorClient or MotorReplicaSetClient.

Threading and forking

Multithreading and forking are not supported; Motor is intended to be used in a single-threaded Tornado application. See Tornado’s documentation on running Tornado in production to take advantage of multiple cores.

Minor differences

Deprecated classes and options

PyMongo deprecated the slave_okay / slaveok option in favor of read preferences in version 2.3. It deprecated Connection and ReplicaSetConnection in favor of MongoClient and MongoReplicaSetClient in version 2.4, as well as deprecating the safe option in favor of write concerns. Motor supports none of PyMongo’s deprecated options and classes at all, and will raise ConfigurationError if you use them.

MasterSlaveConnection

PyMongo’s MasterSlaveConnection offers a few conveniences when connected to a MongoDB master-slave pair. Master-slave replication has long been superseded by replica sets, so Motor has no equivalent to MasterSlaveConnection.

GridFS

  • File-like

    PyMongo’s GridIn and GridOut strive to act like Python’s built-in file objects, so they can be passed to many functions that expect files. But the I/O methods of MotorGridIn and MotorGridOut are asynchronous, so they cannot obey the file API and aren’t suitable in the same circumstances as files.

  • Iteration

    It’s convenient in PyMongo to iterate a GridOut:

    fs = gridfs.GridFS(db)
    grid_out = fs.get(file_id)
    for chunk in grid_out:
        print chunk
    

    MotorGridOut cannot support this API asynchronously. To read a MotorGridOut use the non-blocking read() method. For convenience MotorGridOut provides stream_to_handler().

  • Setting properties

    In PyMongo, you can set arbitrary attributes on a GridIn and they’re stored as metadata on the server, even after the GridIn is closed:

    grid_in = fs.new_file()
    grid_in.close()
    grid_in.my_field = 'my_value'  # Sends update to server.
    

    Updating metadata on a MotorGridIn is asynchronous, so the API is different:

    @gen.coroutine
    def f():
        fs = motor.MotorGridFS(db)
        yield fs.open()
        grid_in = yield fs.new_file()
        yield grid_in.close()
    
        # Sends update to server.
        yield grid_in.set('my_field', 'my_value')
    
  • The “with” statement

    GridIn is a context manager: you can use it in a “with” statement and it is closed on exit:

    with fs.new_file() as grid_in:
        grid_in.write('data')
    

    But MotorGridIn’s close() method is asynchronous, so it must be called explicitly.
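
The GridFS points above (no iteration, no "with" statement) amount to an explicit read loop plus an explicit close. The following is a rough sketch only: FakeGridOut and FakeGridIn are hypothetical in-memory stand-ins, not MotorGridOut or MotorGridIn, and asyncio (Python 3) replaces Tornado coroutines.

```python
import asyncio


class FakeGridOut(object):
    """Hypothetical stand-in with an asynchronous read(); not MotorGridOut."""

    def __init__(self, data):
        self._data = data
        self._pos = 0

    async def read(self, size):
        await asyncio.sleep(0)  # stands in for network I/O
        chunk = self._data[self._pos:self._pos + size]
        self._pos += len(chunk)
        return chunk


class FakeGridIn(object):
    """Hypothetical stand-in with asynchronous write() and close()."""

    def __init__(self):
        self.chunks = []
        self.closed = False

    async def write(self, data):
        await asyncio.sleep(0)
        self.chunks.append(data)

    async def close(self):
        await asyncio.sleep(0)
        self.closed = True


async def copy_file(grid_out, grid_in, chunk_size=4):
    try:
        # Instead of "for chunk in grid_out", read until nothing is left.
        while True:
            chunk = await grid_out.read(chunk_size)
            if not chunk:
                break
            await grid_in.write(chunk)
    finally:
        # Instead of a "with" block, close explicitly and wait for it.
        await grid_in.close()
```

The try/finally block does the job the "with" statement does for a synchronous GridIn: the file is closed even if a read or write fails.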

is_locked

In PyMongo is_locked is a property of MongoClient. Since determining whether the server has been fsyncLocked requires I/O, Motor has no such convenience method. The equivalent in Motor is:

@gen.coroutine
def f():
    result = yield client.admin.current_op()
    locked = bool(result.get('fsyncLock', None))

system_js

PyMongo supports Javascript procedures stored in MongoDB with syntax like:

>>> db.system_js.my_func = 'function(x) { return x * x; }'
>>> db.system_js.my_func(2)
4.0

Motor does not.

Cursor slicing

In PyMongo, the following raises an IndexError if the collection has fewer than 101 documents:

# Can raise IndexError.
doc = db.collection.find()[100]

In Motor, however, no exception is raised. The query simply has no results:

@gen.coroutine
def f():
    cursor = db.collection.find()[100]

    # Iterates zero or one times.
    while (yield cursor.fetch_next):
        doc = cursor.next_object()

The difference arises because the PyMongo Cursor’s slicing operator blocks until it has queried the MongoDB server and determined whether a document exists at the desired offset; Motor simply returns a new MotorCursor with a skip and limit applied.
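
To make the "skip and limit" behavior concrete, here is a toy model of lazy slicing. ToyCursor is a hypothetical class backed by a Python list, not MotorCursor, and it assumes non-negative indexes and slices without a step.

```python
class ToyCursor(object):
    """Hypothetical lazy cursor: indexing only records skip and limit."""

    def __init__(self, documents, skip=0, limit=None):
        self._documents = documents
        self.skip = skip
        self.limit = limit  # None means "no limit"

    def __getitem__(self, index):
        # No query happens here, so there is nothing to raise IndexError on.
        if isinstance(index, slice):
            start = index.start or 0
            limit = None if index.stop is None else index.stop - start
            return ToyCursor(self._documents, skip=start, limit=limit)
        return ToyCursor(self._documents, skip=index, limit=1)

    def results(self):
        # The "query" runs only now; an out-of-range skip yields nothing.
        end = None if self.limit is None else self.skip + self.limit
        return self._documents[self.skip:end]


docs = [{'n': i} for i in range(5)]
empty = ToyCursor(docs)[100].results()    # offset past the end: no results
single = ToyCursor(docs)[2].results()     # one document at offset 2
window = ToyCursor(docs)[1:3].results()   # skip 1, limit 2
```

Indexing past the end of the five toy documents produces an empty result rather than an exception, which mirrors the behavior described for Motor.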

[1] See PyMongo’s max_pool_size