Reading data from MongoDB with multiple threads in Python

Published 2024-06-28 14:30:03


Just like the title says: when I use multiple threads to read data from MongoDB, it is no faster than a single process. Is there something wrong with how I'm using it?

My multi-threaded code is as follows:

import thread  # low-level thread module (renamed _thread in Python 3)

def multi_thread_flush(logger):
    n_loops = 20
    locks = []
    for i in range(0, n_loops):
        lock = thread.allocate_lock()
        lock.acquire()  # hold the lock until the worker thread releases it
        locks.append(lock)
    try:
        for i in range(0, n_loops):
            thread.start_new_thread(get_node_entry_id,
                                    (logger, 0 + 400000 * i, 400000, locks[i],))
        for i in range(0, n_loops):
            while locks[i].locked():  # busy-wait until every worker has released its lock
                pass
        logger.info("[all done] all done")
    except Exception as e:
        logger.error("exception: %s" % e)

def get_node_entry_id(logger, num1, num2, lock):
    # `client.mongo_collection` is the collection handle created elsewhere in the script
    cursor = client.mongo_collection.find({}, no_cursor_timeout=True).skip(num1).batch_size(30)
    count = 0
    for item in cursor:
        if count >= num2:  # stop after num2 documents (the original `>` read one extra)
            break
        logger.info("%s" % item["_id"])
        count = count + 1
    lock.release()  # signal the parent thread that this slice is finished
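
As a side note on the waiting pattern above: the same fan-out can be written with the higher-level threading module and join() instead of busy-waiting on low-level locks. A minimal sketch, assuming the same global client handle; the lock-free worker name get_node_entry_id_nolock is just illustrative:

import threading

def get_node_entry_id_nolock(logger, num1, num2):
    # same body as get_node_entry_id, minus the lock: join() does the waiting
    cursor = client.mongo_collection.find({}, no_cursor_timeout=True).skip(num1).batch_size(30)
    count = 0
    for item in cursor:
        if count >= num2:
            break
        logger.info("%s" % item["_id"])
        count += 1

def multi_thread_flush_join(logger):
    n_loops = 20
    threads = []
    for i in range(n_loops):
        t = threading.Thread(target=get_node_entry_id_nolock,
                             args=(logger, 400000 * i, 400000))
        t.start()
        threads.append(t)
    for t in threads:
        t.join()  # block until every worker has finished
    logger.info("[all done] all done")

This only tidies the synchronization; it does not by itself make the reads any faster.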

My single-process code is as follows:

[code block missing from the original post]

I tried changing the batch size from 300 to 3000, but the improvement was small.


1 answer
User
#1 · Posted on 2024-06-28 14:30:03

Probably because of mongo skip(), since your skip offsets go up to 400000+. Every time the query is executed, the server has to walk from the beginning of the collection all the way to the specified offset. See this doc.

As the offset increases, mongo.skip() will be slower.

It also recommends slicing with an index instead, for example:

db.col.find({_id: { $gt: offset}}).limit(batch_size)
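
Applied to the code in the question, a minimal pymongo sketch of this range-based paging could look like the following. The helper name page_by_id_range is illustrative, and client.mongo_collection is assumed to be the same collection handle used above:

def page_by_id_range(logger, batch_size=3000):
    # walk the collection in _id order, resuming with $gt instead of skip()
    last_id = None
    while True:
        query = {} if last_id is None else {"_id": {"$gt": last_id}}
        batch = list(client.mongo_collection.find(query)
                     .sort("_id", 1)
                     .limit(batch_size))
        if not batch:
            break
        for item in batch:
            logger.info("%s" % item["_id"])
        last_id = batch[-1]["_id"]  # the next page starts after the last _id seen

Each worker can then be given a disjoint _id range instead of a skip offset, so the server never has to walk past hundreds of thousands of skipped documents.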
