As the title says: when I use multiple threads to read data from MongoDB, it is no faster than a single process. Am I doing something wrong?

My multi-threaded code is as follows:
import thread  # Python 2 low-level threading module

# `client` is assumed to be a pymongo MongoClient created elsewhere

def multi_thread_flush(logger):
    n_loops = 20
    locks = []
    # one pre-acquired lock per worker; released by the worker when it finishes
    for i in range(0, n_loops):
        lock = thread.allocate_lock()
        lock.acquire()
        locks.append(lock)
    try:
        for i in range(0, n_loops):
            thread.start_new_thread(get_node_entry_id,
                                    (logger, 0 + 400000 * i, 400000, locks[i],))
        # busy-wait until every worker has released its lock
        for i in range(0, n_loops):
            while locks[i].locked(): pass
        logger.info("[all down] all down")
    except Exception as e:
        logger.error("exception: %s" % e)

def get_node_entry_id(logger, num1, num2, lock):
    # each worker skips to its offset and reads its slice of the collection
    cursor = client.mongo_collection.find({}, no_cursor_timeout=True).skip(num1).batch_size(30)
    count = 0
    for item in cursor:
        if count > num2:
            break
        logger.info("%s" % item["_id"])
        count = count + 1
    lock.release()
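As an aside, the pre-acquired-lock pattern above busy-waits on locked(), which burns a CPU core while waiting. A minimal sketch of the same orchestration using Python's threading module and join() instead (the names run_partitions and worker are mine, not from the original code):

```python
import threading

def run_partitions(worker, n_loops=20, chunk=400000):
    # start one thread per partition, then wait for all of them
    # with join() instead of spinning on lock.locked()
    threads = []
    for i in range(n_loops):
        t = threading.Thread(target=worker, args=(i * chunk, chunk))
        t.start()
        threads.append(t)
    for t in threads:
        t.join()

# tiny demo worker: record which (offset, size) slice each thread was given
results = []
def worker(start, size):
    results.append((start, size))

run_partitions(worker, n_loops=3, chunk=10)
```

This only fixes the waiting, not the throughput problem described in the answer below.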
My single-process code is as follows:

^{pr2}$

I tried changing the batch size from 300 to 3000, but the improvement was small.
It is probably because of Mongo's skip(), since your chunks are 400000+: every time the query executes, the server has to walk from the beginning of the collection all the way to the specified offset. See this doc.

It also suggests performing the slicing with an index instead, for example:
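The example from the linked doc did not survive the page scrape. A minimal sketch of the index-based approach: instead of skip(num1), each query resumes after the last _id already seen, so the server can seek via the index rather than scanning past the offset. The helper name range_page_filter and the variable last_id are mine; the default _id index is assumed:

```python
def range_page_filter(last_id=None):
    # build a find() filter that resumes after last_id;
    # relies on the collection's default _id index
    return {} if last_id is None else {"_id": {"$gt": last_id}}

# with pymongo, each worker would then do something like:
# cursor = collection.find(range_page_filter(last_id)).sort("_id", 1).limit(400000)

# the filter for the first page is empty, later pages constrain _id:
first_page = range_page_filter()
next_page = range_page_filter(last_id=12345)
```

With this pattern each worker's query cost no longer grows with its offset, which is the likely reason the threaded version above was no faster than one process.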