I wrote a simple script to migrate MongoDB data into PostgreSQL jsonb. Here is a simplified version of the script:
import json
import psycopg2
from pymongo import MongoClient
from decimal import Decimal
from bson.decimal128 import Decimal128
from bson.objectid import ObjectId
from datetime import datetime
import time
class JSONEncoder(json.JSONEncoder):
    def default(self, o):
        if isinstance(o, ObjectId):
            return str(o)
        if isinstance(o, Decimal):
            return float(o)
        if isinstance(o, Decimal128):
            return float(str(o))
        if isinstance(o, datetime):
            return str(o)
        return json.JSONEncoder.default(self, o)

client = MongoClient("mongodb+srv://myserver.mongodb.net/mydb")
db = client.mydb
col = db.mycollection
mdbcur = col.find(no_cursor_timeout=True)

with psycopg2.connect(host="localhost",
                      database="pgdb",
                      user="postgres",
                      password="mypassword") as pgconn:
    pgcur = pgconn.cursor()
    for doc in mdbcur:
        data = json.dumps(doc, cls=JSONEncoder)
        query_sql = """insert into tbl (data) VALUES ('""" + data + """'::jsonb)"""
        pgcur.execute(query_sql)
        pgconn.commit()
        time.sleep(1)
    pgcur.close()
mdbcur.close()
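As an aside (separate from the CursorNotFound error): building the INSERT by string concatenation breaks as soon as a document's JSON contains a single quote. A minimal sketch of the problem, using a hypothetical document; the parameterized form in the trailing comment is the usual psycopg2 fix:

```python
import json

def naive_sql(data: str) -> str:
    # Mirrors the script above: splice the JSON text directly into the statement.
    return "insert into tbl (data) VALUES ('" + data + "'::jsonb)"

# Hypothetical document whose JSON serialization contains a single quote.
payload = json.dumps({"name": "O'Brien"})
broken = naive_sql(payload)
# The quote inside O'Brien terminates the SQL string literal early,
# so PostgreSQL rejects (or misparses) the statement.

# A parameterized insert lets psycopg2 do the escaping instead:
# pgcur.execute("insert into tbl (data) VALUES (%s::jsonb)", (data,))
```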
The script runs fine, but with more than 2 million records it always aborts partway through with this error:
CursorNotFound: cursor id 6836832851964097072 not found, full error: {'operationTime': Timestamp(1618918232, 1), 'ok': 0.0, 'errmsg': 'cursor id 6836832851964097072 not found', 'code': 43, 'codeName': 'CursorNotFound', '$clusterTime': {'clusterTime': Timestamp(1618918232, 1), 'signature': {'hash': b'\x08\x0e8\xff\x89\x114s\x9c:\x82z\xb0j\x05\xe9\xacv\x1c4', 'keyId': 6897522357011742722}}}
I have tried adding time.sleep and no_cursor_timeout, but it always fails somewhere after roughly 20,000 records (not at any fixed record). No data is being deleted from MongoDB, but new records are constantly being inserted.
Any suggestions?
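One common workaround when a long-running scan gets its server-side cursor reaped: drop the single long-lived find() cursor and instead page through the collection in _id order, issuing a fresh short query per batch. The batching logic can be kept independent of pymongo; `fetch_page` below is a hypothetical callable that, in the real migration, would wrap something like `col.find({"_id": {"$gt": last_id}}).sort("_id", 1).limit(n)`:

```python
def iter_batches(fetch_page, batch_size=1000):
    """Yield documents in _id-keyed pages, so no server cursor has to
    stay open across the whole migration.

    fetch_page(last_id, limit) is a hypothetical callable returning a
    list of documents with _id > last_id, sorted ascending by _id
    (pass last_id=None for the first page).
    """
    last_id = None
    while True:
        page = fetch_page(last_id, batch_size)
        if not page:
            return
        for doc in page:
            yield doc
        # Resume the next query from the last _id we saw.
        last_id = page[-1]["_id"]

# In the real migration, fetch_page might look like:
# def fetch_page(last_id, limit):
#     q = {} if last_id is None else {"_id": {"$gt": last_id}}
#     return list(col.find(q).sort("_id", 1).limit(limit))
```

Because each batch is a new query, a reaped cursor can no longer lose your position; and since default ObjectIds increase over time, documents inserted while the migration runs are simply picked up in later pages.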