Very large, write-intensive MySQL import

Posted 2024-10-04 05:20:40


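I have (what I would consider) a large set of plain-text files, about 400GB in total, that I am importing into a MySQL database (InnoDB engine). The .txt files range from 2GB to 26GB, and each file represents one table in the database. I was given a Python script that parses the .txt files and builds the SQL statements. I have a machine dedicated to this task, with the following specs: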

OS: Windows 10
32GB RAM
4TB hard drive
i7 3.40 GHz processor

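I want to make this import as quick and dirty as possible. Based on Stack Overflow questions, the MySQL docs, and other sources, I changed the following configuration settings in my MySQL my.ini file: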

max_allowed_packet=1073741824;

autocommit=0;

net_buffer_length=0;

foreign_key_check=0;

unique_checks=0;

innodb_buffer_pool_size=8G; (this made a big difference in speed when I increased from the default of 128M)
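Three of the settings above (`autocommit`, `unique_checks`, and the foreign key check, which the MySQL documentation spells `foreign_key_checks`) are session variables, so they can also be set from the import script on its own connection. The following is a minimal sketch of that idea, not part of the original script: the names `BULK_LOAD_SETTINGS`, `session_statements`, and `apply_session_settings` are hypothetical, and `conn` is assumed to be any DB-API 2.0 connection such as one returned by `MySQLdb.connect`. The other entries in the list are left to the server configuration.

```python
# Session variables that are commonly switched off for an InnoDB bulk load.
# The dictionary name and helper names are illustrative, not from the script.
BULK_LOAD_SETTINGS = {
    "autocommit": 0,
    "unique_checks": 0,
    "foreign_key_checks": 0,
}


def session_statements(settings, restore=False):
    """Build one SET statement per variable.

    With restore=True every variable is set back to 1, the MySQL default
    for all three variables listed above.
    """
    return [
        "SET %s=%d" % (name, 1 if restore else value)
        for name, value in settings.items()
    ]


def apply_session_settings(conn, settings=BULK_LOAD_SETTINGS, restore=False):
    """Execute the SET statements on a DB-API 2.0 connection."""
    cur = conn.cursor()
    try:
        for stmt in session_statements(settings, restore):
            cur.execute(stmt)
    finally:
        cur.close()
```

Calling `apply_session_settings(conn)` before the load and `apply_session_settings(conn, restore=True)` afterwards keeps the relaxed checks scoped to the import connection instead of the whole server.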

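Are there other settings in the configuration file that I have missed (perhaps related to logging or caching) that would direct MySQL to use most of the machine's resources? Is there another bottleneck I am overlooking?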

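(Side note: I am not sure whether this is related, but when I start the import, the mysqld process spins up and uses about 13-15% of system memory, and it never seems to release that memory when I stop the Python script from continuing the import. I wonder if this is a result of my changes to the logging and flush settings. Thanks in advance for any help.)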

(Edit)

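Below is the relevant part of the Python script that populates the tables. The script appears to connect, commit, and close the connection for every 50,000 records. Could I remove the conn.commit() at the end of the function and let MySQL handle the commits? The comments below while (True) are from the script's author; I have adjusted that number so that it does not exceed the maximum allowed packet size.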

    conn = self.connect()

    while (True):
        #By default, we concatenate 200 inserts into a single INSERT statement.
        #a large batch size per insert improves performance, until you start hitting max_packet_size issues.
        #If you increase MySQL server's max_packet_size, you may get increased performance by increasing maxNum
        records = self.parser.nextRecords(maxNum=50000)
        if (not records):
            break

        escapedRecords = self._escapeRecords(records) #This will sanitize the records
        stringList = ["(%s)" % (", ".join(aRecord)) for aRecord in escapedRecords]

        cur = conn.cursor()
        colVals = unicode(", ".join(stringList), 'utf-8')
        exStr = exStrTemplate % (commandString, ignoreString, tableName, colNamesStr, colVals)
        #unquote NULLs
        exStr = exStr.replace("'NULL'", "NULL")
        exStr = exStr.replace("'null'", "NULL")

        try:
            cur.execute(exStr)
        except MySQLdb.Warning, e:
            LOGGER.warning(str(e))
        except MySQLdb.IntegrityError, e:
            #This is likely a primary key constraint violation; should only be hit if skipKeyViolators is False
            LOGGER.error("Error %d: %s", e.args[0], e.args[1])
        self.lastRecordIngested = self.parser.latestRecordNum
        recCheck = self._checkProgress()
        if recCheck:
            LOGGER.info("...at record %i...", recCheck)
    conn.commit()
    conn.close()
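Since the batch size above was tuned by hand to stay under the packet limit, one alternative is to group rows by their encoded size instead of by a fixed count. The sketch below is written for Python 3 and is not the script author's method: `batches_by_size` and `load` are hypothetical helpers, `rows` is assumed to be an iterable of already-escaped `"(v1, v2, ...)"` strings, and `conn` is assumed to be any DB-API 2.0 connection. It keeps a single explicit commit at the end, because with autocommit disabled InnoDB rolls back uncommitted work when the connection closes.

```python
def batches_by_size(rows, max_bytes, overhead=0):
    """Yield lists of row strings whose combined UTF-8 size fits in max_bytes.

    overhead reserves room for the "INSERT INTO ... VALUES " prefix.
    A single row larger than max_bytes is still yielded on its own,
    since it cannot be split here.
    """
    batch, size = [], overhead
    for row in rows:
        row_bytes = len(row.encode("utf-8")) + 2  # +2 for the ", " separator
        if batch and size + row_bytes > max_bytes:
            yield batch
            batch, size = [], overhead
        batch.append(row)
        size += row_bytes
    if batch:
        yield batch


def load(conn, insert_prefix, rows, max_bytes):
    """Send one multi-row INSERT per batch, then commit once at the end."""
    cur = conn.cursor()
    sent = 0
    prefix_bytes = len(insert_prefix.encode("utf-8"))
    for batch in batches_by_size(rows, max_bytes, prefix_bytes):
        cur.execute(insert_prefix + ", ".join(batch))
        sent += len(batch)
    conn.commit()  # without this, the uncommitted rows are discarded on close
    return sent
```

Here `max_bytes` would be chosen somewhat below the server's `max_allowed_packet`, and `insert_prefix` would be something like `"INSERT INTO my_table (a, b) VALUES "` (the table and column names are placeholders).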
