Speeding up inserts into a SQL table

Published 2024-09-28 01:33:54


I have a pandas DataFrame that looks like this:

Datum      Kasse Bon Articles
2019-05-01 101   1   Oranges
2019-05-01 101   2   Apples
2019-05-01 101   3   Banana

Basically four columns (date, smallint, smallint, string). It contains millions of rows.

I'm using the following code to load it into the database:

query_insert_tmp = """
    INSERT INTO <my_table>
    VALUES ('{}',{},{},'{}')
    """

# One execute per row: a separate database round trip for every record
for index, r in edited_df.iterrows():
    cursor.execute(query_insert_tmp.format(r[0], r[1], r[2], r[3]))

It works fine, but it takes a very long time. Do you know of any other ways to speed this up?


2 Answers

Consider `to_numpy` + `tolist` with `executemany` and qmark parameter placeholders:

query_insert_tmp = """
    INSERT INTO <my_table>
    VALUES (?, ?, ?, ?)
    """

sql_data = edited_df.to_numpy().tolist()

cursor.executemany(query_insert_tmp, sql_data)
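For reference, the same batching pattern can be shown end-to-end with the standard-library `sqlite3` driver (the table and sample rows below are made up for the demo). With pyodbc the `executemany` call is identical, and you can additionally set `cursor.fast_executemany = True` so the driver sends the batch in one round trip:

```python
import sqlite3

# Hypothetical sample rows matching the question's four columns
sql_data = [
    ("2019-05-01", 101, 1, "Oranges"),
    ("2019-05-01", 101, 2, "Apples"),
    ("2019-05-01", 101, 3, "Banana"),
]

conn = sqlite3.connect(":memory:")
cursor = conn.cursor()
cursor.execute(
    "CREATE TABLE my_table (Datum TEXT, Kasse INTEGER, Bon INTEGER, Articles TEXT)"
)

# One executemany call inserts the whole batch instead of one execute per row
cursor.executemany("INSERT INTO my_table VALUES (?, ?, ?, ?)", sql_data)
conn.commit()

print(cursor.execute("SELECT COUNT(*) FROM my_table").fetchone()[0])  # 3
```

Parameter placeholders also avoid the SQL-injection and quoting problems that string formatting of values can cause.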

I would do it like this:

from datetime import datetime

import pandas as pd

import pyodbc
from fast_to_sql import fast_to_sql as fts

# Test Dataframe for insertion
df = pd.DataFrame({
    "Col1": [1, 2, 3],
    "Col2": ["A", "B", "C"],
    "Col3": [True, False, True],
    "Col4": [datetime(2020,1,1),datetime(2020,1,2),datetime(2020,1,3)]
})

# Create a pyodbc connection
conn = pyodbc.connect(
    """
    Driver={ODBC Driver 17 for SQL Server};
    Server=localhost;
    Database=my_database;
    UID=my_user;
    PWD=my_pass;
    """
)

# If a table is created, the generated sql is returned
create_statement = fts.fast_to_sql(df, "my_great_table", conn, if_exists="replace", custom={"Col1":"INT PRIMARY KEY"}, temp=False)

# Commit upload actions and close connection
conn.commit()
conn.close()

Main function:

fts.fast_to_sql(df, name, conn, if_exists="append", custom=None, temp=False)

Alternatively, try this:

from sqlalchemy import create_engine

# URL parameters are separated by '&', and spaces in the driver name
# must be URL-encoded
engine = create_engine(
    "mssql+pyodbc://server_name/db_name"
    "?driver=SQL+Server+Native+Client+11.0&trusted_connection=yes"
)

# your dataframe is here; the first argument is the target table name
df.to_sql("name_of_table", engine, if_exists='append', index=True, chunksize=100000)
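Here is a runnable sketch of the `to_sql` route, using an in-memory SQLite engine since no SQL Server is at hand; the engine URL and table name are placeholders. With the `mssql+pyodbc` dialect you could also pass `fast_executemany=True` to `create_engine` for driver-level batching:

```python
import pandas as pd
from sqlalchemy import create_engine, text

# Sample frame mirroring the question's columns (hypothetical data)
df = pd.DataFrame({
    "Datum": ["2019-05-01", "2019-05-01", "2019-05-01"],
    "Kasse": [101, 101, 101],
    "Bon": [1, 2, 3],
    "Articles": ["Oranges", "Apples", "Banana"],
})

# In-memory SQLite engine for demonstration; swap in your mssql+pyodbc URL
engine = create_engine("sqlite://")

# chunksize limits how many rows are written per batch; method="multi"
# packs multiple rows into each INSERT statement
df.to_sql("my_table", engine, if_exists="append", index=False,
          chunksize=100_000, method="multi")

with engine.connect() as conn:
    count = conn.execute(text("SELECT COUNT(*) FROM my_table")).scalar()
print(count)  # 3
```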

https://pypi.org/project/fast-to-sql/

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.to_sql.html
