Writing a DataFrame to SQL Server with pyodbc's executemany

cursor = sql_con.cursor()
cursor.fast_executemany = True
for row_count in range(0, df.shape[0]):
    chunk = df.iloc[row_count:row_count + 1,:].values.tolist()
    tuple_of_tuples = tuple(tuple(x) for x in chunk)
    for index,row in ProductInventory.iterrows():
        cursor.executemany("INSERT INTO table ([x],[Y]) values (?,?)",tuple_of_tuples)

cursor = sql_con.cursor()
for row_count in range(0, ProductInventory.shape[0]):
    chunk = ProductInventory.iloc[row_count:row_count + 1,:].values.tolist()
    tuple_of_tuples = tuple(tuple(x) for x in chunk)
    for index,row in ProductInventory.iterrows():
        cursor.executemany("INSERT INTO table ([x],[Y]) values (?,?)",tuple_of_tuples)

2 Answers

User

#1 · Edited 2024-09-30 18:19:35

Trying to run in SQL azure so SQL Alchemy is not an easy connection method.

Perhaps you just need to get past that hurdle first. Then you can use pandas to_sql together with fast_executemany=True. For example:

from sqlalchemy import create_engine
#
# ...
#
engine = create_engine(connection_uri, fast_executemany=True)
df.to_sql("table_name", engine, if_exists="append", index=False)

If you have a working pyodbc connection string, you can convert it to a SQLAlchemy connection URI like this:

import urllib.parse
connection_uri = 'mssql+pyodbc:///?odbc_connect=' + urllib.parse.quote_plus(connection_string)
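To make the conversion concrete, here is a minimal sketch. The helper name build_connection_uri and the sample connection string (driver, server, database, and credentials) are placeholders invented for illustration, not values from the question.

```python
import urllib.parse


def build_connection_uri(connection_string: str) -> str:
    """Wrap a raw ODBC connection string in a SQLAlchemy mssql+pyodbc URI."""
    # quote_plus percent-encodes characters such as ';', '=', '{' and '}'
    # so the whole ODBC string survives as a single query-string value.
    return "mssql+pyodbc:///?odbc_connect=" + urllib.parse.quote_plus(connection_string)


# Hypothetical connection string; substitute your own working one.
connection_string = (
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=example.database.windows.net;"
    "DATABASE=example_db;UID=example_user;PWD=example_password"
)
connection_uri = build_connection_uri(connection_string)
print(connection_uri)
```

The resulting connection_uri is what would be passed to create_engine in the snippet above.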

User

#2 · Edited 2024-09-30 18:19:35

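A few things:

Why are you iterating over ProductInventory twice?
Shouldn't the executemany call happen after you have built the whole tuple of tuples, or at least a batch of them?
The pyodbc documentation states that "running executemany() with fast_executemany=False is generally not going to be much faster than running multiple execute() commands directly." So you need to set cursor.fast_executemany=True in both examples (see https://github.com/mkleehammer/pyodbc/wiki/Cursor for more details and examples). I'm not sure why it was omitted in example 2.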
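The last two points can be sketched as a small helper that builds every parameter tuple first and then calls executemany exactly once. This is only an illustration: insert_rows, RecordingCursor, and the table name my_table are hypothetical names, and RecordingCursor is a stand-in that lets the sketch run without a database. With a real pyodbc cursor you would pass that cursor instead.

```python
def insert_rows(cursor, sql, rows):
    """Build all parameter tuples, then send them in one executemany call."""
    # fast_executemany is a pyodbc cursor attribute; it has to be enabled
    # before executemany is called.
    cursor.fast_executemany = True
    params = [tuple(row) for row in rows]
    if params:  # nothing to send for an empty input
        cursor.executemany(sql, params)
    return len(params)


class RecordingCursor:
    """Stand-in for a pyodbc cursor (assumption: only records the calls)."""

    def __init__(self):
        self.fast_executemany = False
        self.calls = []

    def executemany(self, sql, params):
        self.calls.append((sql, params))


cursor = RecordingCursor()
rows = [[1, 101], [2, 102], [3, 103]]  # shaped like df.values.tolist()
sent = insert_rows(cursor, "INSERT INTO my_table ([x], [Y]) VALUES (?, ?)", rows)
print(sent, len(cursor.calls))
```

With a DataFrame, the rows argument could be df.values.tolist(), which is the conversion the question already uses.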

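Here is an example of how you might accomplish what I think you are trying to do. math.ceil and the conditional expression in end_idx = ... account for the last batch, which may be smaller than the others. In the sample output below there are 10 rows and a batch size of 3, which gives 4 batches, the last of which contains only 1 tuple.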

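import math

df = ProductInventory
batch_size = 500
num_batches = math.ceil(len(df) / batch_size)

cursor = sql_con.cursor()
cursor.fast_executemany = True  # must be set before executemany is called

for i in range(num_batches):
    start_idx = i * batch_size
    # The last batch may hold fewer than batch_size rows
    end_idx = len(df) if i + 1 == num_batches else start_idx + batch_size
    tuple_of_tuples = tuple(tuple(x) for x in df.iloc[start_idx:end_idx, :].values.tolist())
    cursor.executemany("INSERT INTO table ([x],[Y]) values (?,?)", tuple_of_tuples)

sql_con.commit()  # needed unless the connection was opened with autocommit=True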

Sample output:

=== Executing: ===
df = pd.DataFrame({'a': range(1,11), 'b': range(101,111)})

batch_size = 3
num_batches = math.ceil(len(df)/batch_size)

for i in range(num_batches):
    start_idx = i * batch_size
    end_idx = len(df) if i + 1 == num_batches else start_idx + batch_size
    tuple_of_tuples = tuple(tuple(x) for x in df.iloc[start_idx:end_idx, :].values.tolist())
    print(tuple_of_tuples)

=== Output: ===
((1, 101), (2, 102), (3, 103))
((4, 104), (5, 105), (6, 106))
((7, 107), (8, 108), (9, 109))
((10, 110),)
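As an alternative to the math.ceil bookkeeping, the same batches can be produced by stepping a range by the batch size. This is a sketch of a swapped-in technique, not the code from the answer above; iter_batches is a hypothetical helper name, and plain lists stand in for df.values.tolist() so it runs without pandas.

```python
def iter_batches(rows, batch_size):
    """Yield consecutive batches of rows as tuples of tuples."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    # Slicing past the end of a list is safe, so the final (possibly
    # shorter) batch needs no special handling.
    for start in range(0, len(rows), batch_size):
        yield tuple(tuple(row) for row in rows[start:start + batch_size])


# Same data as the sample above: 10 rows, batch size 3.
rows = [[a, b] for a, b in zip(range(1, 11), range(101, 111))]
batches = list(iter_batches(rows, 3))
for batch in batches:
    print(batch)
```

Each yielded batch could be handed to cursor.executemany in place of tuple_of_tuples.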
