Poor performance when fetching data from Redshift with psycopg2

Posted 2024-09-28 17:21:44


I am trying to retrieve data from Redshift into Python using psycopg2. For some reason, loading a 35 GB database onto the Python server takes a very long time (40 minutes).

import pandas as pd
import psycopg2

# db_connection_info is a placeholder for the real connection string
con = psycopg2.connect(db_connection_info)

# Passing a name to cursor() creates a server-side (named) cursor
query_cursor = con.cursor('query_cursor')
query_cursor.execute('my_query')

stop = 0
batch = 0

print('Starting to retrieve data')

while stop == 0:
    tmp = query_cursor.fetchmany(10000)

    if len(tmp) < 1:
        stop = 1
    else:
        if batch % 100 == 0:
            print(str(batch*10000) + ' rows loaded')
        if batch == 0:
            data = tmp
        else:
            data = data + tmp
    batch = batch + 1

print('Transferring data to dataframe')

df = pd.DataFrame.from_records(data, columns=manually_selected_features, coerce_float=True)

I did not use pd.read_sql because, for RAM reasons, I need a server-side cursor.
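For reference, the batching pattern used with the server-side cursor can be sketched as a small generator. This is only a sketch under assumptions: `iter_batches` and `load_all` are hypothetical helper names, they work on any DB-API cursor that exposes `fetchmany`, and `load_all` uses `list.extend` instead of `data = data + tmp`, since the latter builds a new list on every batch.

```python
def iter_batches(cursor, batch_size=10000):
    """Yield lists of rows from a DB-API cursor until it is exhausted."""
    while True:
        rows = cursor.fetchmany(batch_size)
        if not rows:
            # An empty result means the cursor has no more rows
            break
        yield rows


def load_all(cursor, batch_size=10000):
    """Collect every row into one list, growing it in place per batch."""
    data = []
    for rows in iter_batches(cursor, batch_size):
        data.extend(rows)  # appends in place, no copy of the existing rows
    return data
```

With the code above, this would be called as `data = load_all(query_cursor)` after `query_cursor.execute(...)`. Whether it changes the total runtime noticeably depends on where the time is actually spent.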

I don't understand why the first fetchmany iteration takes so much longer than the others.
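To see how much slower the first iteration really is, each `fetchmany` call can be timed on its own. The following is a minimal sketch, assuming any DB-API cursor; `time_fetches` is a hypothetical helper name. One possible cause, stated here as an assumption rather than something measured: with a server-side cursor the query may only start running on the server when the first batch is requested, so the first call would include the query's execution time.

```python
import time


def time_fetches(cursor, batch_size=10000, max_batches=5):
    """Return (row_count, seconds) for each of the first few fetchmany calls."""
    timings = []
    for _ in range(max_batches):
        start = time.perf_counter()
        rows = cursor.fetchmany(batch_size)
        elapsed = time.perf_counter() - start
        if not rows:
            # Cursor exhausted before max_batches was reached
            break
        timings.append((len(rows), elapsed))
    return timings
```

If the first entry dominates and the later ones are all similar, the delay sits in producing the result set rather than in transferring the individual batches.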

Is there a good way to speed up my query?

Best regards, Xavier


