Concatenating large duplicated DataFrames: MemoryE

Posted 2024-05-19 08:11:32


Follow-up to: How can I reference the key in the Pandas dataframes within that dictionary?

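The goal is still to forecast revenue by fiscal year, and I will be splitting the revenue for each year out into a new column. I have some code (written with some help) that builds several DataFrames, stored in a dictionary, which are duplicates of one another except for the Fiscal Year column. These DataFrames are then concatenated into a single DataFrame.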

I have simplified my code to:

import pandas as pd
columns = ['ID','Revenue','Fiscal Year']
ID = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
Revenue = [1000, 1200, 1300, 100 ,500, 0, 800, 950, 4321, 800]
FY = []
d = {'ID': ID, 'Revenue': Revenue}
df = pd.DataFrame(d)
df['Fiscal Year'] = ''

def df_dict_func(start, end, dataframe):
    # Build one full copy of the DataFrame per fiscal year, keyed by year.
    date_range = range(start, end + 1)
    dataframe_dict = {}
    for n in date_range:
        sub = dataframe.copy()  # deep copy of the whole frame
        sub['Fiscal Year'] = n
        dataframe_dict[n] = sub
    return dataframe_dict

df_dict = df_dict_func(2019, 2035, df)
df = pd.concat(df_dict)
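
For context on how this pattern uses memory: the dictionary holds one deep copy of the frame per fiscal year, and `pd.concat` then allocates the combined result while the dictionary is still alive. A rough way to size this up before running it is sketched below. The helper name `estimate_expansion_bytes` is hypothetical, and the factor of two is only an approximation (it ignores index overhead and the added Fiscal Year column).

```python
import pandas as pd

def estimate_expansion_bytes(dataframe, start, end):
    """Rough estimate for the dict-of-copies + concat pattern (hypothetical helper)."""
    n_years = end - start + 1
    # deep=True also counts the contents of object (string) columns
    one_copy = int(dataframe.memory_usage(deep=True).sum())
    dict_total = one_copy * n_years   # all copies held in the dictionary
    approx_peak = dict_total * 2      # dictionary plus the concatenated result
    return {'years': n_years, 'one_copy': one_copy,
            'dict_total': dict_total, 'approx_peak': approx_peak}

small = pd.DataFrame({'ID': [1, 2, 3], 'Revenue': [1000, 1200, 1300]})
estimate = estimate_expansion_bytes(small, 2019, 2035)
print(estimate)
```

Comparing `approx_peak` against the available RAM gives a quick indication of whether the concatenation is likely to fit.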

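This code works well for smaller datasets, but when I scale it up to a larger dataset I get a MemoryError. Is there a more efficient way to reproduce the result of this code while avoiding the memory error?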

The error I get is an explicit "MemoryError", and it occurs at the pd.concat command, before I receive any results. Each DataFrame in the dictionary is large (over 500 MB).
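
One possible direction, offered as a sketch rather than a confirmed fix: build the repeated frame directly instead of keeping 17 copies (2019 through 2035) in a dictionary and then concatenating them. The helper name `expand_by_fiscal_year` is hypothetical. It assumes a plain `RangeIndex` on the result is acceptable, whereas `pd.concat` on a dictionary yields a MultiIndex keyed by year, and it stores the year as `int16` to keep that column small.

```python
import numpy as np
import pandas as pd

def expand_by_fiscal_year(dataframe, start, end):
    """Repeat every row once per fiscal year without a dict of copies (hypothetical helper)."""
    years = np.arange(start, end + 1, dtype=np.int16)  # int16 is enough for year values
    n_rows = len(dataframe)
    # Drop the placeholder column so it is not duplicated once per year.
    base = dataframe.drop(columns=['Fiscal Year'], errors='ignore')
    # Row positions 0..n-1 repeated once per year: [0..n-1, 0..n-1, ...]
    positions = np.tile(np.arange(n_rows), len(years))
    out = base.take(positions)
    out.index = pd.RangeIndex(len(out))
    # Year blocks in the same order as the positions: [2019]*n, [2020]*n, ...
    out['Fiscal Year'] = np.repeat(years, n_rows)
    return out

df = pd.DataFrame({
    'ID': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    'Revenue': [1000, 1200, 1300, 100, 500, 0, 800, 950, 4321, 800],
})
df['Fiscal Year'] = ''

result = expand_by_fiscal_year(df, 2019, 2035)
print(result.shape)  # (170, 3): 10 rows x 17 years
```

The result is still 17 times the size of the input, so at over 500 MB per copy it needs more than roughly 8.5 GB by itself. If that does not fit in memory, processing or writing out one fiscal year at a time may be the only option.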

