Python: using pandas and a generator to process rows in a CSV file

Posted 2024-09-28 05:22:30


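I hope this is an acceptable SO question. I would like advice on how to convert the code below, which processes the lines of a file to produce a DataFrame, into a version that uses a generator and yield, because the current implementation, which builds a list with append, is too slow.

This is the solution I came up with, but I would really like to avoid the very slow list and append operations. I was hoping for a neat generator-and-yield solution, but I am not yet comfortable working with generators.

Sample lines from the file: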

"USNC3255","27","US","NC","LANDS   END","72305006","KNJM","KNCA","KNKT","T72305006","","","NCC031","NCZ095","","545","28594","America/New_York","34.65266","-77.07661","7","RDU","893727","
"USNC3256","27","US","NC","LANDSDOWN","72314058","KEHO","KAKH","KIPJ","T72314058","","","NCC045","NCZ068","sc007","517","28150","America/New_York","35.29374","-81.46537","797","CLT","317845","

Current solution:

^{pr2}$

The output is simply a DataFrame with 23 columns and two rows when the code is run on the sample lines above.


1 answer
User
Answer 1 · Posted 2024-09-28 05:22:30

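The only problem with the file is that every line ends with ,". The stray opening quote confuses the parser. If you can strip the trailing comma and quote, you can use the regular parser.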

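import pandas as pd
from io import StringIO  # on Python 2 this was: from StringIO import StringIO

# Read the whole file and drop the trailing ," from the end of each line
with open('example.txt') as myfile:
    data = myfile.read().replace(',"\n', '\n')
# With the stray quote gone, the regular CSV parser can read the data
pd.read_csv(StringIO(data), header=None)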

This is what I get:

^{pr2}$
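
Since the question asks specifically about generators, the same cleanup can also be done one line at a time instead of reading the whole file into a single string. The following is only a minimal sketch under stated assumptions, not the asker's original code: the names clean_lines and read_frame are hypothetical, and it assumes every data line ends with ,". It uses csv.reader, which accepts any iterable of strings, so all values stay as strings (no type inference as with pd.read_csv).

```python
import csv
import io

import pandas as pd


def clean_lines(lines):
    """Yield each non-empty line with the trailing ,\" removed (hypothetical helper)."""
    for line in lines:
        line = line.rstrip("\r\n")
        if line.endswith(',"'):
            line = line[:-2]  # drop the stray comma and opening quote
        if line:
            yield line


def read_frame(lines):
    """Build a DataFrame from an iterable of raw lines (hypothetical helper)."""
    # csv.reader pulls cleaned lines from the generator lazily, one at a time
    rows = csv.reader(clean_lines(lines))
    # The rows are collected once here; there is no repeated append in user code
    return pd.DataFrame(list(rows))


# Shortened sample data standing in for the real file; a file object opened
# with open('example.txt') could be passed to read_frame in the same way.
sample = io.StringIO(
    '"USNC3255","27","LANDS   END","34.65266","\n'
    '"USNC3256","27","LANDSDOWN","35.29374","\n'
)
df = read_frame(sample)
print(df.shape)
```

The design choice here is that the generator only handles the line cleanup, while the parsing is left to the csv module; numeric columns would still need converting afterwards, for example with pd.to_numeric.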
