Trouble reading a DataFrame

[8:3:1978] LOG [Sale:internals.py:makeSaleEntry:0] Entered with productid= 2327, storeid= 146, No.OfUnits= 1
[19:1:2007] LOG [Sale:internals.py:makeSaleEntry:1] Entered with productid= 1908, storeid= 202, No.OfUnits= 11
[22:4:2001] LOG [Sale:internals.py:makeSaleEntry:2] Entered with productid= 3072, storeid= 185, No.OfUnits= 16
[22:12:1915] LOG [Sale:internals.py:makeSaleEntry:3] Entered with productid= 1355, storeid= 177, No.OfUnits= 1
[19:8:1963] LOG [Sale:internals.py:makeSaleEntry:4] Entered with productid= 2235, storeid= 35, No.OfUnits= 16
[16:11:1997] LOG [Sale:internals.py:makeSaleEntry:5] Entered with productid= 1439, storeid= 141, No.OfUnits= 26

for i, row in df.iterrows():
    strr = ""
    for j, column in row.iteritems():
        seq = column.split('= ')
        strr = strr + seq[1] + ","
    file = open("a.csv", "a")
    file.write(strr[:-1]+"\n")
    file.close()

3 answers

User

#1 · edited 2024-10-01 00:15:10

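Your code skips the first line because read_csv assumes by default that it is the header. You can make your original code work by adding header=None, as suggested in the other answer. You may also want to consider a more readable version that uses regular expressions to extract the values: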

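import pandas as pd

# header=None keeps the first log line as data instead of column names
df = pd.read_csv('a.txt', header=None)
# read_csv splits each line on commas, so columns 0, 1 and 2 each hold one "name= value" pair
df['productid'] = df[0].str.findall(r'productid= ([0-9]+)').apply(lambda l: l[0])
df['storeid'] = df[1].str.findall(r'storeid= ([0-9]+)').apply(lambda l: l[0])
df['No.OfUnits'] = df[2].str.findall(r'No\.OfUnits= ([0-9]+)').apply(lambda l: l[0])
df1 = df.loc[:, ['productid', 'storeid', 'No.OfUnits']]
# mode='a' appends to a.csv rather than overwriting it
df1.to_csv('a.csv', header=False, index=False, mode='a')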
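If you would rather pull all three values out in one pass, something like the sketch below might work. It is only an illustration: it assumes every line has the exact "productid= ..., storeid= ..., No.OfUnits= ..." layout shown in the question, it uses two of the question's lines as in-memory sample data instead of a.txt, and the column name `units` is my own choice.

```python
import io

import pandas as pd

# Two sample lines copied from the question; in practice these would come from a.txt.
sample = io.StringIO(
    "[8:3:1978] LOG [Sale:internals.py:makeSaleEntry:0] Entered with productid= 2327, storeid= 146, No.OfUnits= 1\n"
    "[19:1:2007] LOG [Sale:internals.py:makeSaleEntry:1] Entered with productid= 1908, storeid= 202, No.OfUnits= 11\n"
)

# Keep each whole line as one string so the commas are not used as separators.
lines = pd.Series(sample.read().splitlines())

# Named groups become the column names of the resulting DataFrame.
pattern = (
    r"productid= (?P<productid>\d+), "
    r"storeid= (?P<storeid>\d+), "
    r"No\.OfUnits= (?P<units>\d+)"
)
fields = lines.str.extract(pattern)

# to_csv() without a path returns the CSV text; pass a filename to write a file instead.
print(fields.to_csv(header=False, index=False))
```

The extracted values are strings; lines that do not match the pattern come back as NaN rows rather than raising an error.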

By the way, pandas isn't really necessary here. This also works:

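import re

# Each match is a (productid, storeid, units) tuple; [0] takes the single match on each line
pattern = r'productid= ([0-9]+), storeid= ([0-9]+), No\.OfUnits= ([0-9]+)'
with open('a.txt') as f:
    values = [re.findall(pattern, line)[0] for line in f]
with open('a.csv', 'a') as f:
    for v in values:
        f.write(','.join(v) + '\n')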
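One caveat with the plain-Python version: indexing the findall result with [0] raises IndexError on any line that does not match, such as a blank line. A possible variation that skips such lines is sketched below. The helper name `extract_rows` and the in-memory sample are my own, made up for illustration, and the standard csv module is swapped in for the manual join.

```python
import csv
import io
import re

PATTERN = re.compile(r"productid= (\d+), storeid= (\d+), No\.OfUnits= (\d+)")


def extract_rows(lines):
    """Yield (productid, storeid, units) for every line that matches PATTERN."""
    for line in lines:
        match = PATTERN.search(line)
        if match:  # blank or unrelated lines are skipped instead of raising
            yield match.groups()


# Sample input: two lines from the question with a blank line in between.
sample = [
    "[8:3:1978] LOG [Sale:internals.py:makeSaleEntry:0] Entered with productid= 2327, storeid= 146, No.OfUnits= 1\n",
    "\n",
    "[19:1:2007] LOG [Sale:internals.py:makeSaleEntry:1] Entered with productid= 1908, storeid= 202, No.OfUnits= 11\n",
]

# An in-memory buffer stands in for open('a.csv', 'a', newline='').
out = io.StringIO()
csv.writer(out).writerows(extract_rows(sample))
print(out.getvalue(), end="")
```

With a real file you would pass the open file object for a.txt to `extract_rows` and hand the opened a.csv to `csv.writer`.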

User

#2 · edited 2024-10-01 00:15:10

Add the header=None argument:

df = pd.read_csv('a.txt', header=None)
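To see the difference for yourself, here is a small sketch that parses the same two lines both ways. It uses an in-memory string in place of a.txt, and the variable names are only for illustration.

```python
import io

import pandas as pd

# Two lines from the question, held in memory instead of a.txt.
data = (
    "[8:3:1978] LOG [Sale:internals.py:makeSaleEntry:0] Entered with productid= 2327, storeid= 146, No.OfUnits= 1\n"
    "[19:1:2007] LOG [Sale:internals.py:makeSaleEntry:1] Entered with productid= 1908, storeid= 202, No.OfUnits= 11\n"
)

# Default behaviour: the first line is consumed as the column names.
with_default = pd.read_csv(io.StringIO(data))

# header=None: every line is data and the columns are numbered 0, 1, 2.
with_header_none = pd.read_csv(io.StringIO(data), header=None)

print(len(with_default))      # 1 row: the first log line became the header
print(len(with_header_none))  # 2 rows: both log lines are kept
```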

User

#3 · edited 2024-10-01 00:15:10

Solved it by reading the file into a tuple of lines:

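# Read every line up front; iterating over the file yields one string per line
with open('a.txt', 'r') as src:
    lines = tuple(src)
# Open the output once instead of reopening it for every line
with open('a.csv', 'a') as out:
    for line in lines:
        strr = line.split()
        # Tokens -5 and -3 keep their trailing commas ("2327,", "146,"), so plain concatenation gives a CSV row
        out.write(strr[-5] + strr[-3] + strr[-1] + "\n")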
