How do I append a list to a DataFrame?

Posted 2024-06-24 12:42:59


I am trying to read an ASCII file line by line into a DataFrame.

I wrote the following script:

import pandas as pd

col_labels = ['Sg', 'Krg', 'Krw', 'Pc']

df = pd.DataFrame(columns=col_labels)

f = open('EPS.INC', 'r')
for line in f:
    if 'SGWFN' in line:
        print('Reading relative permeability table')
        for line in f:
            line = line.strip()
            if (line.split() and not line.startswith('/') and not line.startswith('--')):
                cols = line.split()
                print(repr(cols))
                df=df.append(cols)

print('Resulting Dataframe')
print(df)

The file I am parsing looks like this:

SGWFN            

--Facies 1 Drainage SATNUM 1            
--Sg    Krg    Krw    J
0.000000    0.000000    1.000000    0.000000
0.030000    0.000000    0.500000    0.091233
0.040000    0.000518    0.484212    0.093203
0.050000    0.001624    0.468759    0.095237
/

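I expected the four values from each line to be appended as one DataFrame row. Instead, they end up in a new column, one value per row, like this: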

Resulting Dataframe
      Sg  Krg  Krw   Pc           0
0    NaN  NaN  NaN  NaN    0.000000
1    NaN  NaN  NaN  NaN    0.000000
2    NaN  NaN  NaN  NaN    1.000000
3    NaN  NaN  NaN  NaN    0.000000
4    NaN  NaN  NaN  NaN    0.030000
5    NaN  NaN  NaN  NaN    0.000000
6    NaN  NaN  NaN  NaN    0.500000

Can anyone explain what I am doing wrong?

Thanks! D


1 answer
User
#1 · Posted 2024-06-24 12:42:59

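I suggest creating an empty list L, appending each row of values to it inside the loop, and calling the DataFrame constructor once at the end: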

import pandas as pd

L = []
# a with block ensures the file is closed properly
with open("EPS.INC") as f:
    for line in f:
        if 'SGWFN' in line:
            print('Reading relative permeability table')
            for line in f:
                line = line.strip()
                # skip blank lines, the '/' terminator, and '--' comment lines
                if (line.split() and not line.startswith('/') and not line.startswith('--')):
                    cols = line.split()
                    print(repr(cols))
                    L.append(cols)

print('Resulting Dataframe')
col_labels = ['Sg', 'Krg', 'Krw', 'Pc']

# build the DataFrame once, from the list of rows
df = pd.DataFrame(L, columns=col_labels)
print(df)

         Sg       Krg       Krw        Pc
0  0.000000  0.000000  1.000000  0.000000
1  0.030000  0.000000  0.500000  0.091233
2  0.040000  0.000518  0.484212  0.093203
3  0.050000  0.001624  0.468759  0.095237
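
As a self-contained sketch of the list-accumulating approach, the snippet below parses the sample table from an in-memory string instead of `EPS.INC` and converts the values to floats. The helper name `read_sgwfn_table`, the use of `io.StringIO`, the float conversion, and stopping at the first `/` are assumptions made for illustration; they are not part of the answer above.

```python
import io

import pandas as pd

# Sample data copied from the question; it stands in for the real EPS.INC file.
SAMPLE = """SGWFN

--Facies 1 Drainage SATNUM 1
--Sg    Krg    Krw    J
0.000000    0.000000    1.000000    0.000000
0.030000    0.000000    0.500000    0.091233
0.040000    0.000518    0.484212    0.093203
0.050000    0.001624    0.468759    0.095237
/
"""

COL_LABELS = ['Sg', 'Krg', 'Krw', 'Pc']


def read_sgwfn_table(handle):
    """Hypothetical helper: collect rows after the SGWFN keyword, up to the first '/'."""
    rows = []
    for line in handle:
        if 'SGWFN' in line:
            # Keep consuming the same file iterator from the keyword onwards.
            for line in handle:
                line = line.strip()
                if line.startswith('/'):
                    break  # assumed end-of-table marker
                if line and not line.startswith('--'):
                    rows.append([float(value) for value in line.split()])
            break
    # One constructor call for all rows.
    return pd.DataFrame(rows, columns=COL_LABELS)


df = read_sgwfn_table(io.StringIO(SAMPLE))
print(df)
```

With a real file, the same function could be called as `read_sgwfn_table(f)` inside a `with open("EPS.INC") as f:` block.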

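Your own solution can be fixed by appending a Series with its index set to the column labels. Note that DataFrame.append was deprecated in pandas 1.4 and removed in pandas 2.0, so this variant only runs on older versions: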

import pandas as pd

col_labels = ['Sg', 'Krg', 'Krw', 'Pc']

df = pd.DataFrame()
with open('EPS.INC', 'r') as f:
    for line in f:
        if 'SGWFN' in line:
            print('Reading relative permeability table')
            for line in f:
                line = line.strip()
                # skip blank lines, the '/' terminator, and '--' comment lines
                if (line.split() and not line.startswith('/') and not line.startswith('--')):
                    cols = line.split()
                    print(repr(cols))
                    # requires pandas < 2.0; the Series index maps each value to a column
                    df = df.append(pd.Series(cols, index=col_labels), ignore_index=True)

print('Resulting Dataframe')
print(df)

        Krg       Krw        Pc        Sg
0  0.000000  1.000000  0.000000  0.000000
1  0.000000  0.500000  0.091233  0.030000
2  0.000518  0.484212  0.093203  0.040000
3  0.001624  0.468759  0.095237  0.050000
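
For context on the output in the question: as far as I can tell, older pandas versions turned a flat list passed to `append` into a one-column DataFrame, so each value landed in its own row under a new column `0`. The sketch below reproduces that shape with the constructor and then shows a row-wise alternative using `pd.concat`, which still exists in pandas 2.x. The variable names `first_line`, `second_line`, `as_column`, and `as_row` are made up for illustration.

```python
import pandas as pd

col_labels = ['Sg', 'Krg', 'Krw', 'Pc']

# Two parsed lines, as produced by line.split() in the question.
first_line = ['0.000000', '0.000000', '1.000000', '0.000000']
second_line = ['0.030000', '0.000000', '0.500000', '0.091233']

# A flat list becomes a single column named 0 with one value per row,
# which matches the shape of the unexpected output in the question.
as_column = pd.DataFrame(first_line)
print(as_column.shape)

# Wrapping the list in another list makes it one row with four columns.
as_row = pd.DataFrame([first_line], columns=col_labels)
print(as_row.shape)

# Row-wise accumulation without DataFrame.append.
df = pd.concat(
    [as_row, pd.DataFrame([second_line], columns=col_labels)],
    ignore_index=True,
)
print(df)
```

Calling `pd.concat` once per line copies the accumulated frame each time, so for larger tables collecting the rows in a list and constructing the DataFrame once is likely the cheaper option.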
