动态拆分数据帧

Lags Rep 1 Rep 2 Rep 3 12.500000000E-9 7671.039418 6605.763724 10144.873125 25.000000000E-9 -1.000000 -0.479659 1.454251 37.500000000E-9 31.978402 23.456005 29.678136 50.000000000E-9 5.315013 4.723746 0.227125 62.500000000E-9 1.806673 2.642384 2.681376 75.000000000E-9 NaN NaN NaN 83.500000000E-9 NaN NaN NaN Time PhtA count 1 PhtA count 2 PhtA count 3 0.000000000E+0 42.743683 10.890961 12.454987 2.428800000E-3 14.533997 8.125305 7.534027 4.857600000E-3 8.621216 7.686615 7.133484 7.286400000E-3 5.779266 10.147095 6.561279 9.715200000E-3 6.046295 8.201599 5.187988 12.144000000E-3 5.226135 7.343292 5.855560 Time PhtB count 1 PhtB count 2 PhtB count 3 0.860800000E-3 12.626648 13.580322 8.220673 1.289600000E-3 10.814667 21.381378 7.038116 2.718400000E-3 7.915497 17.261505 7.648468 3.147200000E-3 9.403229 21.266937 10.013580

Lags Rep 1 Rep 2 Rep 3 12.500000000E-9 7671.039418 6605.763724 10144.873125 25.000000000E-9 -1.000000 -0.479659 1.454251 37.500000000E-9 31.978402 23.456005 29.678136 50.000000000E-9 5.315013 4.723746 0.227125 62.500000000E-9 1.806673 2.642384 2.681376

Time PhtA count 1 PhtA count 2 PhtA count 3 0.000000000E+0 42.743683 10.890961 12.454987 2.428800000E-3 14.533997 8.125305 7.534027 4.857600000E-3 8.621216 7.686615 7.133484 7.286400000E-3 5.779266 10.147095 6.561279 9.715200000E-3 6.046295 8.201599 5.187988 12.144000000E-3 5.226135 7.343292 5.855560

Time PhtB count 1 PhtB count 2 PhtB count 3 0.860800000E-3 12.626648 13.580322 8.220673 1.289600000E-3 10.814667 21.381378 7.038116 2.718400000E-3 7.915497 17.261505 7.648468 3.147200000E-3 9.403229 21.266937 10.013580

1条回答

网友

1楼 · 发布于 2024-09-28 17:01:21

首先将所有数据读入保留空行的df，然后在这些空行处拆分并转换为数字：

df = pd.read_csv('data.csv', sep='\s{2,}', skip_blank_lines=False, engine='python')
x = df[df.Lags.isnull()==True].index.values

df1 = df[0:x[0]].dropna().apply(pd.to_numeric)

df2 = df[x[0]+2:x[1]].apply(pd.to_numeric)
df2.columns=df.iloc[x[0]+1].values

df3 = df[x[1]+2:].apply(pd.to_numeric)
df3.columns = df.iloc[x[1]+1].values

print(df1);print(df2); print(df3)的输出：

           Lags        Rep 1        Rep 2         Rep 3
0  1.250000e-08  7671.039418  6605.763724  10144.873125
1  2.500000e-08    -1.000000    -0.479659      1.454251
2  3.750000e-08    31.978402    23.456005     29.678136
3  5.000000e-08     5.315013     4.723746      0.227125
4  6.250000e-08     1.806673     2.642384      2.681376
        Time  PhtA count 1  PhtA count 2  PhtA count 3
9   0.000000     42.743683     10.890961     12.454987
10  0.002429     14.533997      8.125305      7.534027
11  0.004858      8.621216      7.686615      7.133484
12  0.007286      5.779266     10.147095      6.561279
13  0.009715      6.046295      8.201599      5.187988
14  0.012144      5.226135      7.343292      5.855560
        Time  PhtB count 1  PhtB count 2  PhtB count 3
17  0.000861     12.626648     13.580322      8.220673
18  0.001290     10.814667     21.381378      7.038116
19  0.002718      7.915497     17.261505      7.648468
20  0.003147      9.403229     21.266937     10.013580

好处：csv中任意数量的数据块的通用解决方案，以空行分隔（它们的数量不需要事先知道）：

df = pd.read_csv('data.csv', sep='\s{2,}', skip_blank_lines=False, engine='python', header=None)
x = [-1] + list(df[df.iloc[:,0].isnull()==True].index.values) + [len(df)]
for i in range(1,len(x)):
     globals()[f'df{i}'] = df[x[i-1]+2:x[i]].dropna().apply(pd.to_numeric)
     globals()[f'df{i}'].columns = df.iloc[x[i-1]+1].values

相关问题更多 >

编程相关推荐

热门问题

热门文章