带Python Pandas字符串和浮动的Read.txt文件

2024-09-29 21:28:15 发布

您现在位置:Python中文网/ 问答频道 /正文

我想用Pandas阅读Python(3.6.0)中的.txt文件。.txt文件的第一行如下所示:

要读取的文本文件

Location:           XXX
Campaign Name:      XXX
Date of log start:  2016_10_09
Time of log start:  04:27:28
Sampling Frequency: 1Hz
Config file:        XXX
Logger Serial:      XXX

CH Mapping;;XXXC1;XXXC2;XXXC3;XXXC4
CH Offsets in ms;;X;X,X;X;X,X
CH Units;;mA;mA;mA;mA
Time;msec;Channel1;Channel2;Channel3;Channel4
04:30:00;000; 0.01526;10.67903;10.58366; 0.00000
04:30:01;000; 0.17090;10.68666;10.58518; 0.00000
04:30:02;000; 0.25177;10.68284;10.58442; 0.00000

我使用下面的简单代码行:

Python代码

^{pr2}$

然后在终端得到以下输出:

终端输出

    Time msec Channel1 Channel2 Channel3 Channel4
0    NaN  NaN      NaN      NaN      NaN      NaN
1    NaN  NaN      NaN      NaN      NaN      NaN
2    NaN  NaN      NaN      NaN      NaN      NaN
..   ...  ...      ...      ...      ...      ...
599  NaN  NaN      NaN      NaN      NaN      NaN

我的第一反应是熊猫不喜欢前两个专栏。你有什么建议可以让Pandas在不改变文件本身的情况下读取.txt文件吗。在

提前谢谢你。在


Tags: 文件oftxtlogpandastimenanch
1条回答
网友
1楼 · 发布于 2024-09-29 21:28:15

您希望将skiprows=11skipinitial_space=Truesep=';'一起传递给read_csv,因为分隔符中有空格:

In [83]:
import io
import pandas as pd
t="""Location:           XXX
Campaign Name:      XXX
Date of log start:  2016_10_09
Time of log start:  04:27:28
Sampling Frequency: 1Hz
Config file:        XXX
Logger Serial:      XXX
​
CH Mapping;;XXXC1;XXXC2;XXXC3;XXXC4
CH Offsets in ms;;X;X,X;X;X,X
CH Units;;mA;mA;mA;mA
Time;msec;Channel1;Channel2;Channel3;Channel4
04:30:00;000; 0.01526;10.67903;10.58366; 0.00000
04:30:01;000; 0.17090;10.68666;10.58518; 0.00000
04:30:02;000; 0.25177;10.68284;10.58442; 0.00000"""
​
df = pd.read_csv(io.StringIO(t), skiprows=11, sep=';', skipinitialspace=True)
df

Out[83]:
       Time  msec  Channel1  Channel2  Channel3  Channel4
0  04:30:00     0   0.01526  10.67903  10.58366       0.0
1  04:30:01     0   0.17090  10.68666  10.58518       0.0
2  04:30:02     0   0.25177  10.68284  10.58442       0.0

您可以看到数据类型现在是正确的:

^{pr2}$

您还可以选择将时间解析为日期时间:

In [86]:    
df = pd.read_csv(io.StringIO(t), skiprows=11, sep=';', skipinitialspace=True, parse_dates=['Time'])
df

Out[86]:
                 Time  msec  Channel1  Channel2  Channel3  Channel4
0 2017-03-16 04:30:00     0   0.01526  10.67903  10.58366       0.0
1 2017-03-16 04:30:01     0   0.17090  10.68666  10.58518       0.0
2 2017-03-16 04:30:02     0   0.25177  10.68284  10.58442       0.0

In [87]:    
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3 entries, 0 to 2
Data columns (total 6 columns):
Time        3 non-null datetime64[ns]
msec        3 non-null int64
Channel1    3 non-null float64
Channel2    3 non-null float64
Channel3    3 non-null float64
Channel4    3 non-null float64
dtypes: datetime64[ns](1), float64(4), int64(1)
memory usage: 224.0 bytes

相关问题 更多 >

    热门问题