How to clean up extra header information in the middle of a CSV using Pandas

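2 Answers

User

Answer 1 · edited 2024-05-17 06:58:35

Try splitting the text file after the first block of data. You can then build two DataFrames from the parts and concatenate them.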

with open("yourfile.txt", 'r') as f:
    content = f.read()

# Make a list of subcontent
splitContent = content.split('Results Generated Date Time\nSampling Info\n')

Using 'Results Generated Date Time\nSampling Info\n' as the split argument also removes those lines. This only works if the unwanted header lines are always identical.

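After this you have a list of strings (variable: splitContent), each holding one block of data with fields separated by the delimiter (";"). Use this answer to create a DataFrame from each string: https://stackoverflow.com/a/22605281/11005812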
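
A minimal sketch of that step, assuming each block has no header row of its own and holds two ";"-separated columns. The sample text and the column names Time and Data are assumptions for illustration (the names are borrowed from the second answer), not something taken from the asker's file:

```python
import io

import pandas as pd

# Hypothetical sample: two data blocks separated by the repeated header lines.
content = (
    "0;1.5\n"
    "1;2.5\n"
    "Results Generated Date Time\n"
    "Sampling Info\n"
    "2;3.5\n"
    "3;4.5\n"
)

# Split on the exact header text, which also removes it.
split_content = content.split("Results Generated Date Time\nSampling Info\n")

# Parse each non-empty chunk as its own small CSV.
frames = [
    pd.read_csv(io.StringIO(chunk), sep=";", header=None, names=["Time", "Data"])
    for chunk in split_content
    if chunk.strip()
]
```

io.StringIO wraps a string in a file-like object, so pd.read_csv can parse it without anything being written to disk.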

Another option is to save each sub-part to its own file and read it back in.
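
Roughly, the file-based variant could look like this. The chunk contents, the part_N.csv file names, and the Time/Data column names are all made up for the example, and a temporary directory is used only to keep the sketch self-contained:

```python
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical chunks, as they might look after splitting on the header text.
chunks = ["0;1.5\n1;2.5\n", "2;3.5\n3;4.5\n"]

with tempfile.TemporaryDirectory() as tmp:
    paths = []
    for i, chunk in enumerate(chunks):
        path = Path(tmp) / f"part_{i}.csv"  # hypothetical file name
        path.write_text(chunk)
        paths.append(path)

    # Read every part back in as its own DataFrame.
    parts = [
        pd.read_csv(p, sep=";", header=None, names=["Time", "Data"])
        for p in paths
    ]
```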

Concatenating DataFrames: https://pandas.pydata.org/pandas-docs/stable/user_guide/merging.html
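
For the concatenation itself, pd.concat is probably all that is needed. A small sketch with two hand-made frames standing in for the parsed blocks (the values are invented):

```python
import pandas as pd

# Stand-ins for the DataFrames built from the two data blocks.
first = pd.DataFrame({"Time": [0, 1], "Data": [1.5, 2.5]})
second = pd.DataFrame({"Time": [2, 3], "Data": [3.5, 4.5]})

# ignore_index=True renumbers the rows 0..n-1 instead of repeating 0, 1.
combined = pd.concat([first, second], ignore_index=True)
```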

User

Answer 2 · edited 2024-05-17 06:58:35

import pandas as pd

filename = 'filename.csv'

# Read the csv file and split it into lines
with open(filename) as f:
    lines = f.read().split('\n')

# Drop empty lines
list_ = [e for e in lines if e != '']

# Keep only lines that start with a digit; this discards the header lines
# (and would also discard rows starting with a sign such as '-')
list_ = [e for e in list_ if e[0].isdigit()]

# Use int or float depending on the requirements
Time = [float(i.split(';')[0]) for i in list_]
Data = [float(i.split(';')[1].strip()) for i in list_]

# Build the DataFrame
df = pd.DataFrame({'Time': Time, 'Data': Data})
df
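
A possible variation on the same idea, not part of the original answer: let pandas read everything, then coerce non-numeric cells to NaN and drop those rows. This is only a sketch on invented sample text, and it assumes no line has more than two ";"-separated fields:

```python
import io

import pandas as pd

# Hypothetical file content with header lines in the middle.
raw = (
    "0;1.5\n"
    "1;2.5\n"
    "Results Generated Date Time\n"
    "Sampling Info\n"
    "2;3.5\n"
    "3;4.5\n"
)

df = pd.read_csv(io.StringIO(raw), sep=";", header=None, names=["Time", "Data"])

# Header lines cannot be parsed as numbers, so they become NaN and are dropped.
df = df.apply(pd.to_numeric, errors="coerce").dropna().reset_index(drop=True)
```

Unlike the leading-digit check, this keeps rows with negative values, at the cost of silently dropping any row that has a malformed number.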

I hope this does the job.
