Cannot convert a simple text file to a DataFrame

Posted 2024-10-01 11:23:17


My file looks like this:

raw_file-->

'Date\tValue\tSeries\tLabel\n07/01/2007\t687392\t31537611\tThis home\n08/01/2007\t750624\t31537611\tThis home\n09/01/2007\t769358\t31537611\tThis home\n10/01/2007\t802014\t31537611\tThis home\n11/01/2007\t815973\t31537611\tThis home\n12/01/2007\t806853\t31537611\tThis home\n01/01/2008\t836318\t31537611\tThis home\n02/01/2008\t856792\t31537611\tThis home\n03/01/2008\t854411\t31537611\tThis home\n04/01/2008\t826354\t31537611\tThis home\n05/01/2008\t789017\t31537611\tThis home\n06/01/2008\t754162\t31537611\tThis home\n07/01/2008\t749522\t31537611\tThis home\n08/01/2008\t757577\t31537611\tThis home\n'

type(raw_file)--> <type 'str'>

For some reason I can't use pd.read_csv(raw_file); if I do, I get this error:

File "pandas\_libs\parsers.pyx", line 710, in pandas._libs.parsers.TextReader._setup_parser_source (pandas\_libs\parsers.c:8873)
IOError: File Date  Value   Series  Label
07/01/2007  687392  31537611    This home
08/01/2007  750624  31537611    This home
does not exist

The best I could come up with is:

for row in raw_file.split('\n'):
   print(row.split('\t'))

This is slow. Is there a better way?


2 answers

When you pass pandas a string as the filepath_or_buffer argument, it treats that string as a file name or URL, not as the data itself.

From the docs:

filepath_or_buffer : str, pathlib.Path, py._path.local.LocalPath or any object with a read() method (such as a file handle or StringIO)

The string could be a URL. Valid URL schemes include http, ftp, s3, and file. For file URLs, a host is expected. For instance, a local file could be file://localhost/path/to/table.csv

Solution: use the io.StringIO() constructor:

In [69]: pd.read_csv(io.StringIO(raw_file), delim_whitespace=True)
Out[69]:
              Date     Value Series Label
07/01/2007  687392  31537611   This  home
08/01/2007  750624  31537611   This  home
09/01/2007  769358  31537611   This  home
10/01/2007  802014  31537611   This  home
11/01/2007  815973  31537611   This  home
12/01/2007  806853  31537611   This  home
01/01/2008  836318  31537611   This  home
02/01/2008  856792  31537611   This  home
03/01/2008  854411  31537611   This  home
04/01/2008  826354  31537611   This  home
05/01/2008  789017  31537611   This  home
06/01/2008  754162  31537611   This  home
07/01/2008  749522  31537611   This  home
08/01/2008  757577  31537611   This  home
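
Note that in the output above, delim_whitespace=True also splits the label "This home" on its space, so every data row has five fields against four header names and the dates end up as the index. Since the string in the question is tab-separated, splitting on tabs only may be closer to what was intended. The following is a minimal sketch under that assumption; the shortened raw_file (first three rows only) is for illustration.

```python
import io

import pandas as pd

# Shortened copy of the string from the question (assumption: first three rows only).
raw_file = (
    "Date\tValue\tSeries\tLabel\n"
    "07/01/2007\t687392\t31537611\tThis home\n"
    "08/01/2007\t750624\t31537611\tThis home\n"
    "09/01/2007\t769358\t31537611\tThis home\n"
)

# Wrap the string in a file-like object and split on tabs only,
# so "This home" stays in a single Label column.
df = pd.read_csv(io.StringIO(raw_file), sep="\t")

print(df.columns.tolist())
print(df.shape)
```

If the Date column should hold real dates, pd.to_datetime(df["Date"], format="%m/%d/%Y") is one option, assuming the values are in month/day/year order.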

Why not use the csv module and set the delimiter to \t?

https://docs.python.org/3.4/library/csv.html

for row in csv.reader(your_file, delimiter='\t'):
    pass  # do something with each row
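
One caveat when applying this to the question: csv.reader expects an iterable of lines (such as an open file), and iterating over a plain string yields single characters instead. The sketch below wraps the string in io.StringIO first; the shortened raw_file and the header/rows names are assumptions for illustration.

```python
import csv
import io

# Shortened copy of the string from the question (assumption: first three rows only).
raw_file = (
    "Date\tValue\tSeries\tLabel\n"
    "07/01/2007\t687392\t31537611\tThis home\n"
    "08/01/2007\t750624\t31537611\tThis home\n"
    "09/01/2007\t769358\t31537611\tThis home\n"
)

# csv.reader iterates over lines, so give it a file-like object, not the raw string.
reader = csv.reader(io.StringIO(raw_file), delimiter="\t")

header = next(reader)  # first line holds the column names
rows = list(reader)    # remaining lines, each a list of strings

print(header)   # ['Date', 'Value', 'Series', 'Label']
print(rows[0])  # ['07/01/2007', '687392', '31537611', 'This home']
```

The result can then be handed to pd.DataFrame(rows, columns=header) if a DataFrame is still wanted, though every value stays a string unless converted afterwards.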
