Following the blog post/tutorial at https://jakevdp.github.io/blog/2015/08/14/out-of-core-dataframes-in-python/ I used the following code:
from dask import dataframe as dd
columns = ["name", "amenity", "Longitude", "Latitude"]
data = dd.read_csv('POIWorld.csv', usecols=columns)
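I get the following error:
[traceback not preserved]
How can I get around this type error, or read the CSV in the correct format?
Some more detail. Using: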
data = dd.read_csv("POIWorld.csv", usecols=columns, header=None)
data
gives me (as expected):
dd.DataFrame<read-csv-POIWorld.csv-e5a4ce81b697e4068e03e56e51643bda, divisions=(None, None, None, ..., None, None)>
But then running:
with_name = data[data.name.notnull()]
with_amenity = data[data.amenity.notnull()]
returns:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-6-b460952b73e5> in <module>()
----> 1 with_name = data[data.name.notnull()]
2 with_amenity = data[data.amenity.notnull()]
C:\Anaconda2\lib\site-packages\dask\dataframe\core.pyc in __getattr__(self, key)
1196 return self[key]
1197 except KeyError as e:
-> 1198 raise AttributeError(e)
1199
1200 def __dir__(self):
AttributeError: 'name'
So if I use header=None, it of course does not recognize the "name" header. What should I do to make dask recognize the header?
This issue has been fixed in the development branch, and the fix will be included in version 0.7.6.
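To illustrate why header=None hides the column names, here is a minimal sketch using pandas directly, since dd.read_csv forwards its keyword arguments to pandas.read_csv. The SAMPLE text is a made-up stand-in for POIWorld.csv (its rows and the extra "other" column are assumptions, not the real data), and the sketch does not reproduce the original type error.

```python
import io

import pandas as pd

# Hypothetical stand-in for POIWorld.csv; the rows are invented for illustration.
SAMPLE = (
    "name,amenity,Longitude,Latitude,other\n"
    "Cafe A,cafe,1.0,2.0,x\n"
    ",bench,3.0,4.0,y\n"
    "Shop B,,5.0,6.0,z\n"
)
columns = ["name", "amenity", "Longitude", "Latitude"]

# header=None: the header row is read as ordinary data and the columns are
# labelled 0, 1, 2, ... so there is no column called "name" to look up.
no_header = pd.read_csv(io.StringIO(SAMPLE), header=None)
print(list(no_header.columns))

# Default header handling: the first row supplies the column names,
# so usecols and the notnull() filters from the question work.
with_header = pd.read_csv(io.StringIO(SAMPLE), usecols=columns)
with_name = with_header[with_header["name"].notnull()]
with_amenity = with_header[with_header["amenity"].notnull()]
print(len(with_name), len(with_amenity))
```

With a dask version that includes the fix, the same call without header=None (dd.read_csv('POIWorld.csv', usecols=columns)) should keep the column names in the same way.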