Dask/Pandas: how to get around a boolean header

Posted 2024-09-29 17:44:58

Following the blog post/tutorial at https://jakevdp.github.io/blog/2015/08/14/out-of-core-dataframes-in-python/ I used the following code:

from dask import dataframe as dd
columns = ["name", "amenity", "Longitude", "Latitude"]
data = dd.read_csv('POIWorld.csv', usecols=columns)

I get the following error:

^{pr2}$

How can I get around this TypeError, or read the CSV in the correct format? Some more detail:

Using:

data = dd.read_csv("POIWorld.csv", usecols=columns, header=None)
data

gives me (as expected):

dd.DataFrame<read-csv-POIWorld.csv-e5a4ce81b697e4068e03e56e51643bda, divisions=(None, None, None, ..., None, None)>

But then running:

with_name = data[data.name.notnull()]
with_amenity = data[data.amenity.notnull()]

returns:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-6-b460952b73e5> in <module>()
----> 1 with_name = data[data.name.notnull()]
      2 with_amenity = data[data.amenity.notnull()]

C:\Anaconda2\lib\site-packages\dask\dataframe\core.pyc in __getattr__(self, key)
   1196                 return self[key]
   1197             except KeyError as e:
-> 1198                 raise AttributeError(e)
   1199 
   1200     def __dir__(self):

AttributeError: 'name'
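
For what it's worth, the two source lines in the traceback suggest why the message is just 'name': attribute access falls back to a column lookup, and a failed lookup (KeyError) is re-raised as AttributeError. Below is a minimal sketch of that pattern, assuming a hypothetical stand-in class `Frame` (not dask's real DataFrame) whose columns are labelled 0..3 the way header=None would label them:

```python
class Frame:
    """Hypothetical stand-in for a dataframe; not dask's actual class."""

    def __init__(self, columns):
        # Map each column label to an (empty) list of values.
        self._columns = {label: [] for label in columns}

    def __getitem__(self, key):
        # Raises KeyError when the label does not exist.
        return self._columns[key]

    def __getattr__(self, key):
        # Only called when normal attribute lookup fails,
        # mirroring the lines shown in the traceback.
        try:
            return self[key]
        except KeyError as e:
            raise AttributeError(e)


# With header=None-style integer labels there is no "name" column.
frame = Frame([0, 1, 2, 3])
try:
    frame.name
    message = None
except AttributeError as err:
    message = str(err)

print(message)
```

So the AttributeError here is really a "no such column" error in disguise.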

So if I use header=None, it of course doesn't recognize the "name" header. What can I do to make Dask recognize the header?
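
To illustrate what I mean, here is a small sketch using plain pandas on an in-memory CSV. The sample rows are made up for illustration, and I am assuming dask's read_csv passes header on to pandas in the same way. With header=None the first row is treated as data and the columns are labelled 0, 1, 2, 3; with header=0 the first row supplies the column names:

```python
import io

import pandas as pd

columns = ["name", "amenity", "Longitude", "Latitude"]

# Made-up sample data standing in for POIWorld.csv (an assumption).
csv_text = (
    "name,amenity,Longitude,Latitude\n"
    "Cafe A,cafe,1.0,2.0\n"
    ",bench,3.0,4.0\n"
)

# header=None: the header row becomes a data row, labels are integers.
no_header = pd.read_csv(io.StringIO(csv_text), header=None)
print(list(no_header.columns))

# header=0: the first row is used as the column names.
with_header = pd.read_csv(io.StringIO(csv_text), header=0, usecols=columns)
print(list(with_header.columns))

# The filter from the question works once "name" is a real column.
with_name = with_header[with_header["name"].notnull()]
print(len(with_name))
```

In the header=None case there is simply no column called "name" to look up, which matches the AttributeError above.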

