AttributeError: 'Series' object has no attribute 'notna'

Posted 2024-09-20 03:32:32


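I have a multi-column csv file that contains empty strings. After reading the csv into a pandas DataFrame, the empty strings are converted to NaN.

Now I want to prepend the string tag- to the strings already present in the columns, but only to the cells that hold a value, not to the cells that are NaN.

This is what I am trying to do: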

with open('file1.csv','r') as file:
    for chunk in pd.read_csv(file, chunksize=1000, header=0, names=['A','B','C','D']):
        if len(chunk) >=1:
            if chunk['A'].notna:
                chunk['A'] = "tag-"+chunk['A'].astype(str)
            if chunk['B'].notna:
                chunk['B'] = "tag-"+chunk['B'].astype(str)
            if chunk['C'].notna:
                chunk['C'] = "tag-"+chunk['C'].astype(str)
            if chunk['D'].notna:
                chunk['D'] = "tag-"+chunk['D'].astype(str)

This is the error I get:

AttributeError: 'Series' object has no attribute 'notna'

The final output I want should look like this:

A,B,C,D
tag-a,tag-b,tag-c,
tag-a,tag-b,,
tag-a,,,
,,tag-c,
,,,tag-d
,tag-b,,tag-d

1 Answer

#1 · Posted 2024-09-20 03:32:32

I believe you need DataFrame.mask to add tag- to the non-NaN cells of all columns at once:

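import pandas as pd

for chunk in pd.read_csv('file1.csv', chunksize=2, header=0, names=['A','B','C','D']):
    if len(chunk) >= 1:
        # Boolean DataFrame: True where a cell holds a value, False where it is NaN
        m1 = chunk.notna()
        # Replace only the True cells with "tag-" + value; NaN cells stay NaN
        chunk = chunk.mask(m1, "tag-" + chunk.astype(str))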
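For a version that can be run without file1.csv, here is a minimal sketch of the same mask approach. It reads from an in-memory string instead of a file and writes the tagged chunks back out as CSV text. The inline sample data, the tag_chunks helper name and the dtype=str argument are assumptions added for illustration and are not part of the original answer.

```python
import io
import pandas as pd

# Hypothetical sample input mirroring the data in the question (assumption).
RAW = """A,B,C,D
a,b,c,
a,b,,
a,,,
,,c,
,,,d
,b,,d
"""

def tag_chunks(source, chunksize=2, prefix="tag-"):
    """Yield chunks with `prefix` prepended to every non-NaN cell."""
    reader = pd.read_csv(source, chunksize=chunksize, header=0,
                         names=["A", "B", "C", "D"], dtype=str)
    for chunk in reader:
        # True where the cell holds a value, False where it is NaN
        has_value = chunk.notna()
        # Only cells flagged True are replaced; NaN cells are left alone
        yield chunk.mask(has_value, prefix + chunk.astype(str))

out = io.StringIO()
for i, tagged in enumerate(tag_chunks(io.StringIO(RAW))):
    # Write the header once, with the first chunk only
    tagged.to_csv(out, index=False, header=(i == 0))

result = out.getvalue()
print(result)
```

Reading with dtype=str is a design choice in this sketch: it keeps a column that happens to be entirely empty within one chunk from being parsed as a float column. NaN cells are written back as empty fields by to_csv.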

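You need to upgrade pandas to version 0.21.0 or later, which is where notna() was added. On older versions, the equivalent method is notnull().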

You can check the docs:

In order to promote more consistency among the pandas API, we have added additional top-level functions isna() and notna() that are aliases for isnull() and notnull(). The naming scheme is now more consistent with methods like .dropna() and .fillna(). Furthermore in all cases where .isnull() and .notnull() methods are defined, these have additional methods named .isna() and .notna(), these are included for classes Categorical, Index, Series, and DataFrame. (GH15001).

The configuration option pd.options.mode.use_inf_as_null is deprecated, and pd.options.mode.use_inf_as_na is added as a replacement.
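Since the docs describe notna() as an alias for notnull(), code that also has to run on pandas older than 0.21.0 can fall back to the older name. The following is only a sketch; the not_missing helper is a hypothetical name introduced here for illustration and is not a pandas function.

```python
import numpy as np
import pandas as pd

def not_missing(obj):
    """Return a boolean mask of the non-missing values in `obj`.

    Hypothetical helper: uses .notna() when it exists (pandas >= 0.21.0)
    and falls back to the older .notnull() spelling otherwise.
    """
    method = getattr(obj, "notna", None)
    if method is None:
        method = obj.notnull
    return method()

s = pd.Series(["a", np.nan, "c"])
mask = not_missing(s)
# Prepend "tag-" only where a value is present
tagged = s.mask(mask, "tag-" + s.astype(str))
print(tagged.tolist())
```

The same helper accepts a DataFrame, because both Series and DataFrame define notnull() and, from 0.21.0 on, notna().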
