I have a data frame that looks like this:
PR Order Season Rj
0 3001913971 3445046069 202112 NaN
1 3002026058 1445132366 202121 NaN
2 3002026059 1445132365 202122 NaN
3 3002026063 1445132367 202211 NaN
4 3002026069 1445132375 202121 NaN
When I run the code below for the first time, it works fine:
df['Season'] = df['Season'].astype(str)
df.loc[(df['Season'].str[-2:] == '11') & (df['Season'].str.len() == 6),'Season'] = 'Spring ' + df.loc[df['Season'].str[-2:] == '11','Season'].str[:4]
df.loc[(df['Season'].str[-2:] == '12') & (df['Season'].str.len() == 6),'Season'] = 'Summer ' + df.loc[df['Season'].str[-2:] == '12','Season'].str[:4]
df.loc[(df['Season'].str[-2:] == '21') & (df['Season'].str.len() == 6),'Season'] = 'Autumn ' + df.loc[df['Season'].str[-2:] == '21','Season'].str[:4]
df.loc[(df['Season'].str[-2:] == '22') & (df['Season'].str.len() == 6),'Season'] = 'Holiday ' + df.loc[df['Season'].str[-2:] == '22','Season'].str[:4]
The result of the first run looks like this:
PR Order Season Rj
0 3001913971 3445046069 Summer 2021 NaN
1 3002026058 1445132366 Autumn 2021 NaN
2 3002026059 1445132365 Holiday 2021 NaN
3 3002026063 1445132367 Spring 2022 NaN
4 3002026069 1445132375 Autumn 2021 NaN
But when I run it a second time, it raises an error:
ValueError: Must have equal len keys and value when setting with an iterable
Do you know why? Thanks a lot.
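The second time you run the code, the strings in Season no longer have length 6 (and none of them has 11 as its last two characters), so the second line of the code would have to assign the string 'Spring ' to an empty slice of the data frame, which of course is not possible.

In general, when extracting data like this, it is better to keep the original column and add the derived values as an additional column. That avoids the problem above and also helps with catching errors. Redundancy can be a good thing.

By the way, you can also extract the data directly from the integer values, without converting them to strings first. Floor division and the modulo operator are all you need.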
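A minimal sketch of that idea, assuming Season is still an integer column as in the original data frame. The `season_names` mapping and the `SeasonName` column are illustrative names that do not appear in the question, and the Rj column is left out for brevity:

```python
import pandas as pd

# Sample data copied from the question (Rj column omitted)
df = pd.DataFrame({
    "PR": [3001913971, 3002026058, 3002026059, 3002026063, 3002026069],
    "Order": [3445046069, 1445132366, 1445132365, 1445132367, 1445132375],
    "Season": [202112, 202121, 202122, 202211, 202121],
})

# Hypothetical lookup table, built from the four cases in the question
season_names = {11: "Spring", 12: "Summer", 21: "Autumn", 22: "Holiday"}

year = df["Season"] // 100   # floor division keeps the first four digits
code = df["Season"] % 100    # modulo keeps the last two digits

# Write the result to a new column and leave Season untouched,
# so running this a second time gives the same result instead of an error
df["SeasonName"] = code.map(season_names) + " " + year.astype(str)
```

Because Season is never overwritten, the snippet can be re-run any number of times. A code that is missing from the mapping ends up as NaN in SeasonName, which makes unexpected values easy to spot.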