How do I change a column's values in a dataframe while iterating over the column?

Posted 2024-10-17 08:27:17


I have a dataframe like this:

Cause_of_death       famous_for          name         nationality
suicide by hanging   African jazz        XYZ             South
unknown              Korean president    ABC             South
heart attack         businessman         EFG             American
heart failure        Prime Minister      LMN             Indian
heart problems       African writer      PQR             South
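
To try the answers below, the sample frame can be rebuilt as follows. This is a minimal sketch: the values are copied from the table above, and the variable names deaths and df are assumptions taken from the two answers, which each use a different name for the same frame.

```python
import pandas as pd

# Sample data copied from the question's table.
deaths = pd.DataFrame({
    "Cause_of_death": ["suicide by hanging", "unknown", "heart attack",
                       "heart failure", "heart problems"],
    "famous_for": ["African jazz", "Korean president", "businessman",
                   "Prime Minister", "African writer"],
    "name": ["XYZ", "ABC", "EFG", "LMN", "PQR"],
    "nationality": ["South", "South", "American", "Indian", "South"],
})

# The first answer refers to the same frame as df.
df = deaths.copy()
print(deaths)
```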

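The dataframe is very large. What I want to do is modify the nationality column. As you can see, for rows where nationality = South, the famous_for string contains either Korean or African. So I want to change nationality to South Africa if famous_for contains African, and to South Korea if famous_for contains Korean.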

What I tried:

[code block missing in the source]

2 Answers

You can use str.contains() to check whether the famous_for column contains Korean or Africa, and set nationality accordingly.

df.loc[df.famous_for.str.contains('Korean'), 'nationality']='South Korean'

df.loc[df.famous_for.str.contains('Africa'), 'nationality']='South Africa'

df
Out[783]: 
       Cause_of_death        famous_for  name   nationality
0  suicide by hanging      African jazz   XYZ  South Africa
1             unknown  Korean president   ABC  South Korean
2        heart attack       businessman   EFG      American
3       heart failure    Prime Minister   LMN        Indian
4      heart problems    African writer   PQR  South Africa

Or you can do it in one line:

[code block missing in the source]
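
The one-line snippet did not survive on this page. As a stand-in, and not a reconstruction of the answerer's code, both replacements can be written as a single expression with numpy.select, which takes the first matching choice for each row and falls back to a default. The sample data is an assumption taken from the question, and the labels follow the answer above.

```python
import numpy as np
import pandas as pd

# Assumed sample data (only the two columns involved).
df = pd.DataFrame({
    "famous_for": ["African jazz", "Korean president", "businessman",
                   "Prime Minister", "African writer"],
    "nationality": ["South", "South", "American", "Indian", "South"],
})

# Conditions are checked in order; rows matching none keep their current value.
df["nationality"] = np.select(
    [df["famous_for"].str.contains("Korean"),
     df["famous_for"].str.contains("Africa")],
    ["South Korean", "South Africa"],
    default=df["nationality"],
)
print(df)
```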

If many conditions are possible, use a custom function with DataFrame.apply and axis=1 to process the frame row by row:

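def f(x):
    # Only rows labelled 'South' are candidates for relabelling.
    if x['nationality'] == 'South':
        if 'Korea' in x['famous_for']:
            return 'South Korea'
        elif 'Africa' in x['famous_for']:
            return 'South Africa'
    # Otherwise keep the existing value; this also covers 'South' rows
    # that match neither keyword, which previously returned None.
    return x['nationality']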


deaths['nationality'] = deaths.apply(f, axis=1)
print (deaths)
       Cause_of_death        famous_for name   nationality
0  suicide by hanging      African jazz  XYZ  South Africa
1             unknown  Korean president  ABC   South Korea
2        heart attack       businessman  EFG      American
3       heart failure    Prime Minister  LMN        Indian
4      heart problems    African writer  PQR  South Africa

But if there are only a few conditions, use [method names missing in the source]:

[code block missing in the source]
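
The code for this variant is missing from the page. One common way to handle a few conditions, offered here as an assumption and not as the answerer's exact code, is to combine boolean masks and assign through DataFrame.loc. The sample data is adapted from the question, with one missing famous_for value added to show that na=False treats missing values as non-matches.

```python
import pandas as pd

# Assumed sample data; the None entry is added for illustration only.
deaths = pd.DataFrame({
    "famous_for": ["African jazz", "Korean president", "businessman",
                   None, "African writer"],
    "nationality": ["South", "South", "American", "Indian", "South"],
})

# Build one boolean mask per condition.
is_south = deaths["nationality"] == "South"
is_korean = deaths["famous_for"].str.contains("Korean", na=False)
is_african = deaths["famous_for"].str.contains("Africa", na=False)

# Assign only where both the 'South' mask and a keyword mask are True.
deaths.loc[is_south & is_korean, "nationality"] = "South Korea"
deaths.loc[is_south & is_african, "nationality"] = "South Africa"
print(deaths)
```

Unlike the row-wise function, this touches only the matching rows and avoids a Python-level loop, which matters on a very large frame.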

Another solution with Series.mask:

mask1 = deaths['nationality'] == 'South'
mask2 = deaths['famous_for'].str.contains('Korean')
mask3 = deaths['famous_for'].str.contains('Africa')

deaths['nationality'] = deaths['nationality'].mask(mask1 & mask2, 'South Korea')
deaths['nationality'] = deaths['nationality'].mask(mask1 & mask3,'South Africa')
print (deaths)
       Cause_of_death        famous_for name   nationality
0  suicide by hanging      African jazz  XYZ  South Africa
1             unknown  Korean president  ABC   South Korea
2        heart attack       businessman  EFG      American
3       heart failure    Prime Minister  LMN        Indian
4      heart problems    African writer  PQR  South Africa
