Reordering selected items in two DataFrame columns

import numpy as np
import pandas as pd

data = {
    'T1': {0: 'Belarus', 1: 'Netherlands', 2: 'France', 3: 'Faroe Islands', 4: 'Hungary'},
    'T2': {0: 'Sweden', 1: 'Bulgaria', 2: 'Luxembourg', 3: 'Andorra', 4: 'Portugal'},
    'score': {0: -4, 1: 2, 2: 0, 3: 1, 4: -1},
}
df = pd.DataFrame(data)
#               T1          T2  score
# 0        Belarus      Sweden     -4
# 1    Netherlands    Bulgaria      2
# 2         France  Luxembourg      0
# 3  Faroe Islands     Andorra      1
# 4        Hungary    Portugal     -1

df.apply(
    lambda x: pd.Series([x["T2"], x["T1"], -x["score"]])
    if x["T1"] > x["T2"]
    else pd.Series([x["T1"], x["T2"], x["score"]]),
    axis=1,
)
#           0              1  2
# 0   Belarus         Sweden -4
# 1  Bulgaria    Netherlands -2
# 2    France     Luxembourg  0
# 3   Andorra  Faroe Islands -1
# 4   Hungary       Portugal -1

3 answers

User

#1 · Edited 2024-05-20 00:01:14

Not as neat as cᴏʟᴅsᴘᴇᴇᴅ's answer, but it works:

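# Work on a copy so the original frame is untouched, then sort each row's pair
df1 = df[['T1', 'T2']].copy()
df1[['T1', 'T2']] = np.sort(df1.to_numpy(), axis=1)
# Negate the score wherever sorting changed the row, i.e. the pair was swapped
df1['new'] = np.where((df1 != df[['T1', 'T2']]).any(axis=1), -df['score'], df['score'])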

df1
Out[102]: 
         T1             T2  new
0   Belarus         Sweden   -4
1  Bulgaria    Netherlands   -2
2    France     Luxembourg    0
3   Andorra  Faroe Islands   -1
4   Hungary       Portugal   -1

User

#2 · Edited 2024-05-20 00:01:14

Here is a fun and creative way to do it with numpy tools:

t = df[['T1', 'T2']].values
a = t.argsort(1)

df[['T1', 'T2']] = t[np.arange(len(t))[:, None], a]
# The @ operator requires Python 3.5+ (thanks @cᴏʟᴅsᴘᴇᴇᴅ);
# on older versions use
# df['score'] *= a.dot([-1, 1])
df['score'] *= a @ [-1, 1]

df

         T1             T2  score
0   Belarus         Sweden     -4
1  Bulgaria    Netherlands     -2
2    France     Luxembourg      0
3   Andorra  Faroe Islands     -1
4   Hungary       Portugal     -1
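
To see why `a @ [-1, 1]` produces the right sign, here is a minimal sketch on a hypothetical two-row array. The names `pairs`, `order`, `sign` and `sorted_pairs` are illustrative and not part of the answer above.

```python
import numpy as np

# Hypothetical example: the first row is already sorted, the second is not.
pairs = np.array([["Belarus", "Sweden"], ["Netherlands", "Bulgaria"]], dtype=object)

# argsort along axis 1 gives [0, 1] for a sorted row and [1, 0] for a swapped row.
order = pairs.argsort(axis=1)

# Dot product with [-1, 1]:
#   [0, 1] -> 0 * -1 + 1 * 1 =  1  (score keeps its sign)
#   [1, 0] -> 1 * -1 + 0 * 1 = -1  (score is negated)
sign = order @ np.array([-1, 1])

# Fancy indexing reorders each row by its own argsort result.
sorted_pairs = pairs[np.arange(len(pairs))[:, None], order]
```

Multiplying `score` by `sign` therefore flips it only in the rows whose pair was reordered.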

User

#3 · Edited 2024-05-20 00:01:14

Option 1
Boolean indexing

m = df.T1 > df.T2
m 

0    False
1     True
2    False
3     True
4    False
dtype: bool

df.loc[m, 'score'] = df.loc[m, 'score'].mul(-1)
df.loc[m, ['T1', 'T2']] = df.loc[m, ['T2', 'T1']].values
df

         T1             T2  score
0   Belarus         Sweden     -4
1  Bulgaria    Netherlands     -2
2    France     Luxembourg      0
3   Andorra  Faroe Islands     -1
4   Hungary       Portugal     -1
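
The `.values` in the swap line matters. As described in the pandas indexing documentation, assigning a DataFrame through `.loc` aligns on column labels before writing, so the columns would end up unchanged. A minimal sketch on a hypothetical one-row frame (the names `aligned` and `swapped` are illustrative):

```python
import pandas as pd

df = pd.DataFrame({"T1": ["Netherlands"], "T2": ["Bulgaria"]})

# Assigning a DataFrame: pandas aligns on column labels first, so nothing moves.
aligned = df.copy()
aligned.loc[:, ["T1", "T2"]] = aligned[["T2", "T1"]]

# Assigning a raw array: values are written positionally, so the columns swap.
swapped = df.copy()
swapped.loc[:, ["T1", "T2"]] = swapped[["T2", "T1"]].to_numpy()
```

`.to_numpy()` is the currently recommended spelling of `.values` and behaves the same way here.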

Option 2
Using df.eval

m = df.eval('T1 > T2')
df.loc[m, 'score'] = df.loc[m, 'score'].mul(-1)
df.loc[m, ['T1', 'T2']] = df.loc[m, ['T2', 'T1']].values
df

         T1             T2  score
0   Belarus         Sweden     -4
1  Bulgaria    Netherlands     -2
2    France     Luxembourg      0
3   Andorra  Faroe Islands     -1
4   Hungary       Portugal     -1

Option 3
Using df.query

idx = df.query('T1 > T2').index
idx
Int64Index([1, 3], dtype='int64')

df.loc[idx, 'score'] = df.loc[idx, 'score'].mul(-1)
df.loc[idx, ['T1', 'T2']] = df.loc[idx, ['T2', 'T1']].values
df

         T1             T2  score
0   Belarus         Sweden     -4
1  Bulgaria    Netherlands     -2
2    France     Luxembourg      0
3   Andorra  Faroe Islands     -1
4   Hungary       Portugal     -1
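
The three options differ only in how the mask is built, and all of them modify `df` in place. If the original frame should stay untouched, one possible variation is sketched below. It assumes a hypothetical helper named `normalize_pairs` and uses `np.where`, which is a swapped-in technique rather than part of the answer above.

```python
import numpy as np
import pandas as pd

def normalize_pairs(df):
    """Return a copy with T1 <= T2 in every row, negating score where swapped."""
    swap = df["T1"] > df["T2"]
    out = df.copy()
    # Read from the original df so the second assignment does not see
    # the already-updated T1 column.
    out["T1"] = np.where(swap, df["T2"], df["T1"])
    out["T2"] = np.where(swap, df["T1"], df["T2"])
    out["score"] = np.where(swap, -df["score"], df["score"])
    return out

df = pd.DataFrame({
    "T1": ["Belarus", "Netherlands", "France", "Faroe Islands", "Hungary"],
    "T2": ["Sweden", "Bulgaria", "Luxembourg", "Andorra", "Portugal"],
    "score": [-4, 2, 0, 1, -1],
})
result = normalize_pairs(df)
```

Here `result` holds the reordered rows while `df` keeps its original values.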
