Replace row values with values from other rows in the same df, based on a condition

Posted 2024-10-01 07:48:46


I have the following dataset:

import pandas as pd

df = pd.DataFrame( {'user': {0: 1, 1: 1, 2: 1, 3: 2, 4: 2, 5: 2, 6: 2}, 
    'date': {0: '1995-09-01', 1: '1995-09-02', 2: '1995-10-03', 3: '1995-10-04', 4: '1995-10-05', 5: '1995-11-07', 6: '1995-11-08'}, 
    'x': {0: '1995-09-02', 1: '1995-09-02', 2: '1995-09-02', 3: '1995-10-05', 4: '1995-10-05', 5: '1995-10-05', 6: '1995-10-05'}, 
    'y': {0: '1995-10-03', 1: '1995-10-03', 2: '1995-10-03', 3: '1995-11-08', 4: '1995-11-08', 5: '1995-11-08', 6: '1995-11-08'}, 
    'c1': {0: '1', 1: '0', 2: '0', 3: '2', 4: '0', 5: '9', 6: '0'}, 
    'c2': {0: '1', 1: '0', 2: '0', 3: '2', 4: '0', 5: '9', 6: '0'}, 
    'c3': {0: '1', 1: '0', 2: '0', 3: '2', 4: '0', 5: '9', 6: '0'}, 
    'VTX1': {0: 1, 1: 0, 2: 0, 3: 1, 4: 0, 5: 0, 6: 0}, 
    'VTY1': {0: 0, 1: 1, 2: 0, 3: 0, 4: 0, 5: 1, 6: 0}} )

This gives me:

    user    date         x           y     c1   c2 c3 VTX1 VTY1
0   1   1995-09-01  1995-09-02  1995-10-03  1   1   1   1   0
1   1   1995-09-02  1995-09-02  1995-10-03  0   0   0   0   1
2   1   1995-10-03  1995-09-02  1995-10-03  0   0   0   0   0
3   2   1995-10-04  1995-10-05  1995-11-08  2   2   2   1   0
4   2   1995-10-05  1995-10-05  1995-11-08  0   0   0   0   0
5   2   1995-11-07  1995-10-05  1995-11-08  9   9   9   0   1
6   2   1995-11-08  1995-10-05  1995-11-08  0   0   0   0   0

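I want to replace df['c1'] as follows:

- When df['date'] == df['x'], replace df['c1'] with the value df['c1'] has on the row where df['VTX1'] == 1 for the same user.

In this example, for user 1, df['date'] == df['x'] holds at index 1. There we want df['c1'] to be 1, because 1 is user 1's value of df['c1'] on the row where df['VTX1'] == 1.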

So the final result would be:

   user    date          x         y       c1   c2 c3  VTX1 VTY1
0   1   1995-09-01  1995-09-02  1995-10-03  1   1   1   1   0
1   1   1995-09-02  1995-09-02  1995-10-03  1   0   0   0   1
2   1   1995-10-03  1995-09-02  1995-10-03  0   0   0   0   0
3   2   1995-10-04  1995-10-05  1995-11-08  2   2   2   1   0
4   2   1995-10-05  1995-10-05  1995-11-08  2   0   0   0   0
5   2   1995-11-07  1995-10-05  1995-11-08  9   9   9   0   1
6   2   1995-11-08  1995-10-05  1995-11-08  0   0   0   0   0

1 answer
User
#1 · Posted 2024-10-01 07:48:46

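For each unique user, select the row where column VTX1 equals 1; this can be done by setting the index to user and selecting the rows with query. Then mask the values in c1 where date equals x, and fill the masked positions from the mapping Series d.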

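# d maps each user to the c1 value on that user's VTX1 == 1 row
d = df.set_index('user').query('VTX1 == 1')['c1']
# where date equals x, overwrite c1 with the value looked up in d for that row's user
df['c1'] = df['c1'].mask(df['date'].eq(df['x']), df['user'].map(d))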

   user        date           x           y c1 c2 c3  VTX1  VTY1
0     1  1995-09-01  1995-09-02  1995-10-03  1  1  1     1     0
1     1  1995-09-02  1995-09-02  1995-10-03  1  0  0     0     1
2     1  1995-10-03  1995-09-02  1995-10-03  0  0  0     0     0
3     2  1995-10-04  1995-10-05  1995-11-08  2  2  2     1     0
4     2  1995-10-05  1995-10-05  1995-11-08  2  0  0     0     0
5     2  1995-11-07  1995-10-05  1995-11-08  9  9  9     0     1
6     2  1995-11-08  1995-10-05  1995-11-08  0  0  0     0     0
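
The same approach can be sketched step by step, which may make it easier to inspect each piece. This is only a minimal sketch under a few assumptions: the frame is a reduced copy of the question's data with just the columns the rule needs, the names flagged, is_target and replacement are illustrative, and the drop_duplicates and fillna safeguards are additions that the answer above does not use. They cover a user with more than one VTX1 == 1 row (a mapping Series with a non-unique index can make map fail) and a user with no such row.

```python
import pandas as pd

# Reduced copy of the question's frame: only the columns the rule needs.
df = pd.DataFrame({
    "user": [1, 1, 1, 2, 2, 2, 2],
    "date": ["1995-09-01", "1995-09-02", "1995-10-03", "1995-10-04",
             "1995-10-05", "1995-11-07", "1995-11-08"],
    "x": ["1995-09-02"] * 3 + ["1995-10-05"] * 4,
    "c1": ["1", "0", "0", "2", "0", "9", "0"],
    "VTX1": [1, 0, 0, 1, 0, 0, 0],
})

# Step 1: one c1 value per user, taken from the row where VTX1 == 1.
# drop_duplicates is an added safeguard (assumption): it keeps the first
# flagged row per user so the mapping Series has a unique index.
flagged = df.loc[df["VTX1"].eq(1)].drop_duplicates("user")
d = flagged.set_index("user")["c1"]

# Step 2: boolean mask of the rows to overwrite (date equals x).
is_target = df["date"].eq(df["x"])

# Step 3: look up the per-user value; if a user has no VTX1 == 1 row,
# map yields NaN, so fall back to the existing c1 (added safeguard).
replacement = df["user"].map(d).fillna(df["c1"])
df["c1"] = df["c1"].mask(is_target, replacement)

print(df["c1"].tolist())
```

With the question's data the two target rows are index 1 and index 4, so only those c1 values change. Whether keeping the first flagged row per user is the right tie-break depends on the data; that choice is not specified in the question.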
