Generating columns based on conditions

>>> df
   vn   pt st nst  stb  mid
0   a  0.1  a   b    0    3
1   a  0.2  a   b    4    3
2   a  0.3  a   b    1    3
3   a  0.3  b   a    1    3
4   a  0.4  a   b    1    3
5   a  0.4  a   b    2    3
6   a  0.5  c   b    6    3
7   a  0.5  c   b    0    3
8   a  0.6  c   b    1    3
9   a  1.1  b   c    2    3
10  a  1.2  b   c    1    3
11  a  1.3  d   b    6    3
12  a  1.4  d   b    0    3
13  a  1.4  d   b    1    3
14  a  1.5  e   d    2    3
15  a  1.6  d   e    0    3
16  a  0.1  d   y    1    7
17  a  0.2  y   d    4    7
18  a  0.3  y   d    1    7
19  a  0.4  y   x    3    7
20  a  0.5  x   z    0    7
21  a  0.6  p   z    2    7
22  a  0.6  z   p    6    7
23  a  1.1  p   q    3    7
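For anyone who wants to reproduce this, the frame above can be rebuilt with something like the following sketch. The values are copied from the printed table; building the `st` and `nst` columns from strings is only a shortcut chosen here.

```python
import pandas as pd

# One character per row, in row order 0..23 (shortcut for single-letter labels).
st = list("aaabaacccbbdddeddyyyxpzp")
nst = list("bbbabbbbbccbbbdeyddxzzpq")

df = pd.DataFrame({
    "vn": ["a"] * 24,
    "pt": [0.1, 0.2, 0.3, 0.3, 0.4, 0.4, 0.5, 0.5, 0.6, 1.1, 1.2, 1.3,
           1.4, 1.4, 1.5, 1.6, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.6, 1.1],
    "st": st,
    "nst": nst,
    "stb": [0, 4, 1, 1, 1, 2, 6, 0, 1, 2, 1, 6,
            0, 1, 2, 0, 1, 4, 1, 3, 0, 2, 6, 3],
    "mid": [3] * 16 + [7] * 8,   # rows 0-15 belong to mid 3, rows 16-23 to mid 7
})

print(df)
```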

st  nst  stb  sr                                                        nsr
a   b    0    0+0=0 (sr=sr+stb)                                         0 (nst newly enrolled, set to 0)
a   b    4    0+4=4 (sr=sr+stb)                                         0 (remains same)
a   b    1    4+1=5 (sr=sr+stb)                                         0 (remains same)
b   a    1    0+1=1 (sr=nsr+stb), because b moves from nst to st        5 (shifts from sr to nsr)
a   b    1    5+1=6 (sr=nsr+stb), because a moves from nst to st        1 (shifts from sr to nsr)
a   b    2    6+2=8 (sr=sr+stb)                                         1 (remains same)
c   b    6    0+6=6 (sr=sr+stb), c newly inserted                       1 (remains same)
...........
(will continue recursively until `mid` is unique)
...........

   vn   pt st  sr  nsr
0   a  0.1  a   0    0
1   a  0.2  a   4    0
2   a  0.3  a   5    0
3   a  0.3  b   1    5
4   a  0.4  a   6    1
5   a  0.4  a   8    1
6   a  0.5  c   6    1
7   a  0.5  c   6    1
8   a  0.6  c   7    1
9   a  1.1  b   3    7
10  a  1.2  b   4    7
11  a  1.3  d   6    4
12  a  1.4  d   6    4
13  a  1.4  d   7    4
14  a  1.5  e   2    7
15  a  1.6  d   7    2
16  a  0.1  d   1    0
17  a  0.2  y   4    1
18  a  0.3  y   5    1
19  a  0.4  y   8    0
20  a  0.5  x   0    0
21  a  0.6  p   2    0
22  a  0.6  z   6    2
23  a  1.1  p   5    0
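To check a computed column against the target above, it helps to hold the target in its own frame. In this sketch the names `expected` and `mismatches` are assumptions made for illustration, and the values are copied from the table.

```python
import pandas as pd

# Hypothetical helper frame: the target sr and nsr columns, in row order 0..23.
expected = pd.DataFrame({
    "sr":  [0, 4, 5, 1, 6, 8, 6, 6, 7, 3, 4, 6,
            6, 7, 2, 7, 1, 4, 5, 8, 0, 2, 6, 5],
    "nsr": [0, 0, 0, 5, 1, 1, 1, 1, 1, 7, 7, 4,
            4, 4, 7, 2, 0, 1, 1, 0, 0, 0, 2, 0],
})


def mismatches(computed: pd.Series, target: pd.Series) -> pd.DataFrame:
    """Return only the rows where the computed and target values differ."""
    out = pd.DataFrame({"computed": computed, "expected": target})
    return out[out["computed"] != out["expected"]]
```

Calling `mismatches(df['sr'], expected['sr'])` on a candidate column then lists just the rows that still need attention.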

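2 Answers

User

Answer 1 · edited 2024-05-20 10:10:20

(Partial attempt pending feedback; too long for a comment.)

Based on your explanation, sr is a separate cumulative sum of stb for each (st, nst) pair. However, this does not entirely match your expected output: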

>>> df['sr'] = df.groupby(['nst', 'st'])['stb'].cumsum()
>>> df[['sr']].join([expected['sr'].rename('expected'), (df['sr'] - expected['sr']).rename('diff')])
    sr  expected  diff
0    0         0     0
1    4         4     0
2    5         5     0
3    1         1     0
4    6         6     0
5    8         8     0
6    6         6     0
7    6         6     0
8    7         7     0
9    2         3    -1
10   3         4    -1
11   6         6     0
12   6         6     0
13   7         7     0
14   2         2     0
15   0         7    -7
16   1         1     0
17   4         4     0
18   5         5     0
19   3         8    -5
20   0         0     0
21   2         2     0
22   6         6     0
23   3         5    -2

What happens at rows 9, 10, 15, 19 and 23?

For example, row 9 is the first row with b, c. If I compare it with row 3, the first row with b, a, it should be 0+2 (its stb is 2), just as row 3 is 0+1.

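User

Answer 2 · edited 2024-05-20 10:10:20

Based on the question and the discussion in the comments, here is the partial solution so far:

The sr column now gives the expected result, but nsr needs further work: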

df['sr'] = df.groupby(['mid', 'st'])['stb'].cumsum()

Result:

print(df)

   vn   pt st nst  stb  mid  sr
0   a  0.1  a   b    0    3   0
1   a  0.2  a   b    4    3   4
2   a  0.3  a   b    1    3   5
3   a  0.3  b   a    1    3   1
4   a  0.4  a   b    1    3   6
5   a  0.4  a   b    2    3   8
6   a  0.5  c   b    6    3   6
7   a  0.5  c   b    0    3   6
8   a  0.6  c   b    1    3   7
9   a  1.1  b   c    2    3   3
10  a  1.2  b   c    1    3   4
11  a  1.3  d   b    6    3   6
12  a  1.4  d   b    0    3   6
13  a  1.4  d   b    1    3   7
14  a  1.5  e   d    2    3   2
15  a  1.6  d   e    0    3   7
16  a  0.1  d   y    1    7   1
17  a  0.2  y   d    4    7   4
18  a  0.3  y   d    1    7   5
19  a  0.4  y   x    3    7   8
20  a  0.5  x   z    0    7   0
21  a  0.6  p   z    2    7   2
22  a  0.6  z   p    6    7   6
23  a  1.1  p   q    3    7   5
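Why grouping on (mid, st) rather than on the (nst, st) pair matters can be seen on two rows taken from the data (rows 3 and 9). This is only a minimal sketch; the name `toy` is made up here.

```python
import pandas as pd

# Rows 3 and 9 of the question's data: same st, different nst, same mid.
toy = pd.DataFrame({
    "mid": [3, 3],
    "st":  ["b", "b"],
    "nst": ["a", "c"],
    "stb": [1, 2],
})

# Keyed on the (nst, st) pair: the total restarts when nst changes.
by_pair = toy.groupby(["nst", "st"])["stb"].cumsum()

# Keyed on (mid, st): the total for st == "b" carries over within mid 3.
by_mid_st = toy.groupby(["mid", "st"])["stb"].cumsum()

print(by_pair.tolist())    # [1, 2]
print(by_mid_st.tolist())  # [1, 3]
```

The second grouping gives 3 for the later row, which is the value the question expects at row 9.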

Partial work on nsr:

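import numpy as np

# st differs from the previous row within the same mid (True on the first row of each mid)
m1 = df['st'].ne(df['st'].groupby(df['mid']).shift())
# the current st was the previous row's nst
m2 = df['st'].eq(df['nst'].shift())
# the current nst was the previous row's st
m3 = df['nst'].eq(df['st'].shift())
m = m1 & (m2 | m3)

# where st and nst trade places, nsr takes the previous row's sr
df['nsr'] = np.where(m, df['sr'].shift(), np.nan)

# the first row of each mid starts from 0
m11 = df['mid'] != df['mid'].shift()
df['nsr'] = np.where(m11, 0, df['nsr'])

# carry the last known value forward (the downcast argument of ffill is deprecated in recent pandas)
df['nsr'] = df['nsr'].ffill().astype(int)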

Result:

print(df)

   vn   pt st nst  stb  mid  sr  nsr
0   a  0.1  a   b    0    3   0    0
1   a  0.2  a   b    4    3   4    0
2   a  0.3  a   b    1    3   5    0
3   a  0.3  b   a    1    3   1    5
4   a  0.4  a   b    1    3   6    1
5   a  0.4  a   b    2    3   8    1
6   a  0.5  c   b    6    3   6    1
7   a  0.5  c   b    0    3   6    1
8   a  0.6  c   b    1    3   7    1
9   a  1.1  b   c    2    3   3    7
10  a  1.2  b   c    1    3   4    7
11  a  1.3  d   b    6    3   6    4
12  a  1.4  d   b    0    3   6    4
13  a  1.4  d   b    1    3   7    4
14  a  1.5  e   d    2    3   2    7
15  a  1.6  d   e    0    3   7    2
16  a  0.1  d   y    1    7   1    0
17  a  0.2  y   d    4    7   4    1
18  a  0.3  y   d    1    7   5    1
19  a  0.4  y   x    3    7   8    1
20  a  0.5  x   z    0    7   0    8
21  a  0.6  p   z    2    7   2    8
22  a  0.6  z   p    6    7   6    2
23  a  1.1  p   q    3    7   5    6

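Edit

Here is another attempt, to complete the partial work left off last time.

By adding one new piece of handling, the desired values for nsr are finally achieved: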

import numpy as np

m1 = df['st'].ne(df['st'].groupby(df['mid']).shift())
m2 = df['st'].eq(df['nst'].shift())
m3 = df['nst'].eq(df['st'].shift())
m = m1 & (m2 | m3)

df['nsr'] = np.where(m, df['sr'].shift(), np.nan)

## Handle the case where a new value of `nst` is seen AND,
## at the same time, it was NOT shifted over from `st`:
# start of new code
m21 = df['nst'] != df['nst'].shift()
m22 = df['nst'] != df['st'].shift()
df['nsr'] = np.where(m21 & m22, 0, df['nsr'])
# end of new code

# the first row of each mid starts from 0
m11 = df['mid'] != df['mid'].shift()
df['nsr'] = np.where(m11, 0, df['nsr'])

# carry the last known value forward (the downcast argument of ffill is deprecated in recent pandas)
df['nsr'] = df['nsr'].ffill().astype(int)

Result:

print(df)

   vn   pt st nst  stb  mid  sr  nsr
0   a  0.1  a   b    0    3   0    0
1   a  0.2  a   b    4    3   4    0
2   a  0.3  a   b    1    3   5    0
3   a  0.3  b   a    1    3   1    5
4   a  0.4  a   b    1    3   6    1
5   a  0.4  a   b    2    3   8    1
6   a  0.5  c   b    6    3   6    1
7   a  0.5  c   b    0    3   6    1
8   a  0.6  c   b    1    3   7    1
9   a  1.1  b   c    2    3   3    7
10  a  1.2  b   c    1    3   4    7
11  a  1.3  d   b    6    3   6    4
12  a  1.4  d   b    0    3   6    4
13  a  1.4  d   b    1    3   7    4
14  a  1.5  e   d    2    3   2    7
15  a  1.6  d   e    0    3   7    2
16  a  0.1  d   y    1    7   1    0
17  a  0.2  y   d    4    7   4    1
18  a  0.3  y   d    1    7   5    1
19  a  0.4  y   x    3    7   8    0
20  a  0.5  x   z    0    7   0    0
21  a  0.6  p   z    2    7   2    0
22  a  0.6  z   p    6    7   6    2
23  a  1.1  p   q    3    7   5    0
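As a cross-check, the same columns can also be produced with a plain loop over running totals. This is a sketch under one assumed reading of the rule (sr is the running total of stb for st within its mid; nsr is the current running total of nst within that mid, or 0 if nst has not appeared yet). It is not the mask-based method above, and the function name `running_totals` is hypothetical.

```python
import pandas as pd


def running_totals(df: pd.DataFrame) -> pd.DataFrame:
    """Loop-based cross-check of sr and nsr (assumed reading of the rule)."""
    totals = {}          # (mid, label) -> running sum of stb for that label
    sr, nsr = [], []
    for mid, st, nst, stb in zip(df["mid"], df["st"], df["nst"], df["stb"]):
        # add stb to the running total of the current st
        totals[(mid, st)] = totals.get((mid, st), 0) + int(stb)
        sr.append(totals[(mid, st)])
        # nsr is whatever nst has accumulated so far in this mid (0 if unseen)
        nsr.append(totals.get((mid, nst), 0))
    return df.assign(sr=sr, nsr=nsr)


# First five rows of the question's data, as a small demonstration.
demo = pd.DataFrame({
    "mid": [3, 3, 3, 3, 3],
    "st":  ["a", "a", "a", "b", "a"],
    "nst": ["b", "b", "b", "a", "b"],
    "stb": [0, 4, 1, 1, 1],
})
out = running_totals(demo)
print(out["sr"].tolist())   # [0, 4, 5, 1, 6]
print(out["nsr"].tolist())  # [0, 0, 0, 5, 1]
```

On the 24 rows of the question this reading gives the same sr and nsr as the expected output, but it depends on row order and will be slower than the vectorized masks on large frames.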

