Remove rows that sum to zero in one column but are otherwise duplicated in another column

import numpy as np
import pandas as pd

df = pd.DataFrame({
    'ID': ['A001', 'A001', 'A001', 'A002', 'A002', 'A003', 'A003', 'A004', 'A004', 'A004', 'A005', 'A005'],
    'Val1': [2, 2, 2, 5, 6, 8, 8, 3, 3, 3, 7, 7],
    'Val2': [100, -100, 50, -40, 40, 60, -50, 10, -10, 10, 15, 15],
})

      ID  Val1  Val2
0   A001     2   100
1   A001     2  -100
2   A001     2    50
3   A002     5   -40
4   A002     6    40
5   A003     8    60
6   A003     8   -50
7   A004     3    10
8   A004     3   -10
9   A004     3    10
10  A005     7    15
11  A005     7    15

3 answers

User

Answer 1 · edited 2024-07-08 18:11:41

I added some comments to the code, so hopefully my line of thinking is clear:

# helper column holding the absolute value of Val2
cond = df.assign(temp=df.Val2.abs())
# sorting on the absolute value places values that differ
# only in sign next to each other
cond = cond.sort_values(["ID", "Val1", "temp"])

# within each (ID, temp) group, the cumulative sum reaches zero
# where a value cancels the ones before it
cond["check"] = cond.groupby(["ID", "temp"]).Val2.cumsum()
cond["check"] = np.where(cond.check != 0, np.nan, cond.check)

# back-filling by one row also marks the row just before each zero,
# so both members of a cancelling pair carry check == 0
cond["check"] = cond["check"].bfill(limit=1)

# this is where your other condition comes in:
# drop a row only if its (ID, Val1) is duplicated
# and it belongs to a pair that sums to zero
cond.loc[
    ~(cond.duplicated(["ID", "Val1"], keep=False) & (cond.check == 0)),
    ["ID", "Val1", "Val2"],
]



      ID  Val1  Val2
2   A001     2    50
3   A002     5   -40
4   A002     6    40
6   A003     8   -50
5   A003     8    60
9   A004     3    10
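To make the role of the check column concrete, here is a minimal sketch that rebuilds it on the sample data and lists the rows it flags. It assumes the df from the question and swaps in Series.where for the np.where step, which should behave the same here; the names running, check and flagged are introduced only for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "ID": ["A001", "A001", "A001", "A002", "A002", "A003",
           "A003", "A004", "A004", "A004", "A005", "A005"],
    "Val1": [2, 2, 2, 5, 6, 8, 8, 3, 3, 3, 7, 7],
    "Val2": [100, -100, 50, -40, 40, 60, -50, 10, -10, 10, 15, 15],
})

# same preparation as in the answer: sort so that +x and -x are adjacent
cond = df.assign(temp=df.Val2.abs()).sort_values(["ID", "Val1", "temp"])

# running sum per (ID, |Val2|); keep only the zeros, everything else becomes NaN
running = cond.groupby(["ID", "temp"]).Val2.cumsum()
check = running.where(running == 0)

# extend each zero one row backwards so both rows of a pair are marked
check = check.bfill(limit=1)

# original row labels that ended up marked as part of a cancelling pair
flagged = sorted(check[check == 0].index)
print(flagged)
```

Rows 3 and 4 are marked as well, since -40 and 40 cancel, but their Val1 values differ, so the duplicated(["ID", "Val1"]) condition in the answer keeps them.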

User

Answer 2 · edited 2024-07-08 18:11:41

How about:

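# rolling sum over each pair of consecutive Val2 values within an ID
temp = df.groupby('ID')[['Val2']].rolling(2).sum()
# index entries are (ID, original row label); keep those whose pair sums to zero
ix = temp[temp.Val2==0].index
ar = np.array([x[1] for x in ix.values])
# drop each such row together with the row before it
ix2 = ar.tolist() + (ar-1).tolist()
df.drop(ix2, inplace=True)
# then keep only the first row of each remaining (ID, Val1) duplicate
df.drop_duplicates(['ID', 'Val1'], keep='first', inplace=True)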

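Note that this answer follows the text of your question: the 'Val2' values in rows 8 and 9 also sum to zero, which is not what the "expected output" you posted shows.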
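The remark about rows 8 and 9 can be checked with a short sketch. It assumes the sample df from the question and only inspects the rolling sums without dropping anything; the names rolled and zero_pairs are made up for this illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "ID": ["A001", "A001", "A001", "A002", "A002", "A003",
           "A003", "A004", "A004", "A004", "A005", "A005"],
    "Val1": [2, 2, 2, 5, 6, 8, 8, 3, 3, 3, 7, 7],
    "Val2": [100, -100, 50, -40, 40, 60, -50, 10, -10, 10, 15, 15],
})

# sum of each row's Val2 with the previous row's Val2, restarted per ID;
# the result is indexed by (ID, original row label)
rolled = df.groupby("ID")["Val2"].rolling(2).sum()

# original labels of the rows that close a pair summing to zero
zero_pairs = [label for (_, label) in rolled[rolled == 0].index]
print(zero_pairs)
```

Label 9 shows up because rows 8 and 9 (-10 and 10) cancel just like rows 7 and 8 do, which is why this approach removes all three A004 rows.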

User

Answer 3 · edited 2024-07-08 18:11:41

Use groupby and cumsum to find the indices at which Val2 sums to zero:

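# True where the running sum of Val2 within an (ID, Val1) group reaches zero
s = df.groupby(['ID', 'Val1']).Val2.cumsum() == 0
# positions of those rows (equal to the index labels for the default RangeIndex)
n = np.where(s)[0]
# remove each such row and the row before it
to_remove = np.concatenate((n, n - 1))
new_df = df[~df.index.isin(to_remove)]
new_df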

      ID  Val1  Val2
2   A001     2    50
3   A002     5   -40
4   A002     6    40
5   A003     8    60
6   A003     8   -50
9   A004     3    10
10  A005     7    15
11  A005     7    15
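The code above subtracts 1 from row positions, which relies on the default RangeIndex and on the preceding row being in the same group. A possible variant, sketched here as an assumption rather than as part of the original answer, builds the same selection from boolean masks with a per-group shift; hits_zero and before_zero are hypothetical names.

```python
import pandas as pd

df = pd.DataFrame({
    "ID": ["A001", "A001", "A001", "A002", "A002", "A003",
           "A003", "A004", "A004", "A004", "A005", "A005"],
    "Val1": [2, 2, 2, 5, 6, 8, 8, 3, 3, 3, 7, 7],
    "Val2": [100, -100, 50, -40, 40, 60, -50, 10, -10, 10, 15, 15],
})

keys = [df["ID"], df["Val1"]]

# True where the running sum within an (ID, Val1) group reaches zero
hits_zero = df.groupby(["ID", "Val1"])["Val2"].cumsum() == 0

# True for the row directly before such a row, looking only inside the same group
before_zero = (
    hits_zero.astype(int)
    .groupby(keys)
    .shift(-1, fill_value=0)
    .astype(bool)
)

# keep everything that is neither the zero row nor its predecessor
new_df = df[~(hits_zero | before_zero)]
print(new_df)
```

On the sample data this selects the same rows as the answer; it differs only when the index is not 0..n-1 or when a group starts with a Val2 of 0.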
