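c = ['Subject', 'Verb', 'Object']

def f(x):
    # Flag rows that repeat the key and fall less than 5 days after the previous row
    return x[c].duplicated() & x.Date.diff().dt.days.lt(5)

# Sort by key and then by date so diff() compares consecutive dates within each group
df = df.sort_values(c + ['Date'])
df[~df.groupby(c).apply(f).values]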
  Subject     Verb Object       Date
0    Bill      Ate   Food 2015-07-11
1   Steve  Painted  House 2011-08-12
3   Steve  Painted  House 2011-08-25
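Use duplicated and diff together with groupby to determine which rows to remove: a row is dropped when it repeats the same Subject, Verb, and Object and falls less than 5 days after the previous row in its group.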
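The same rule can probably be expressed without apply. The following is a minimal sketch, not the answer's own code: the sample frame is hypothetical (the original input is not shown here), and the names keys, gap_days, and too_close are made up for illustration. It relies on the fact that within a group every row after the first is a duplicate of the key, so the per-group date difference alone decides which rows to drop.

```python
import pandas as pd

# Hypothetical sample data: row 2 falls one day after row 1 for the same key.
df = pd.DataFrame({
    "Subject": ["Bill", "Steve", "Steve", "Steve"],
    "Verb": ["Ate", "Painted", "Painted", "Painted"],
    "Object": ["Food", "House", "House", "House"],
    "Date": pd.to_datetime(
        ["2015-07-11", "2011-08-12", "2011-08-13", "2011-08-25"]
    ),
})

keys = ["Subject", "Verb", "Object"]

# Sort by key and date so diff() compares each row with the previous date
# in its own group.
df_sorted = df.sort_values(keys + ["Date"])

# Days since the previous row with the same key; NaN for the first row of
# each group, which lt(5) treats as False.
gap_days = df_sorted.groupby(keys)["Date"].diff().dt.days

# A row is dropped when it repeats the key less than 5 days after the
# previous row.
too_close = gap_days.lt(5)
result = df_sorted[~too_close]
print(result)
```

Because groupby(...).diff() returns a Series aligned to the original index, the mask can be applied directly and does not depend on the order in which apply concatenates its group results.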