<p>There are probably many ways to solve this. I will use <a href="https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.transform.html" rel="nofollow noreferrer"><code>transform</code></a> and <a href="https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.groupby.html" rel="nofollow noreferrer"><code>groupby</code></a>.</p>
<p>Let's first filter the DataFrame to get the users who are in more than 3 systems. Since you said there won't be duplicates, we can simply use a count!</p>
<pre><code>more_than_3 = df1[df1.groupby('email')['email'].transform('count') > 3].sort_values(['email', 'System'])
# sort_values only makes the output more readable by putting everything in order.
# output below
employee_number email System
2 10441 doug.wever@test.com System1
4 14012 doug.wever@test.com System2
7 82189 doug.wever@test.com System3
10 87165 doug.wever@test.com System4
12 88165 doug.wever@test.com System5
</code></pre>
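<p>To see why the filter works, it may help to look at what <code>transform</code> returns: a Series aligned to the original rows, so it can be compared and used directly as a boolean mask. The sketch below uses made-up sample data (not the data from the question). It also shows a <code>nunique</code> variant that could be used if the no-duplicates assumption did not hold; that variant is my assumption, not something the question asked for.</p>

```python
import pandas as pd

# Hypothetical sample data, only for illustration.
# a@test.com appears twice in System1 (a duplicate on purpose).
df1 = pd.DataFrame({
    "email": ["a@test.com", "a@test.com", "a@test.com", "b@test.com"],
    "System": ["System1", "System1", "System2", "System1"],
})

# Rows per email, broadcast back onto every row of df1.
row_counts = df1.groupby("email")["email"].transform("count")

# Distinct systems per email, also aligned to df1's index.
# Only needed if the same email could repeat within one system.
system_counts = df1.groupby("email")["System"].transform("nunique")

print(row_counts.tolist())
print(system_counts.tolist())
```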
<p>Then we simply invert the logic for everyone else:</p>
<pre><code>others = df1[df1.groupby('email')['email'].transform('count') <= 3].sort_values(['email', 'System'])
# output
employee_number email System
14 87944 John.taver@test.com System5
3 12374 Rich.flipt@test.com System2
1 8304 bill.riley@test.com System1
13 87944 jared.Rich@test.com System5
11 87844 jose.taver@test.com System4
8 86099 krish.ragg@test.com System3
0 807 marg.prent@test.com System1
5 15906 marg.prent@test.com System2
9 86646 marg.prent@test.com System4
6 16223 mark.johns@test.com System3
</code></pre>
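<p>Since both filters rely on the same per-email count, one possible variation is to compute the mask once and negate it with <code>~</code> for the second group. This is a minimal sketch on hypothetical data; the names <code>counts</code> and <code>mask</code> are mine.</p>

```python
import pandas as pd

# Hypothetical sample data; column names follow the frames shown above.
df1 = pd.DataFrame({
    "employee_number": [1, 2, 3, 4, 5, 6],
    "email": ["a@test.com"] * 4 + ["b@test.com"] * 2,
    "System": ["System1", "System2", "System3", "System4",
               "System1", "System2"],
})

# Compute the per-row count once and reuse it for both groups.
counts = df1.groupby("email")["email"].transform("count")
mask = counts > 3

more_than_3 = df1[mask].sort_values(["email", "System"])
others = df1[~mask].sort_values(["email", "System"])  # ~mask means count <= 3
```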
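<p>To send these DataFrames to Excel, you can use <a href="https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.to_excel.html" rel="nofollow noreferrer"><code>to_excel</code></a>. If you want them in the same workbook, write both through a single <code>pd.ExcelWriter</code> and give each one a different <code>sheet_name</code> argument.</p>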
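<p>A minimal sketch of writing both frames into one workbook. The helper name <code>write_groups</code>, the sheet names, and the file name are hypothetical, and it assumes an Excel engine such as <code>openpyxl</code> is installed for <code>.xlsx</code> files.</p>

```python
import pandas as pd


def write_groups(more_than_3, others, path):
    """Write both DataFrames to one workbook, one sheet each.

    `path` should end in .xlsx so pandas can pick an engine
    (assumes openpyxl or a similar engine is installed).
    """
    with pd.ExcelWriter(path) as writer:
        more_than_3.to_excel(writer, sheet_name="more_than_3", index=False)
        others.to_excel(writer, sheet_name="others", index=False)


# Example call (hypothetical file name):
# write_groups(more_than_3, others, "users_by_system_count.xlsx")
```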