<p>Suppose I have a dataframe of transactions and customers:</p>
<pre><code>import pandas as pd

# One row per transaction; the default RangeIndex runs from 0 to 14.
df = pd.DataFrame({
    'shop': ['McDonalds'] * 4 + ['Burger King'] * 5 + ['Trump Golf Course'] * 6,
    'Customer': ['John Ryan', 'Jim Bob', 'Mary Ryan', 'Michael Patric',
                 'John Ryan', 'Jim Bob', 'Mary Ryan', 'Sean Connery', 'Brad Pitt',
                 'John Ryan', 'John Ryan', 'Michael Patric', 'Mary Ryan', 'John Ryan', 'Jim Bob'],
    'Customer ID': [1, 2, 3, 4, 1, 2, 3, 5, 6, 1, 1, 4, 3, 1, 2],
    'Amount': [50, 32, 15, 65, 32, 51, 54, 84, 52, 51, 2, 32, 54, 87, 65],
})
print(df)
                 shop        Customer  Customer ID  Amount
0           McDonalds       John Ryan            1      50
1           McDonalds         Jim Bob            2      32
2           McDonalds       Mary Ryan            3      15
3           McDonalds  Michael Patric            4      65
4         Burger King       John Ryan            1      32
5         Burger King         Jim Bob            2      51
6         Burger King       Mary Ryan            3      54
7         Burger King    Sean Connery            5      84
8         Burger King       Brad Pitt            6      52
9   Trump Golf Course       John Ryan            1      51
10  Trump Golf Course       John Ryan            1       2
11  Trump Golf Course  Michael Patric            4      32
12  Trump Golf Course       Mary Ryan            3      54
13  Trump Golf Course       John Ryan            1      87
14  Trump Golf Course         Jim Bob            2      65
</code></pre>
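<p>I want to extract or flag the Burger King customers who did not shop at McDonalds (in this example, Sean Connery and Brad Pitt).</p>
<p>I tried creating a mask where <code>shop == McDonalds</code> and collecting the customer IDs:</p>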
<pre><code>mask1 = df.shop == 'McDonalds'
mcdonalds_customer_ids = df[mask1]['Customer ID'].values
array([1, 2, 3, 4], dtype=int64)
</code></pre>
<p>Then I tried a second mask where <code>shop == 'Burger King'</code> and the customer ID is not in the list of McDonalds customer IDs:</p>
<pre><code>mask = (df['shop'] == 'Burger King' & df['Customer ID'] not in mcdonalds_customer_ids)
</code></pre>
<p>I get the following errors:</p>
<pre><code>TypeError: ufunc 'bitwise_and' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''
TypeError: cannot compare a dtyped [int64] array with a scalar of type [bool]
</code></pre>
<p>I also tried using <code>np.where</code>, but that got even messier.</p>
<p>My expected output is just the two Burger King customers who did not shop at McDonalds:</p>
<pre><code>          shop      Customer  Customer ID  Amount
7  Burger King  Sean Connery            5      84
8  Burger King     Brad Pitt            6      52
</code></pre>
<p>Or, flagged with <code>np.where</code>:</p>
<pre><code>          shop      Customer  Customer ID  Amount  No_McDonalds
7  Burger King  Sean Connery            5      84          True
8  Burger King     Brad Pitt            6      52          True
</code></pre>
<p>I could do this with a function, but I was hoping to vectorize it somehow. I'm completely stuck, so any help is appreciated.</p>
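<p>For reference, here is a minimal sketch of one vectorized approach. It is not the only way to do this. It assumes the failing mask breaks for two reasons: <code>&amp;</code> binds more tightly than <code>==</code>, so each comparison needs its own parentheses, and <code>not in</code> is not element-wise on a Series, so it is replaced with <code>~Series.isin(...)</code>. The names <code>mcdonalds_ids</code>, <code>result</code>, and <code>flagged</code> are illustrative only.</p>

```python
import pandas as pd

# Same data as in the question, built from plain lists.
df = pd.DataFrame({
    "shop": ["McDonalds"] * 4 + ["Burger King"] * 5 + ["Trump Golf Course"] * 6,
    "Customer": ["John Ryan", "Jim Bob", "Mary Ryan", "Michael Patric",
                 "John Ryan", "Jim Bob", "Mary Ryan", "Sean Connery", "Brad Pitt",
                 "John Ryan", "John Ryan", "Michael Patric", "Mary Ryan",
                 "John Ryan", "Jim Bob"],
    "Customer ID": [1, 2, 3, 4, 1, 2, 3, 5, 6, 1, 1, 4, 3, 1, 2],
    "Amount": [50, 32, 15, 65, 32, 51, 54, 84, 52, 51, 2, 32, 54, 87, 65],
})

# IDs of everyone who has at least one McDonalds transaction.
mcdonalds_ids = df.loc[df["shop"] == "McDonalds", "Customer ID"].unique()

# Parenthesise each comparison before combining with &,
# and use ~isin(...) as the element-wise "not in".
mask = (df["shop"] == "Burger King") & ~df["Customer ID"].isin(mcdonalds_ids)

# Option 1: extract only the matching rows.
result = df[mask]
print(result)

# Option 2: keep every row and add a boolean flag column instead.
flagged = df.assign(No_McDonalds=mask)
print(flagged[flagged["No_McDonalds"]])
```

<p>Since <code>mask</code> is already a boolean Series, it can be used directly as the flag column, so <code>np.where</code> is probably only needed if the flag should hold values other than <code>True</code>/<code>False</code>.</p>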