<pre><code>import numpy as np
import pandas as pd
data = [(27450, 27450, 29420, "10/10/2016"),
        (29420, 36142, 29420, "10/10/2016"),
        (11, 11, 27450, "10/10/2016")]
df = pd.DataFrame(data, columns=("User_id", "Actor1", "Actor2", "Time"))
# True where the user is the first actor
mask = (df['User_id'] == df['Actor1'])
df['first actor'] = mask.astype(int)
# Pick Actor2 where the mask is True, otherwise Actor1
df['other actor'] = np.where(mask, df['Actor2'], df['Actor1'])
print(df)
</code></pre>
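<p>yields a DataFrame with two added columns: <code>first actor</code> holds 1, 0, 1 and <code>other actor</code> holds 29420, 36142, 27450.</p>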
<hr/>
<p>First, create a boolean mask that is True where <code>User_id</code> equals <code>Actor1</code>:</p>
<pre><code>In [51]: mask = (df['User_id'] == df['Actor1']); mask
Out[51]:
0     True
1    False
2     True
dtype: bool
</code></pre>
<p>Converting <code>mask</code> to int produces the first column:</p>
<pre><code>In [52]: mask.astype(int)
Out[52]:
0    1
1    0
2    1
dtype: int64
</code></pre>
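<p>As a minimal sketch of these two steps in isolation, the snippet below builds a small made-up frame (the name <code>toy</code> and its values are assumptions for illustration, not part of the question's data) and shows that the comparison is elementwise and that <code>astype(int)</code> maps True to 1 and False to 0:</p>

```python
import pandas as pd

# Hypothetical two-column frame, used only to illustrate the mask step.
toy = pd.DataFrame({"User_id": [1, 2, 3], "Actor1": [1, 5, 3]})

# Elementwise comparison gives a boolean Series aligned on the index.
toy_mask = toy["User_id"] == toy["Actor1"]

# astype(int) maps True -> 1 and False -> 0.
toy_flags = toy_mask.astype(int)

print(toy_mask.tolist())   # [True, False, True]
print(toy_flags.tolist())  # [1, 0, 1]
```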
<p>Then use <code>np.where</code> to choose between the two values. <code>np.where(mask, A, B)</code> returns an array whose <code>i</code>th value is <code>A[i]</code> if <code>mask[i]</code> is True, and <code>B[i]</code> otherwise. Thus,
<code>np.where(mask, df['Actor2'], df['Actor1'])</code> takes the value from <code>Actor2</code> where <code>mask</code> is True, and the value from <code>Actor1</code> otherwise:</p>
<pre><code>In [53]: np.where(mask, df['Actor2'], df['Actor1'])
Out[53]: array([29420, 36142, 27450])
</code></pre>
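<p>The steps above could be bundled into a small helper. This is only a sketch: the function name <code>add_actor_columns</code> and the plain arrays <code>cond</code>, <code>a</code>, <code>b</code> are hypothetical names introduced here, and the helper assumes the frame has <code>User_id</code>, <code>Actor1</code> and <code>Actor2</code> columns. It works on a copy, so the input frame is left unchanged:</p>

```python
import numpy as np
import pandas as pd


def add_actor_columns(frame):
    """Hypothetical helper: return a copy with 'first actor' and 'other actor' added."""
    out = frame.copy()
    mask = out["User_id"] == out["Actor1"]
    out["first actor"] = mask.astype(int)
    # Where the user is Actor1, the other actor is Actor2, and vice versa.
    out["other actor"] = np.where(mask, out["Actor2"], out["Actor1"])
    return out


# np.where on plain arrays: take from a where cond is True, else from b.
cond = np.array([True, False, True])
a = np.array([10, 20, 30])
b = np.array([-1, -2, -3])
picked = np.where(cond, a, b)
print(picked.tolist())  # [10, -2, 30]

# Same data as in the answer above.
data = [(27450, 27450, 29420, "10/10/2016"),
        (29420, 36142, 29420, "10/10/2016"),
        (11, 11, 27450, "10/10/2016")]
df = pd.DataFrame(data, columns=["User_id", "Actor1", "Actor2", "Time"])
result = add_actor_columns(df)
print(result["other actor"].tolist())  # [29420, 36142, 27450]
```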