<p>I wrote a function and applied it to the df row by row. This is usually a little faster than a plain loop.</p>
<pre><code>import pandas as pd
import numpy as np

def vote(row):
    # Count the Positive and Negative votes in this row
    pos = np.sum(row.values == 'Positive')
    neg = np.sum(row.values == 'Negative')
    if pos > neg:
        return 'Positive'
    elif pos < neg:
        return 'Negative'
    else:
        # Tie: fall back to the first vote
        return row['Vote1']

# Create the dataframe
df = pd.DataFrame()
df['id']=[123,223,323,423]
df['Vote1']=['Positive']*4
df['Vote2']=['Negative']*3+['Positive']
df['Vote3']=['Positive','Neutral','Negative','']
df = df.set_index('id')
df['Winner'] = df.apply(vote,axis=1)
</code></pre>
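<p>If the row-wise <code>apply</code> is still too slow, the same voting rule can probably be written without a per-row function. The following is only a sketch under the same assumptions as the code above (three vote columns, ties resolved by <code>Vote1</code>). The names <code>vote_cols</code>, <code>pos</code>, and <code>neg</code> are introduced here for illustration and are not part of the original answer.</p>

```python
import pandas as pd

# Same sample data as in the answer above.
df = pd.DataFrame(
    {
        "Vote1": ["Positive"] * 4,
        "Vote2": ["Negative"] * 3 + ["Positive"],
        "Vote3": ["Positive", "Neutral", "Negative", ""],
    },
    index=pd.Index([123, 223, 323, 423], name="id"),
)

# Hypothetical helper name: the columns that hold votes.
vote_cols = ["Vote1", "Vote2", "Vote3"]

# Count Positive and Negative votes per row for the whole frame at once.
pos = (df[vote_cols] == "Positive").sum(axis=1)
neg = (df[vote_cols] == "Negative").sum(axis=1)

# Start from Vote1 (the tie-break value), then overwrite the rows
# where one side has a strict majority over the other.
winner = df["Vote1"].mask(pos > neg, "Positive").mask(pos < neg, "Negative")
df["Winner"] = winner
```

<p>The comparisons and sums run column-wise over the whole frame, so no Python function is called per row. Whether this is actually faster depends on the size of the data, so it is worth timing both versions.</p>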
<p>Result:</p>
^{pr2}$