Answers to the question: comparing tuple values in a Pandas DataFrame (Python)


1 answer

Anonymous, 1 day ago

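You can do it like this:
<pre><code>In [1]:
df['a'].where(df.apply(lambda row: row['a'][1] > row['b'][1], axis=1), df['b'])

Out[1]:
0        (chicken wing, 1)
1            (mason, 0.97)
2    (lost in space, 0.47)
3           (marvelous, 1)
Name: a, dtype: object
</code></pre>
Here a lambda compares the tuples in each row to build a boolean mask, which is then passed to <a href="http://pandas.pydata.org/pandas-docs/stable/generated/pandas.Series.where.html#pandas.Series.where" rel="nofollow noreferrer"><code>where</code></a>: where the mask is <code>True</code> the value from column <code>a</code> is kept, otherwise the value from column <code>b</code> is used. The output of <code>apply</code> on its own is that boolean mask, which for this data is <code>True</code>, <code>False</code>, <code>False</code>, <code>True</code>.

A more efficient approach is to extract the percentages into separate columns, so that the comparison itself can use vectorized operations:
<pre><code>In [4]:
# second element of each tuple, stored as a plain numeric column
df['a_%'] = df['a'].apply(lambda x: x[1])
df['b_%'] = df['b'].apply(lambda x: x[1])
df

Out[4]:
                   a                      b   a_%   b_%
0  (chicken wing, 1)          (saucy, 0.35)  1.00  0.35
1     (burger, 0.85)          (mason, 0.97)  0.85  0.97
2    (burping, 0.37)  (lost in space, 0.47)  0.37  0.47
3     (marvelous, 1)     (tremendous, 0.85)  1.00  0.85

In [5]:
df['max_value'] = df['a'].where(df['a_%'] > df['b_%'], df['b'])
df

Out[5]:
                   a                      b   a_%   b_%              max_value
0  (chicken wing, 1)          (saucy, 0.35)  1.00  0.35      (chicken wing, 1)
1     (burger, 0.85)          (mason, 0.97)  0.85  0.97          (mason, 0.97)
2    (burping, 0.37)  (lost in space, 0.47)  0.37  0.47  (lost in space, 0.47)
3     (marvelous, 1)     (tremendous, 0.85)  1.00  0.85         (marvelous, 1)
</code></pre>
You can also define a custom function that handles a dynamic number of columns and uses <code>max</code>:
<pre><code>In [11]:
def func(x):
    # x is one row; collect the second element of every tuple in it
    vals = [y[1] for y in x]
    # .iloc selects by position, since the row is labelled by column name
    return x.iloc[vals.index(max(vals))]

# pass only the tuple-valued columns (extend the list as needed)
df[['a', 'b']].apply(func, axis=1)

Out[11]:
0        (chicken wing, 1)
1            (mason, 0.97)
2    (lost in space, 0.47)
3           (marvelous, 1)
dtype: object
</code></pre>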
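For anyone who wants to run the mask-and-`where` approach end to end, here is a minimal self-contained sketch. The DataFrame is reconstructed from the output shown in the answer, so treat the sample values as an assumption about the asker's data; the names `a_score`, `b_score`, `mask` and `result` are made up for this illustration.

```python
import pandas as pd

# Sample data reconstructed from the answer's printed output (an assumption).
df = pd.DataFrame({
    "a": [("chicken wing", 1), ("burger", 0.85), ("burping", 0.37), ("marvelous", 1)],
    "b": [("saucy", 0.35), ("mason", 0.97), ("lost in space", 0.47), ("tremendous", 0.85)],
})

# Pull the second element of each tuple into a plain numeric Series.
a_score = pd.Series([t[1] for t in df["a"]], index=df.index, dtype=float)
b_score = pd.Series([t[1] for t in df["b"]], index=df.index, dtype=float)

# Boolean mask: True where column a has the strictly larger score.
mask = a_score > b_score

# Keep the tuple from a where the mask is True, otherwise take the one from b.
result = df["a"].where(mask, df["b"])

print(mask.tolist())
print(result.tolist())
```

Because the comparison is strict, a tie between the two scores falls through to column `b`; use `>=` instead if ties should favor column `a`.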
