<h2>Method 1: lowercase, sort, and drop duplicates</h2>
<p><em>This also works with many columns</em></p>
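<pre><code># Columns that identify a duplicate, compared case-insensitively
subset = ['firstname', 'lastname']
# Lowercase the key columns so 'Foo' and 'foo' compare equal
df[subset] = df[subset].apply(lambda x: x.str.lower())
# Sort so a row with a non-null bank comes first within each key (NaN sorts last)
df.sort_values(subset + ['bank'], inplace=True)
# Keep only the first row for each key
df.drop_duplicates(subset, inplace=True)
</code></pre>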
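<p>The same steps can be wrapped so the original frame is left untouched. This is only a sketch: the helper name <code>drop_duplicates_ignore_case</code>, its <code>prefer</code> parameter, and the sample data are made up for illustration and do not come from the question.</p>

```python
import numpy as np
import pandas as pd


def drop_duplicates_ignore_case(df, subset, prefer):
    """Hypothetical helper: drop rows that repeat `subset`, ignoring case.

    Rows are sorted by `prefer` within each key, so a row with a non-null
    value there is the one kept (NaN sorts last by default).
    """
    out = df.copy()  # work on a copy instead of using inplace=True
    out[subset] = out[subset].apply(lambda col: col.str.lower())
    out = out.sort_values(subset + [prefer])
    return out.drop_duplicates(subset).reset_index(drop=True)


# Made-up sample data, not taken from the question
sample = pd.DataFrame({
    'firstname': ['Ann', 'ann', 'Bob'],
    'lastname': ['Lee', 'LEE', 'Ray'],
    'bank': [np.nan, 'xyz', 'abc'],
})

result = drop_duplicates_ignore_case(sample, ['firstname', 'lastname'], 'bank')
print(result)
```

<p>Returning a new frame keeps <code>sample</code> available for comparison; the key columns in the result are lowercased, as in the code above.</p>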
<hr/>
<h2>Method 2: groupby, agg, first</h2>
<p><em>Does not generalize easily to many columns</em></p>
<pre><code># Group on the lowercased names; 'first' takes the first non-null value in each group
df.groupby([df['firstname'].str.lower(), df['lastname'].str.lower()], sort=False)\
    .agg({'email': 'first', 'bank': 'first'})\
    .reset_index()
</code></pre>
<pre><code>  firstname lastname    email bank
0   foo bar  foo bar  Foo bar  xyz
1   bar bar      bar      Bar  abc
</code></pre>