<p>Instead, consider using <code>concat</code>, <code>groupby</code>, and <code>agg</code>, and then writing an aggregation function that picks the right value:</p>
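<pre><code>import io

import pandas as pd

# Two CSV snapshots of the same store table; store 9191 appears in both.
t1 = """store_id,address,phone
9191,9827 Park st,999999999
8181,543 Hello st,1111111111"""
t2 = """store_id,address,phone
9191,9827 Park st Apt82,999999999
7171,912 John st,87282728282"""

# The inputs are str, so wrap them in StringIO (BytesIO would need bytes).
df1 = pd.read_csv(io.StringIO(t1))
df2 = pd.read_csv(io.StringIO(t2))

# Stack both tables and renumber the rows so every index label is unique.
df = pd.concat([df1, df2]).reset_index(drop=True)


def f(s):
    # Keep the longest string in the group (the most detailed address).
    loc = s.str.len().idxmax()
    return s.loc[loc]


# One row per (store_id, phone); address is reduced with f.
df.groupby(["store_id", "phone"]).agg(f)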
</code></pre>
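<p>To see what the aggregation function does to a single group, here is a minimal sketch that applies the same "longest string wins" rule to one Series. The name <code>longest</code> and the sample Series are illustrative assumptions, not part of the original answer; the rule itself (prefer the longer address) is only a heuristic for "most detailed".</p>

```python
import pandas as pd


def longest(s: pd.Series) -> str:
    """Return the longest string in the Series (the first one on ties)."""
    # idxmax gives the index label of the first maximum length.
    return s.loc[s.str.len().idxmax()]


# Hypothetical group: the two address variants recorded for store 9191.
addresses = pd.Series(["9827 Park st", "9827 Park st Apt82"], index=[0, 2])
picked = longest(addresses)
print(picked)
```

<p>Because <code>idxmax</code> returns an index label, the lookup uses <code>.loc</code>, and the index must be unique, which is why the concatenated frame is reset with <code>reset_index(drop=True)</code> before grouping.</p>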