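<p>Here is an approach that avoids merging each dataframe multiple times: stack the original dataframe's <code>id*</code> columns into a single <code>id</code> column, then merge each dataframe on that column once. I can't guarantee this will be faster on your data than the more direct approach (but let me know if it isn't).</p>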
<pre><code>import numpy as np
import pandas as pd

# Set some initial arguments (you might do this programmatically instead)
id_cols = ['id1', 'id2']
df_list = [df_1, df_2]
q_list = ['q_{0}'.format(n + 1) for n in range(len(df_list))]

# Make a new df stacking all the id columns
s = df_id[id_cols].stack()
s.name = 'id'
df = pd.DataFrame(s).reset_index()

# Merge each dataframe on the id column once
for n, df_n in enumerate(df_list):
    # Rename on a copy so the input dataframes are left unchanged
    df_n = df_n.rename(columns={'q': 'q_{0}'.format(n + 1)})
    df = df.merge(df_n, left_on='id', right_on='uid{0}'.format(n + 1), how='left')
    del df['uid{0}'.format(n + 1)]

# If there are multiple values that match, reconcile them
df = df.set_index(['level_0', 'level_1']).unstack(level=-1)
# Group the q_* columns by name via a transpose, since groupby(axis=1)
# is deprecated in recent pandas versions
df = df.loc[:, q_list].T.groupby(level=0).max().T.replace({None: np.nan})

# Re-merge with the original dataframe
result = df_id.merge(df, left_index=True, right_index=True)
</code></pre>
<p>The result is as follows:</p>
^{pr2}$
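<p>To try the stack-then-merge idea end to end, here is a minimal self-contained sketch. The sample values in <code>df_id</code>, <code>df_1</code> and <code>df_2</code> are hypothetical (they are not from the original question), and the final step uses a plain <code>groupby</code> on the row label instead of <code>unstack</code>, which is a simplification rather than the exact code above.</p>

```python
import pandas as pd

# Hypothetical sample data, invented for illustration only
df_id = pd.DataFrame({"id1": [1, 2, 3], "id2": [4, 5, 6]})
df_1 = pd.DataFrame({"uid1": [1, 5], "q": [10.0, 20.0]})
df_2 = pd.DataFrame({"uid2": [3, 4], "q": [30.0, 40.0]})

id_cols = ["id1", "id2"]
# (dataframe, key column, name for its renamed q column)
lookups = [(df_1, "uid1", "q_1"), (df_2, "uid2", "q_2")]

# Stack the id columns into one long 'id' column.
# reset_index() yields: level_0 (original row), level_1 (id column name), id
long_df = df_id[id_cols].stack().rename("id").reset_index()

# Merge each lookup dataframe exactly once on the stacked id column
for frame, key, q_name in lookups:
    renamed = frame.rename(columns={"q": q_name})
    long_df = long_df.merge(renamed, left_on="id", right_on=key, how="left")
    long_df = long_df.drop(columns=key)

# Collapse back to one row per original row; max() skips NaN,
# so a match found through any id column survives
q_wide = long_df.groupby("level_0")[["q_1", "q_2"]].max()

# Attach the q columns to the original dataframe by row index
result = df_id.join(q_wide)
print(result)
```

<p>With this sample data, row 0 picks up <code>q_1 = 10</code> through <code>id1</code> and <code>q_2 = 40</code> through <code>id2</code>, while cells with no match in either id column stay <code>NaN</code>.</p>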