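<p>The same duplicate rows appear in both <code>df1</code> and <code>df2</code>, so every duplicate in one frame matches every duplicate in the other, and the merged <code>df</code> ends up with twice as many rows as either input. A simple fix is to make one of the DataFrames unique with <code>drop_duplicates</code> before calling <code>merge</code>:</p>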
<pre><code># Deduplicate df1 first so each key combination appears only once on the left side
df = pd.merge(
    df1.drop_duplicates(), df2,
    left_on=['ColumnA', 'ColumnB', 'ColumnC', 'ColumnD'],
    right_on=['ColumnE', 'ColumnF', 'ColumnG', 'ColumnH'],
    how='outer',
)
df
Out[742]:
   ColumnA  ColumnB  ColumnC  ColumnD  ColumnE  ColumnF  ColumnG  ColumnH
0        1        2        3        4        1        2        3        4
1        1        2        3        4        1        2        3        4
</code></pre>
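<p>To see the row counts for yourself, here is a minimal, self-contained sketch. The sample values in <code>df1</code> and <code>df2</code> are assumptions chosen to match the output shown in this answer (the same row repeated twice in each frame), and the names <code>left_keys</code>, <code>right_keys</code>, <code>naive</code> and <code>deduped</code> are only illustrative:</p>

```python
import pandas as pd

# Hypothetical sample data: each frame contains the same row twice.
df1 = pd.DataFrame({'ColumnA': [1, 1], 'ColumnB': [2, 2],
                    'ColumnC': [3, 3], 'ColumnD': [4, 4]})
df2 = pd.DataFrame({'ColumnE': [1, 1], 'ColumnF': [2, 2],
                    'ColumnG': [3, 3], 'ColumnH': [4, 4]})

left_keys = ['ColumnA', 'ColumnB', 'ColumnC', 'ColumnD']
right_keys = ['ColumnE', 'ColumnF', 'ColumnG', 'ColumnH']

# Plain merge: each of the 2 left rows matches each of the 2 right rows.
naive = pd.merge(df1, df2, left_on=left_keys, right_on=right_keys, how='outer')

# Deduplicate the left frame first: 1 left row matches the 2 right rows.
deduped = pd.merge(df1.drop_duplicates(), df2,
                   left_on=left_keys, right_on=right_keys, how='outer')

print(len(naive), len(deduped))
```

<p>With this data the plain merge yields 2 x 2 = 4 rows, while the deduplicated merge yields 1 x 2 = 2 rows. If the duplicates in <code>df2</code> are not wanted either, calling <code>drop_duplicates</code> on it as well would leave a single row.</p>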