<p>I assume your data is not quite correct yet: your expected output is achievable, but the data as posted does not fit your logic.</p>
<p><code>second_df</code> is missing a third <code>key column</code>, namely <code>Capacity</code>. If we add this column and perform a <code>left merge</code>, we get the expected output.</p>
<p>By the way, we do not need to set the columns as the index, so the solution looks like this.</p>
<pre><code>import pandas as pd

# Clean up and create correct dataframes
first_df = pd.DataFrame(
    [['2001', 'Abu Dhabi', '100-', '462'],
     ['2001', 'Abu Dhabi', '100', '44'],
     ['2001', 'Abu Dhabi', '200', '657'],
     ['2001', 'Dubai', '100-', '40'],
     ['2001', 'Dubai', '100', '30'],
     ['2001', 'Dubai', '200', '51'],
     ['2002', 'Abu Dhabi', '100-', '300'],
     ['2002', 'Abu Dhabi', '100', '220'],
     ['2002', 'Abu Dhabi', '200', '56'],
     ['2002', 'Dubai', '100-', '55'],
     ['2002', 'Dubai', '100', '67'],
     ['2002', 'Dubai', '200', '89']],
    columns=['Year', 'Emirate', 'Capacity', 'Number'])

# second_df now carries Capacity as a third key column
second_df = pd.DataFrame(
    [['2001', 'Abu Dhabi', '100-', 'Performed', '45'],
     ['2001', 'Abu Dhabi', '100', 'Not Performed', '76'],
     ['2001', 'Abu Dhabi', '', '', ''],
     ['2001', 'Dubai', '100-', 'Performed', '90'],
     ['2001', 'Dubai', '100', 'Not Performed', '50'],
     ['2001', 'Dubai', '', '', ''],
     ['2002', 'Abu Dhabi', '100-', 'Performed', '78'],
     ['2002', 'Abu Dhabi', '100', 'Not Performed', '45'],
     ['2002', 'Abu Dhabi', '', '', ''],
     ['2002', 'Dubai', '100-', 'Performed', '76'],
     ['2002', 'Dubai', '100', 'Not Performed', '58'],
     ['2002', 'Dubai', '', '', '']],
    columns=['Year', 'Emirate', 'Capacity', 'Type', 'Value'])

# Perform a left merge on all three key columns to get the correct output
merged = first_df.merge(second_df, how='left', on=['Year', 'Emirate', 'Capacity'])
print(merged)
</code></pre>
<p><strong>Output</strong></p>
<pre><code>    Year    Emirate Capacity Number           Type Value
0   2001  Abu Dhabi     100-    462      Performed    45
1   2001  Abu Dhabi      100     44  Not Performed    76
2   2001  Abu Dhabi      200    657            NaN   NaN
3   2001      Dubai     100-     40      Performed    90
4   2001      Dubai      100     30  Not Performed    50
5   2001      Dubai      200     51            NaN   NaN
6   2002  Abu Dhabi     100-    300      Performed    78
7   2002  Abu Dhabi      100    220  Not Performed    45
8   2002  Abu Dhabi      200     56            NaN   NaN
9   2002      Dubai     100-     55      Performed    76
10  2002      Dubai      100     67  Not Performed    58
11  2002      Dubai      200     89            NaN   NaN
</code></pre>
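<p>If you want to confirm which rows of <code>first_df</code> found no partner in <code>second_df</code> (the <code>NaN</code> rows above), one option is the <code>indicator</code> parameter of <code>merge</code>. The following is only a minimal sketch on a reduced copy of the data; the names <code>checked</code> and <code>unmatched</code> are my own and not part of the question.</p>

```python
import pandas as pd

# Reduced sample (Dubai, 2001 only), same layout as the frames above
first_df = pd.DataFrame(
    [['2001', 'Dubai', '100-', '40'],
     ['2001', 'Dubai', '100', '30'],
     ['2001', 'Dubai', '200', '51']],
    columns=['Year', 'Emirate', 'Capacity', 'Number'])
second_df = pd.DataFrame(
    [['2001', 'Dubai', '100-', 'Performed', '90'],
     ['2001', 'Dubai', '100', 'Not Performed', '50'],
     ['2001', 'Dubai', '', '', '']],
    columns=['Year', 'Emirate', 'Capacity', 'Type', 'Value'])

# indicator=True adds a `_merge` column: 'both' or 'left_only' for a left merge
checked = first_df.merge(second_df, how='left',
                         on=['Year', 'Emirate', 'Capacity'], indicator=True)

# Rows that exist only in first_df, i.e. where Type and Value end up as NaN
unmatched = checked[checked['_merge'] == 'left_only']
print(unmatched[['Year', 'Emirate', 'Capacity']])
```

<p>In this sample only the <code>Capacity</code> value <code>200</code> has no match, because the corresponding row in <code>second_df</code> has an empty string as its key.</p>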