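<p><strong>New answer, edited to include runnable code</strong></p>
<p>The code below works even when the two DataFrames have different numbers of rows. It first finds the rows the two DataFrames have in common, then picks the right value for the required column.</p>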
<pre class="lang-py prettyprint-override"><code>import numpy as np
import pandas as pd
## create dummy data so the code is runnable
## -
n_rows = 20
sub_categories = np.random.choice(4, size=n_rows)
dic1 = {
"a": list(range(n_rows)),
"b": sub_categories,
"c": np.random.randn(n_rows)
}
dic2 = {
"a": range(n_rows),
"b": sub_categories,
"c": np.random.randn(n_rows)
}
df1 = pd.DataFrame(dic1)
df1.drop(index=list(np.random.choice(n_rows, 5, replace=False)), inplace=True)
df2 = pd.DataFrame(dic2)
df2.drop(index=list(np.random.choice(n_rows, 3, replace=False)), inplace=True)
## Main Answer
## -
## merge df1 and df2 on the shared keys, then build column "c" from whichever of
## c_1 / c_2 has the smaller absolute value (ties take c_2)
result = df1.merge(df2, how="inner", on=["a", "b"], suffixes=("_1", "_2"))
result["c"] = result["c_1"].where(np.abs(result["c_1"]) < np.abs(result["c_2"]),
                                  result["c_2"])
print(result)  # in a Jupyter notebook, display(result) also works
## finally, reindex to drop the helper columns c_1 and c_2
result = result.reindex(columns=["a", "b", "c"])
print(result)
</code></pre>
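<p>Because the dummy data above is random, the output changes on every run. As a small deterministic check, the same merge-then-<code>where</code> steps can be sketched on two hand-written frames. The names <code>left</code>, <code>right</code>, <code>keep_left</code> and <code>small</code>, and all the values, are made up for illustration.</p>

```python
import numpy as np
import pandas as pd

# Hypothetical toy frames; row a=0 exists only in `left`, row a=4 only in `right`.
left = pd.DataFrame({"a": [0, 1, 2, 3], "b": [0, 1, 0, 1], "c": [0.5, -2.0, 1.5, -0.1]})
right = pd.DataFrame({"a": [1, 2, 3, 4], "b": [1, 0, 1, 0], "c": [1.0, -0.3, 0.2, 9.0]})

# The inner merge keeps only the (a, b) pairs present in both frames.
merged = left.merge(right, how="inner", on=["a", "b"], suffixes=("_1", "_2"))

# Keep c_1 where its absolute value is strictly smaller, otherwise fall back to c_2.
keep_left = merged["c_1"].abs() < merged["c_2"].abs()
merged["c"] = merged["c_1"].where(keep_left, merged["c_2"])

small = merged[["a", "b", "c"]]
print(small)
```

<p>Only the rows with <code>a</code> equal to 1, 2 and 3 survive the merge, and each keeps the candidate value that is closer to zero.</p>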
<p><strong>Old answer</strong></p>
<p>You can do it like this:</p>
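<pre class="lang-py prettyprint-override"><code>import numpy as np

# df1 and df2 must have identical indexes, otherwise the comparison raises a ValueError
series = df1["return"].where(np.abs(df1["return"]) < np.abs(df2["return"]), df2["return"], axis=0)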
series
</code></pre>
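<p>This returns a Series that takes the value from df1 wherever the absolute value of its <code>return</code> is smaller than that of the same row in df2, and takes the value from df2 otherwise.</p>
<p>You can then overwrite the column in df1 or df2, or in copies of them, to get the DataFrame you want:</p>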
<pre><code>df1["return"] = series
</code></pre>
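<p>A minimal sketch of this older approach on two frames that share the same default index, which is the case it relies on. The frame names <code>first</code> and <code>second</code> and their values are hypothetical.</p>

```python
import numpy as np
import pandas as pd

# Hypothetical frames with identical indexes (0, 1, 2).
first = pd.DataFrame({"return": [0.5, -2.0, 1.5]})
second = pd.DataFrame({"return": [-1.0, 0.3, -1.5]})

# Row by row, keep the value from `first` when it is strictly closer to zero;
# otherwise (including ties) take the value from `second`.
closest = first["return"].where(
    np.abs(first["return"]) < np.abs(second["return"]), second["return"]
)

# Work on a copy so the original frame stays untouched.
updated = first.copy()
updated["return"] = closest
print(updated)
```

<p>If the two frames do not share the same index, the comparison fails, which is why the newer answer merges on the key columns first.</p>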