Pandas: subtracting columns from different DataFrames with mixed data types

def subs_PM(pm10, pm25):
    # True only for real numbers; bool is excluded because it subclasses int.
    # (`long` exists only in Python 2, so it is dropped here.)
    boolpm10 = isinstance(pm10, (int, float, complex)) and not isinstance(pm10, bool)
    boolpm25 = isinstance(pm25, (int, float, complex)) and not isinstance(pm25, bool)
    if boolpm10 and boolpm25:
        return pm10 - pm25
    else:
        return ''

import pandas as pd

df1 = pd.DataFrame({1: range(10)})
df2 = pd.DataFrame({1: [2, 3, '', '', 2, 1, '', 6, 2, 3]})
df1.combine(df2, func=subs_PM)
df1.combine(df2, func=subs_PM_old)
list(map(subs_PM, df1, df2))
list(map(subs_PM_old, df1, df2))

2 answers

User

#1 · edited 2024-05-19 10:29:52

Try this:

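import pandas as pd

def subs_PM(pm10, pm25):
    # combine() passes whole columns, so pm10 and pm25 are Series, not single numbers
    # print(pm10)
    try:
        pm10 = pd.to_numeric(pm10)
        pm25 = pd.to_numeric(pm25)
        return pm10 - pm25
    except (ValueError, TypeError):
        # Conversion failed for this pair of columns
        return None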

df1 = pd.DataFrame({1: range(10)})
df2 = pd.DataFrame({1: [2, 3, '', '', 2, 1, '', 6, 2, 3]})
df1.combine(df2, func=subs_PM)
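
A variant of the same idea, as a rough sketch rather than a definitive fix: passing errors="coerce" to pd.to_numeric turns anything non-numeric (including the empty strings) into NaN, so the two columns can be subtracted directly without a try/except. The helper name subtract_numeric is made up for this example.

```python
import pandas as pd

def subtract_numeric(left, right):
    """Subtract two Series, treating non-numeric entries as missing (NaN)."""
    left_num = pd.to_numeric(left, errors="coerce")
    right_num = pd.to_numeric(right, errors="coerce")
    return left_num - right_num

df1 = pd.DataFrame({1: range(10)})
df2 = pd.DataFrame({1: [2, 3, '', '', 2, 1, '', 6, 2, 3]})

# Rows where df2 holds an empty string come out as NaN in the result.
diff = subtract_numeric(df1[1], df2[1])
print(diff)
```

If the NaN rows are not wanted, diff.dropna() removes them.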

User

#2 · edited 2024-05-19 10:29:52

To cover all the variants, I defined the source DataFrames as follows:

df1 = pd.DataFrame({1: [0, '',  2,  3, 4, 5, '', 7, 8, 9]})
df2 = pd.DataFrame({1: [2,  3, '', '', 2, 1,  5, 6, 2, 3]})

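The goal is to keep only complete "pairs" of values, given that either df1 or df2 may contain a string in a given row (such rows are excluded from the final result).

The initial steps are:

Join the two DataFrames
Replace empty strings with NaN and drop those rows
Convert the type back to int
Give the two columns different names

The code to do this is: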

import numpy as np

res = df1.join(df2, rsuffix='_2').replace('', np.nan).dropna().astype(int)
res.columns = ['c1', 'c2']

For my source data, the result is:

   c1  c2
0   0   2
4   4   2
5   5   1
7   7   6
8   8   2
9   9   3

Then compute the difference and store it in a new column:

res['dif'] = res.c1 - res.c2

The final result is:

   c1  c2  dif
0   0   2   -2
4   4   2    2
5   5   1    4
7   7   6    1
8   8   2    6
9   9   3    6

If needed, drop the c1 and c2 columns.
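
Dropping the helper columns might look like the sketch below. So that it runs on its own, it rebuilds the final result shown above by hand instead of repeating the join; the index labels are the rows that survived dropna().

```python
import pandas as pd

# Rebuild the final result from above (surviving rows only).
res = pd.DataFrame(
    {"c1": [0, 4, 5, 7, 8, 9], "c2": [2, 2, 1, 6, 2, 3]},
    index=[0, 4, 5, 7, 8, 9],
)
res["dif"] = res.c1 - res.c2

# Keep only the difference column.
res = res.drop(columns=["c1", "c2"])
print(res)
```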
