Python pandas: concatenating DataFrames on both axes (merging in all directions)

import pandas as pd

df1 = pd.DataFrame({'GMT_date': ["01-01-2000", "02-01-2000", "03-01-2000", "04-01-2000", "05-01-2000"],
                    'LOWI_Precipitation_24_Hour_(mm)': [2.2, 0, 0.2, 0.4, 0]})

df2 = pd.DataFrame({'GMT_date': ["06-01-2000", "07-01-2000", "08-01-2000", "09-01-2000", "10-01-2000"],
                    'LOWI_Precipitation_24_Hour_(mm)': [0.1, 0.1, 3.1, 4.0, 0]})

df3 = pd.DataFrame({'GMT_date': ["06-01-2000", "07-01-2000", "08-01-2000", "09-01-2000", "10-01-2000"],
                    'TGPY_Precipitation_24_Hour_(mm)': [0.6, 0.7, 5.1, 1.0, 2.0]})

df_list = [df1, df2, df3]
merged_df = pd.DataFrame()

for df in df_list:
    merged_df = merged_df.merge(df, how='outer', left_index=True, right_index=True)

print(merged_df)

   GMT_date_x  LOWI_Precipitation_24_Hour_(mm)_x  GMT_date_y  LOWI_Precipitation_24_Hour_(mm)_y    GMT_date  TGPY_Precipitation_24_Hour_(mm)
0  01-01-2000                                2.2  06-01-2000                                0.1  06-01-2000                              0.6
1  02-01-2000                                0.0  07-01-2000                                0.1  07-01-2000                              0.7
2  03-01-2000                                0.2  08-01-2000                                3.1  08-01-2000                              5.1
3  04-01-2000                                0.4  09-01-2000                                4.0  09-01-2000                              1.0
4  05-01-2000                                0.0  10-01-2000                                0.0  10-01-2000                              2.0

     GMT_date  LOWI_Precipitation_24_Hour_(mm)  TGPY_Precipitation_24_Hour_(mm)
0  01-01-2000                              2.2                              NaN
1  02-01-2000                              0.0                              NaN
2  03-01-2000                              0.2                              NaN
3  04-01-2000                              0.4                              NaN
4  05-01-2000                              0.0                              NaN
0  06-01-2000                              0.1                              NaN
1  07-01-2000                              0.1                              NaN
2  08-01-2000                              3.1                              NaN
3  09-01-2000                              4.0                              NaN
4  10-01-2000                              0.0                              NaN
0  06-01-2000                              NaN                              0.6
1  07-01-2000                              NaN                              0.7
2  08-01-2000                              NaN                              5.1
3  09-01-2000                              NaN                              1.0
4  10-01-2000                              NaN                              2.0

for df in df_list:
    df = df.set_index('GMT_date')
    merged_df = pd.concat([merged_df, df], axis=1)

print(merged_df)

            LOWI_Precipitation_24_Hour_(mm)  LOWI_Precipitation_24_Hour_(mm)  TGPY_Precipitation_24_Hour_(mm)
01-01-2000                              2.2                              NaN                              NaN
02-01-2000                              0.0                              NaN                              NaN
03-01-2000                              0.2                              NaN                              NaN
04-01-2000                              0.4                              NaN                              NaN
05-01-2000                              0.0                              NaN                              NaN
06-01-2000                              NaN                              0.1                              0.6
07-01-2000                              NaN                              0.1                              0.7
08-01-2000                              NaN                              3.1                              5.1
09-01-2000                              NaN                              4.0                              1.0
10-01-2000                              NaN                              0.0                              2.0

2 answers

User

Answer 1 · edited 2024-10-01 05:00:17

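First concat all the DataFrames that contain LOWI_Precipitation_24_Hour_(mm), likewise concat all the DataFrames that contain TGPY_Precipitation_24_Hour_(mm), and then use merge to join the results.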

import pandas as pd

df1 = pd.DataFrame({'GMT_date': ["01-01-2000", "02-01-2000", "03-01-2000", "04-01-2000", "05-01-2000"],
                    'LOWI_Precipitation_24_Hour_(mm)': [2.2, 0, 0.2, 0.4, 0]})

df2 = pd.DataFrame({'GMT_date': ["06-01-2000", "07-01-2000", "08-01-2000", "09-01-2000", "10-01-2000"],
                    'LOWI_Precipitation_24_Hour_(mm)': [0.1, 0.1, 3.1, 4.0, 0]})

df3 = pd.DataFrame({'GMT_date': ["06-01-2000", "07-01-2000", "08-01-2000", "09-01-2000", "10-01-2000"],
                    'TGPY_Precipitation_24_Hour_(mm)': [0.6, 0.7, 5.1, 1.0, 2.0]})

# Stack the two LOWI frames row-wise, then attach the TGPY column by date.
new_df = pd.concat([df1, df2])
print(new_df)
merge_df = pd.merge(new_df, df3, on='GMT_date', how='left')
print(merge_df)

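Output of print(merge_df):

     GMT_date  LOWI_Precipitation_24_Hour_(mm)  TGPY_Precipitation_24_Hour_(mm)
0  01-01-2000                              2.2                              NaN
1  02-01-2000                              0.0                              NaN
2  03-01-2000                              0.2                              NaN
3  04-01-2000                              0.4                              NaN
4  05-01-2000                              0.0                              NaN
5  06-01-2000                              0.1                              0.6
6  07-01-2000                              0.1                              0.7
7  08-01-2000                              3.1                              5.1
8  09-01-2000                              4.0                              1.0
9  10-01-2000                              0.0                              2.0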
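One thing worth checking with this approach is the choice of how='left': it keeps only the dates that appear in the stacked LOWI frame, so a date that exists only in the TGPY frame would be dropped. The sketch below uses two tiny made-up frames (the 11-01-2000 row and its 9.9 value are invented purely for illustration) and the indicator=True option of pd.merge to show where each row came from.

```python
import pandas as pd

# Tiny illustrative frames; the values are made up, not taken from the question.
left = pd.DataFrame({"GMT_date": ["05-01-2000", "06-01-2000"],
                     "LOWI_Precipitation_24_Hour_(mm)": [0.0, 0.1]})
right = pd.DataFrame({"GMT_date": ["06-01-2000", "11-01-2000"],
                      "TGPY_Precipitation_24_Hour_(mm)": [0.6, 9.9]})

# how='left' keeps only the dates found in `left`; 11-01-2000 is dropped.
left_join = pd.merge(left, right, on="GMT_date", how="left")

# how='outer' keeps every date from both sides; the `_merge` column added by
# indicator=True records whether a row was found in left, right, or both.
outer_join = pd.merge(left, right, on="GMT_date", how="outer", indicator=True)

print(left_join)
print(outer_join)
```

If some dates may be present for one column but not the other, how='outer' is probably the safer default.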

For example, suppose you have DataFrames like these:

df1 = pd.DataFrame({'GMT_date': ["01-01-2000", "02-01-2000", "03-01-2000", "04-01-2000", "05-01-2000"],
                    'LOWI_Precipitation_24_Hour_(mm)': [2.2, 0, 0.2, 0.4, 0]})

df2 = pd.DataFrame({'GMT_date': ["06-01-2000", "07-01-2000", "08-01-2000", "09-01-2000", "10-01-2000"],
                    'LOWI_Precipitation_24_Hour_(mm)': [0.1, 0.1, 3.1, 4.0, 0]})

df3 = pd.DataFrame({'GMT_date': ["06-01-2000", "07-01-2000", "08-01-2000", "09-01-2000", "10-01-2000"],
                    'TGPY_Precipitation_24_Hour_(mm)': [0.6, 0.7, 5.1, 1.0, 2.0]})

df4 = pd.DataFrame({'GMT_date': ["01-01-2000", "02-01-2000", "03-01-2000", "04-01-2000", "05-01-2000"],
                    'TGPY_Precipitation_24_Hour_(mm)': [0.6, 0.7, 5.1, 1.0, 2.0]})

# Stack the frames for each measurement column row-wise, then merge the two stacks on the date.
new_df = pd.concat([df1, df2])
new_df2 = pd.concat([df3, df4])

print(new_df)
merge_df = pd.merge(new_df, new_df2, on='GMT_date', how='left')
print(merge_df)

Output:

     GMT_date  LOWI_Precipitation_24_Hour_(mm)  TGPY_Precipitation_24_Hour_(mm)
0  01-01-2000                              2.2                              0.6
1  02-01-2000                              0.0                              0.7
2  03-01-2000                              0.2                              5.1
3  04-01-2000                              0.4                              1.0
4  05-01-2000                              0.0                              2.0
5  06-01-2000                              0.1                              0.6
6  07-01-2000                              0.1                              0.7
7  08-01-2000                              3.1                              5.1
8  09-01-2000                              4.0                              1.0
9  10-01-2000                              0.0                              2.0
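The two-step pattern above (stack the frames that share a value column, then merge the stacks on the date) could be generalized so the frames do not have to be sorted into groups by hand. This is only a minimal sketch: it assumes every frame has a GMT_date key column, and the helper name stack_then_merge is hypothetical, not part of pandas.

```python
from functools import reduce

import pandas as pd


def stack_then_merge(frames, key="GMT_date"):
    """Hypothetical helper: stack frames with identical value columns,
    then outer-merge the stacked groups on `key`."""
    buckets = {}
    for df in frames:
        # Frames with the same non-key columns land in the same bucket.
        value_cols = tuple(c for c in df.columns if c != key)
        buckets.setdefault(value_cols, []).append(df)

    # Row-wise concat within each bucket (axis=0).
    stacked = [pd.concat(group, ignore_index=True) for group in buckets.values()]

    # Column-wise join across buckets; 'outer' keeps every date.
    return reduce(lambda l, r: pd.merge(l, r, on=key, how="outer"), stacked)


df1 = pd.DataFrame({'GMT_date': ["01-01-2000", "02-01-2000", "03-01-2000", "04-01-2000", "05-01-2000"],
                    'LOWI_Precipitation_24_Hour_(mm)': [2.2, 0, 0.2, 0.4, 0]})
df2 = pd.DataFrame({'GMT_date': ["06-01-2000", "07-01-2000", "08-01-2000", "09-01-2000", "10-01-2000"],
                    'LOWI_Precipitation_24_Hour_(mm)': [0.1, 0.1, 3.1, 4.0, 0]})
df3 = pd.DataFrame({'GMT_date': ["06-01-2000", "07-01-2000", "08-01-2000", "09-01-2000", "10-01-2000"],
                    'TGPY_Precipitation_24_Hour_(mm)': [0.6, 0.7, 5.1, 1.0, 2.0]})

result = stack_then_merge([df1, df2, df3])
print(result)
```

The sketch uses how='outer' rather than how='left' so that the result does not depend on which group happens to come first in the list.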

User

Answer 2 · edited 2024-10-01 05:00:17

I have another solution: after concatenating all the DataFrames, use groupby to merge the duplicate columns.

import pandas as pd
import numpy as np
df1 = pd.DataFrame({'GMT_date': ["01-01-2000", "02-01-2000", "03-01-2000", "04-01-2000", "05-01-2000"],
                    'LOWI_Precipitation_24_Hour_(mm)': [2.2, 0, 0.2, 0.4, 0]})

df2 = pd.DataFrame({'GMT_date': ["06-01-2000", "07-01-2000", "08-01-2000", "09-01-2000", "10-01-2000"],
                    'LOWI_Precipitation_24_Hour_(mm)': [0.1, 0.1, 3.1, 4.0, 0]})

df3 = pd.DataFrame({'GMT_date': ["06-01-2000", "07-01-2000", "08-01-2000", "09-01-2000", "10-01-2000"],
                    'TGPY_Precipitation_24_Hour_(mm)': [0.6, 0.7, 5.1, 1.0, 2.0]})

df_list = [df1, df2, df3]
merged_df = pd.DataFrame()

for df in df_list:
    df = df.set_index('GMT_date')

    merged_df = pd.concat([merged_df, df], axis=1)

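# Collapse columns that share a label, keeping the max of each row (NaN is skipped).
# groupby(axis=1) is deprecated in recent pandas, so group on the transpose instead.
merged_df = merged_df.T.groupby(level=0).max().T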

print(merged_df)

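Output:

            LOWI_Precipitation_24_Hour_(mm)  TGPY_Precipitation_24_Hour_(mm)
01-01-2000                              2.2                              NaN
02-01-2000                              0.0                              NaN
03-01-2000                              0.2                              NaN
04-01-2000                              0.4                              NaN
05-01-2000                              0.0                              NaN
06-01-2000                              0.1                              0.6
07-01-2000                              0.1                              0.7
08-01-2000                              3.1                              5.1
09-01-2000                              4.0                              1.0
10-01-2000                              0.0                              2.0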

You can also use:

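# Note: the expression `np.max if np.nan else sum` always evaluates to np.max, because bool(np.nan) is True.
# To take the max when a group of duplicate columns contains NaN, and the sum otherwise, test each group explicitly:
merged_df = merged_df.T.groupby(level=0).agg(lambda s: s.max() if s.isna().any() else s.sum()).T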
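As an alternative to collapsing duplicate columns with groupby, the same result for this data can be sketched with DataFrame.combine_first, which fills the NaN cells of one frame with values from another while taking the union of rows and columns. This is a different technique from the one in the answer above, shown only as a sketch; it assumes GMT_date is unique within each frame.

```python
from functools import reduce

import pandas as pd

df1 = pd.DataFrame({'GMT_date': ["01-01-2000", "02-01-2000", "03-01-2000", "04-01-2000", "05-01-2000"],
                    'LOWI_Precipitation_24_Hour_(mm)': [2.2, 0, 0.2, 0.4, 0]})
df2 = pd.DataFrame({'GMT_date': ["06-01-2000", "07-01-2000", "08-01-2000", "09-01-2000", "10-01-2000"],
                    'LOWI_Precipitation_24_Hour_(mm)': [0.1, 0.1, 3.1, 4.0, 0]})
df3 = pd.DataFrame({'GMT_date': ["06-01-2000", "07-01-2000", "08-01-2000", "09-01-2000", "10-01-2000"],
                    'TGPY_Precipitation_24_Hour_(mm)': [0.6, 0.7, 5.1, 1.0, 2.0]})

# Index every frame by date so rows are aligned on GMT_date.
indexed = [df.set_index('GMT_date') for df in (df1, df2, df3)]

# combine_first unions the rows and columns of both frames and fills
# NaN cells in the accumulated frame with values from the next one.
combined = reduce(lambda acc, nxt: acc.combine_first(nxt), indexed)

# Bring the date back as an ordinary column.
combined = combined.rename_axis('GMT_date').reset_index()
print(combined)
```

One difference to keep in mind: when two frames both hold a value for the same date and column, combine_first keeps the value from the earlier frame, whereas the groupby version keeps the maximum.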
