I have a detailed Transaction data dataframe, as shown below:
import pandas as pd

df_col = pd.DataFrame({'SQ':[1,1,2],
'City':['A','A','B'],
'Date':['7-1-2020','7-2-2020','7-1-2020'],
'Loc 1':[40,21,27],
'Loc 2':[37,40,14],
'Loc 3':[49,38,36],
'Loc 4':[20,14,18],
'Loc 5':[48,27,36]})
+----+------+----------+-------+-------+-------+-------+-------+
| SQ | City | Date | Loc 1 | Loc 2 | Loc 3 | Loc 4 | Loc 5 |
+----+------+----------+-------+-------+-------+-------+-------+
| 1 | A | 7-1-2020 | 40 | 37 | 49 | 20 | 48 |
+----+------+----------+-------+-------+-------+-------+-------+
| 1 | A | 7-2-2020 | 21 | 40 | 38 | 14 | 27 |
+----+------+----------+-------+-------+-------+-------+-------+
| 2 | B | 7-1-2020 | 27 | 14 | 36 | 18 | 36 |
+----+------+----------+-------+-------+-------+-------+-------+
In addition, I have a separate Location data dataframe, as shown below:
+------------+--------------+
| LocationNo | LocationType |
+------------+--------------+
| Loc 1 | Class A |
+------------+--------------+
| Loc 2 | Class A |
+------------+--------------+
| Loc 3 | Class B |
+------------+--------------+
| Loc 4 | Class C |
+------------+--------------+
| Loc 5 | Class C |
+------------+--------------+
df_row = pd.DataFrame({'LocationNo':['Loc 1','Loc 2','Loc 3','Loc 4','Loc 5'],
'LocationType':['Class A', 'Class A', 'Class B', 'Class C', 'Class C']
})
Now my task is to merge the columns of df_col onto the rows of df_row and sum the values.
The merge goes from vertical to horizontal, i.e. columns to rows.
My desired output is as follows:
+----+------+----------+---------+---------+---------+
| SQ | City | Date | Class A | Class B | Class C |
+----+------+----------+---------+---------+---------+
| 1 | A | 7-1-2020 | 77 | 49 | 68 |
+----+------+----------+---------+---------+---------+
| 1 | A | 7-2-2020 | 61 | 38 | 41 |
+----+------+----------+---------+---------+---------+
| 2 | B | 7-1-2020 | 41 | 36 | 54 |
+----+------+----------+---------+---------+---------+
I wrote the following code:
# setting the index
df_col.set_index(['SQ','City','Date'], inplace=True)
df_row.set_index('LocationNo', inplace=True)
# I tried to merge vertically columns to columns. Hence, transpose the df_col.T
df_final = df_col.T.merge(df_row, left_index=True, right_index=True, how='left').groupby('LocationType').agg('sum').T
The above code produces the following output:
+------------------+-----------+-----------+-----------+
| LocationType | Class A | Class B | Class C |
+------------------+-----------+-----------+-----------+
| (1, A, 7-1-2020) | 77 | 49 | 68 |
+------------------+-----------+-----------+-----------+
| (1, A, 7-2-2020) | 61 | 38 | 41 |
+------------------+-----------+-----------+-----------+
| (2, B, 7-1-2020) | 41 | 36 | 54 |
+------------------+-----------+-----------+-----------+
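The values are correct. However, the first 3 columns are combined into a single column. I need to split them into separate columns, as in the desired output above.
How should I solve this? What is an efficient way to do it?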
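For the narrow problem of splitting the combined column, one possible fix is to rebuild a three-level index from the tuples and then reset it. The sketch below does not rerun the code above; it builds a stand-in for df_final by hand (values from the output table, dates taken from df_col), and the name df_split is only illustrative.

```python
import pandas as pd

# Stand-in for the result above: a plain Index whose entries are tuples.
# tupleize_cols=False stops pandas from building a MultiIndex automatically.
tuple_index = pd.Index(
    [(1, 'A', '7-1-2020'), (1, 'A', '7-2-2020'), (2, 'B', '7-1-2020')],
    tupleize_cols=False,
)
df_final = pd.DataFrame({'Class A': [77, 61, 41],
                         'Class B': [49, 38, 36],
                         'Class C': [68, 41, 54]}, index=tuple_index)

# Rebuild a named three-level MultiIndex from the tuples,
# then move the levels back into ordinary columns.
df_final.index = pd.MultiIndex.from_tuples(df_final.index,
                                           names=['SQ', 'City', 'Date'])
df_split = df_final.reset_index()
print(df_split)
```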
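Let us select the Loc-like columns from df_col, map these columns to LocationType based on LocationNo in df_row, and finally group these mapped columns along the column axis and aggregate the mapped dataframe with a sum.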
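A minimal sketch of that idea follows. The helper names loc_to_type, loc_values, summed and result are my own, and the sketch groups on a transposed frame rather than grouping along the column axis directly, because grouping with axis=1 is deprecated in recent pandas versions. The key columns are left as ordinary columns, so nothing needs to be split afterwards.

```python
import pandas as pd

df_col = pd.DataFrame({'SQ': [1, 1, 2],
                       'City': ['A', 'A', 'B'],
                       'Date': ['7-1-2020', '7-2-2020', '7-1-2020'],
                       'Loc 1': [40, 21, 27],
                       'Loc 2': [37, 40, 14],
                       'Loc 3': [49, 38, 36],
                       'Loc 4': [20, 14, 18],
                       'Loc 5': [48, 27, 36]})
df_row = pd.DataFrame({'LocationNo': ['Loc 1', 'Loc 2', 'Loc 3', 'Loc 4', 'Loc 5'],
                       'LocationType': ['Class A', 'Class A', 'Class B',
                                        'Class C', 'Class C']})

# Lookup from column name (LocationNo) to its class (LocationType).
loc_to_type = df_row.set_index('LocationNo')['LocationType']

# Keep only the location columns; SQ, City and Date do not contain 'Loc'.
loc_values = df_col.filter(like='Loc')

# Transpose so the locations become rows, group those rows by class,
# sum within each class, then transpose back to one column per class.
summed = loc_values.T.groupby(loc_to_type).sum().T

# Attach the class totals to the untouched key columns (aligned on row index).
result = df_col[['SQ', 'City', 'Date']].join(summed)
print(result)
```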
Another approach uses melt, merge, and groupby.
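The melt, merge and groupby route could look roughly like this. It is a sketch, not necessarily the answerer's exact code: the column name Value and the variable names long_df, merged and result are assumptions, and the final unstack step is my addition to get back to the wide layout of the desired output.

```python
import pandas as pd

df_col = pd.DataFrame({'SQ': [1, 1, 2],
                       'City': ['A', 'A', 'B'],
                       'Date': ['7-1-2020', '7-2-2020', '7-1-2020'],
                       'Loc 1': [40, 21, 27],
                       'Loc 2': [37, 40, 14],
                       'Loc 3': [49, 38, 36],
                       'Loc 4': [20, 14, 18],
                       'Loc 5': [48, 27, 36]})
df_row = pd.DataFrame({'LocationNo': ['Loc 1', 'Loc 2', 'Loc 3', 'Loc 4', 'Loc 5'],
                       'LocationType': ['Class A', 'Class A', 'Class B',
                                        'Class C', 'Class C']})

keys = ['SQ', 'City', 'Date']

# Wide to long: one row per (SQ, City, Date, LocationNo) with its value.
long_df = df_col.melt(id_vars=keys, var_name='LocationNo', value_name='Value')

# Look up the class of each location.
merged = long_df.merge(df_row, on='LocationNo', how='left')

# Sum per key and class, then spread the classes back out into columns.
result = (merged.groupby(keys + ['LocationType'])['Value'].sum()
                .unstack('LocationType')
                .reset_index())
result.columns.name = None  # drop the leftover 'LocationType' axis label
print(result)
```

Compared with grouping the columns directly, this does more reshaping, but every intermediate frame is an ordinary long table that is easy to inspect.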