DataFrame:
import pandas as pd

df = {'Client': ['A', 'A', 'A', 'B', 'B', 'B', 'B','B','B','B','C','D','D','D','D','D','D','D','D','D','D','D' ],
'Result': ['Covered', 'Customer Reject', 'Customer Timeout', 'Dealer Reject','Dealer Timeout','Done','Tied Covered','Tied Done','Tied Traded Away','Traded Away','No RFQ','Covered','Customer Reject','Customer Timeout','Dealer Reject','Dealer Timeout','Done','Tied Covered','Tied Done','Tied Traded Away','Traded Away','No RFQ']}
df = pd.DataFrame.from_dict(df)
print(df)
Client Result
0 A Covered
1 A Customer Reject
2 A Customer Timeout
3 B Dealer Reject
4 B Dealer Timeout
5 B Done
6 B Tied Covered
7 B Tied Done
8 B Tied Traded Away
9 B Traded Away
10 C No RFQ
11 D Covered
12 D Customer Reject
13 D Customer Timeout
14 D Dealer Reject
15 D Dealer Timeout
16 D Done
17 D Tied Covered
18 D Tied Done
19 D Tied Traded Away
20 D Traded Away
21 D No RFQ
Current output:
df = df.groupby(['Client','Result']).agg({'Result': 'size'})
print(df)
Result
Client Result
A Covered 1
Customer Reject 1
Customer Timeout 1
B Dealer Reject 1
Dealer Timeout 1
Done 1
Tied Covered 1
Tied Done 1
Tied Traded Away 1
Traded Away 1
C No RFQ 1
D Covered 1
Customer Reject 1
Customer Timeout 1
Dealer Reject 1
Dealer Timeout 1
Done 1
No RFQ 1
Tied Covered 1
Tied Done 1
Tied Traded Away 1
Traded Away 1
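Desired output:
a. Spacing between groups
b. A total for each group
c. A zero for every Result that does not occur for a given client
Note that the possible values of Result are listed below (11 strings). Any of them may or may not be present in the current month's dataset: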
Covered
Customer Reject
Customer Timeout
Dealer Reject
Dealer Timeout
Done
Tied Covered
Tied Done
Tied Traded Away
Traded Away
No RFQ
Client Result Count
A Covered 1
A Customer Reject 1
A Customer Timeout 1
A Dealer Reject 0
A Dealer Timeout 0
A Done 0
A Tied Covered 0
A Tied Done 0
A Tied Traded Away 0
A Traded Away 0
A No RFQ 0
Total 3
Client Result Count
B Covered 0
B Customer Reject 0
B Customer Timeout 0
B Dealer Reject 1
B Dealer Timeout 1
B Done 1
B Tied Covered 1
B Tied Done 1
B Tied Traded Away 1
B Traded Away 1
B No RFQ 0
Total 7
Client Result Count
C Covered 0
C Customer Reject 0
C Customer Timeout 0
C Dealer Reject 0
C Dealer Timeout 0
C Done 0
C Tied Covered 0
C Tied Done 0
C Tied Traded Away 0
C Traded Away 0
C No RFQ 1
Total 1
Client Result Count
D Covered 1
D Customer Reject 1
D Customer Timeout 1
D Dealer Reject 1
D Dealer Timeout 1
D Done 1
D Tied Covered 1
D Tied Done 1
D Tied Traded Away 1
D Traded Away 1
D No RFQ 1
Total 11
You can use pandas.get_dummies followed by .groupby to count every Result per client.
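The answer's original snippet did not survive on this page, so the following is only a minimal sketch of the get_dummies plus groupby idea, not the answerer's exact code. The names ALL_RESULTS, wide and counts are assumptions introduced for illustration, and the extra reindex step is added here so that a Result missing from the whole dataset would still get a zero column.

```python
import pandas as pd

# The 11 possible Result values listed in the question.
ALL_RESULTS = ['Covered', 'Customer Reject', 'Customer Timeout',
               'Dealer Reject', 'Dealer Timeout', 'Done', 'Tied Covered',
               'Tied Done', 'Tied Traded Away', 'Traded Away', 'No RFQ']

# Same sample data as the question: A has 3 rows, B has 7, C has 1, D has 11.
df = pd.DataFrame({
    'Client': ['A'] * 3 + ['B'] * 7 + ['C'] + ['D'] * 11,
    'Result': ALL_RESULTS * 2,
})

# One indicator column per Result, summed per client (wide layout).
wide = (
    pd.get_dummies(df['Result'])
    .groupby(df['Client'])
    .sum()
    .reindex(columns=ALL_RESULTS, fill_value=0)  # guarantee all 11 columns
)

# Back to the long Client / Result / Count layout the question asks for.
counts = (
    wide.stack()
    .rename('Count')
    .rename_axis(['Client', 'Result'])
    .reset_index()
)
print(counts)
```

Combinations that never occur, such as client A with Done, come out as 0 because their indicator column sums to zero for that client.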
The data can then be printed group by group.
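A possible way to print one block per client, with a total line and a blank line between groups, is sketched below. The helper format_report is hypothetical (it is not from the original answer), and the exact column alignment depends on DataFrame.to_string.

```python
import pandas as pd

ALL_RESULTS = ['Covered', 'Customer Reject', 'Customer Timeout',
               'Dealer Reject', 'Dealer Timeout', 'Done', 'Tied Covered',
               'Tied Done', 'Tied Traded Away', 'Traded Away', 'No RFQ']

df = pd.DataFrame({
    'Client': ['A'] * 3 + ['B'] * 7 + ['C'] + ['D'] * 11,
    'Result': ALL_RESULTS * 2,
})

# Wide table of counts: one row per client, one column per Result.
wide = (
    pd.get_dummies(df['Result'])
    .groupby(df['Client'])
    .sum()
    .reindex(columns=ALL_RESULTS, fill_value=0)
)


def format_report(wide_counts):
    """Hypothetical helper: render one text block per client plus a total."""
    blocks = []
    for client, row in wide_counts.iterrows():
        block = pd.DataFrame({
            'Client': client,          # scalar, broadcast to every row
            'Result': row.index,
            'Count': row.to_numpy(),
        })
        text = block.to_string(index=False)
        blocks.append(f"{text}\nTotal {int(row.sum())}")
    # The blank line between groups exists only in the printed text.
    return "\n\n".join(blocks)


report = format_report(wide)
print(report)
```

Keeping the spacing and totals in the formatting step leaves the underlying DataFrame free of artificial rows.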
To create the subtotals, I use the same approach as in "Transform pandas groupby result with subtotals to relative values".
There may be another way to create index entries for combinations that are absent from the original data, but pivot_table can do it.
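Since the original code is missing here, the following is a rough sketch of how pivot_table with fill_value=0 can supply the missing combinations, with subtotals appended by concatenation. The helper column Count, the label 'Total' in the Result column, and the variable names are assumptions made for this illustration.

```python
import pandas as pd

ALL_RESULTS = ['Covered', 'Customer Reject', 'Customer Timeout',
               'Dealer Reject', 'Dealer Timeout', 'Done', 'Tied Covered',
               'Tied Done', 'Tied Traded Away', 'Traded Away', 'No RFQ']

df = pd.DataFrame({
    'Client': ['A'] * 3 + ['B'] * 7 + ['C'] + ['D'] * 11,
    'Result': ALL_RESULTS * 2,
})

# pivot_table fills every Client/Result cell; absent pairs become 0.
wide = (
    df.assign(Count=1)  # helper column so there is something to sum
    .pivot_table(index='Client', columns='Result', values='Count',
                 aggfunc='sum', fill_value=0)
    .reindex(columns=ALL_RESULTS, fill_value=0)
)

# Long layout: one row per Client/Result pair.
long = (
    wide.stack()
    .rename('Count')
    .rename_axis(['Client', 'Result'])
    .reset_index()
)

# One subtotal row per client, labelled 'Total' in the Result column.
totals = (
    long.groupby('Client', as_index=False)['Count'].sum()
    .assign(Result='Total')
)

# A stable sort on Client keeps each subtotal after its client's rows.
report = (
    pd.concat([long, totals], ignore_index=True)
    .sort_values('Client', kind='mergesort')
    .reset_index(drop=True)
)
print(report)
```

The stable sort matters: sorting on Result as well would place 'Total' alphabetically between 'Tied Traded Away' and 'Traded Away'.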
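I add the blank lines while printing. Adding empty records to the DataFrame does not seem like a good idea, because they would be lost as soon as the data is re-sorted.
Similarly, relabelling the Client value of the subtotal rows as "Total" would cause problems: once the DataFrame is re-sorted, you can no longer tell which group each total belongs to.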
Slightly modified code, this time without pivot_table:
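The modified code itself is not preserved on this page. One way to get the same zero-filled counts without pivot_table, offered only as an assumption about what such a variant might look like, is groupby followed by unstack with fill_value=0:

```python
import pandas as pd

ALL_RESULTS = ['Covered', 'Customer Reject', 'Customer Timeout',
               'Dealer Reject', 'Dealer Timeout', 'Done', 'Tied Covered',
               'Tied Done', 'Tied Traded Away', 'Traded Away', 'No RFQ']

df = pd.DataFrame({
    'Client': ['A'] * 3 + ['B'] * 7 + ['C'] + ['D'] * 11,
    'Result': ALL_RESULTS * 2,
})

counts = (
    df.groupby(['Client', 'Result']).size()
    .unstack(fill_value=0)                       # absent pairs become 0
    .reindex(columns=ALL_RESULTS, fill_value=0)  # cover unseen Result values
    .stack()
    .rename('Count')
    .rename_axis(['Client', 'Result'])
)

# Per-client totals, computed from the first index level (Client).
totals = counts.groupby(level=0).sum()
print(counts)
print(totals)
```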
Use .reindex together with pd.MultiIndex.from_product to build every Client and Result combination.
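The setup and code of this answer were lost in extraction; a minimal sketch of the technique it names, assuming the list of 11 possible values is available as ALL_RESULTS (a name introduced here), could look like this:

```python
import pandas as pd

# Setup: the possible Result values and the sample data from the question.
ALL_RESULTS = ['Covered', 'Customer Reject', 'Customer Timeout',
               'Dealer Reject', 'Dealer Timeout', 'Done', 'Tied Covered',
               'Tied Done', 'Tied Traded Away', 'Traded Away', 'No RFQ']

df = pd.DataFrame({
    'Client': ['A'] * 3 + ['B'] * 7 + ['C'] + ['D'] * 11,
    'Result': ALL_RESULTS * 2,
})

# Every (Client, Result) pair, whether or not it occurs in the data.
full_index = pd.MultiIndex.from_product(
    [df['Client'].unique(), ALL_RESULTS],
    names=['Client', 'Result'],
)

# Count the observed pairs, then reindex so missing pairs get 0.
counts = (
    df.groupby(['Client', 'Result']).size()
    .reindex(full_index, fill_value=0)
    .rename('Count')
)
print(counts)
```

Because the index is built from ALL_RESULTS rather than from the data, a Result that is absent from the whole month still appears with a zero for every client.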
To display the result, print it one client at a time.
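The display code is also missing from this page. As a sketch under the same assumptions as before (ALL_RESULTS and the variable names are illustrative, not the answerer's), the reindexed counts could be shown per client like this:

```python
import pandas as pd

ALL_RESULTS = ['Covered', 'Customer Reject', 'Customer Timeout',
               'Dealer Reject', 'Dealer Timeout', 'Done', 'Tied Covered',
               'Tied Done', 'Tied Traded Away', 'Traded Away', 'No RFQ']

df = pd.DataFrame({
    'Client': ['A'] * 3 + ['B'] * 7 + ['C'] + ['D'] * 11,
    'Result': ALL_RESULTS * 2,
})

full_index = pd.MultiIndex.from_product(
    [sorted(df['Client'].unique()), ALL_RESULTS],
    names=['Client', 'Result'],
)
counts = (
    df.groupby(['Client', 'Result']).size()
    .reindex(full_index, fill_value=0)
    .rename('Count')
)

lines = []
for client, group in counts.groupby(level='Client', sort=False):
    # Each client's rows as a small table with its own header.
    lines.append(group.reset_index().to_string(index=False))
    lines.append(f"Total {int(group.sum())}")
    lines.append("")  # blank line between groups
output = "\n".join(lines)
print(output)
```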