What causes a wrong count from groupby and transform.count() in Python

Posted 2024-09-30 04:39:47


I am grouping my DataFrame and counting within the groups.

This is the result I get from the .describe() method:

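The count is 5, while all the other measures are 4. In fact there are only 4 barcodes in this group, so the count should be 4. How can the count be 5?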

invoice_number        barcode
OFF1540673            4054673005837  count                                   5.0
                                     mean                                    4.0
                                     std                                     0.0
                                     min                                     4.0
                                     25%                                     4.0
                                     50%                                     4.0
                                     75%                                     4.0
                                     max                                     4.0
                      4054673034394  count                                   5.0
                                     mean                                    4.0
                                     std                                     0.0
                                     min                                     4.0
                                     25%                                     4.0
                                     50%                                     4.0
                                     75%                                     4.0
                                     max                                     4.0
                      4054673238488  count                                   5.0
                                     mean                                    4.0
                                     std                                     0.0
                                     min                                     4.0
                                     25%                                     4.0
                                     50%                                     4.0
                                     75%                                     4.0
                                     max                                     4.0
                      4054673238822  count                                   5.0
                                     mean                                    4.0
                                     std                                     0.0
                                     min                                     4.0
                                     25%                                     4.0
                                     50%                                     4.0
                                     75%                                     4.0
                                     max                                     4.0

Update

Original dataset

              invoice_number  barcode
327378            OFF1540673  4054673238488
327379            OFF1540673  4054673034394
327380            OFF1540673  4054673238822
327381            OFF1540673  4054673005837
327382            OFF1540673  4054673238488
327383            OFF1540673  4054673034394
327384            OFF1540673  4054673238822
327385            OFF1540673  4054673005837
327386            OFF1540673  4054673238488
327387            OFF1540673  4054673034394
327388            OFF1540673  4054673238822
327389            OFF1540673  4054673005837
327390            OFF1540673  4054673238488
327391            OFF1540673  4054673034394
327392            OFF1540673  4054673238822
327393            OFF1540673  4054673005837
327394            OFF1540673  4054673238488
327395            OFF1540673  4054673034394
327396            OFF1540673  4054673238822
327397            OFF1540673  4054673005837

Both columns have dtype "object".

This is the groupby command:

print(data.groupby(['invoice_number','barcode'])['invoice_number'].describe())


1 answer
Answer 1 · posted 2024-09-30 04:39:47

Regarding "I want to know the distinct number of barcodes per order number":

In [30]: df.groupby('invoice_number')['barcode'].nunique()
Out[30]:
invoice_number
OFF1540673    4
Name: barcode, dtype: int64
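
The call above can be tried without the original file. The following is a minimal sketch that assumes the 20 rows shown in the question are the whole dataset and rebuilds them by hand; the names barcodes and distinct_per_invoice are illustrative and do not come from the original post.

```python
import pandas as pd

# Assumption: the question's sample is four barcodes repeated five times
# under a single invoice number, with both columns stored as strings (object).
barcodes = ["4054673238488", "4054673034394", "4054673238822", "4054673005837"]
df = pd.DataFrame(
    {
        "invoice_number": ["OFF1540673"] * 20,
        "barcode": barcodes * 5,
    },
    index=range(327378, 327398),  # same row labels as in the question
)

# Number of distinct barcodes per invoice.
distinct_per_invoice = df.groupby("invoice_number")["barcode"].nunique()
print(distinct_per_invoice)
```

Grouping by invoice_number alone is what makes nunique() answer the "distinct barcodes per order" question; adding barcode to the grouping keys would give one group per barcode instead.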

Update: I cannot reproduce your problem with the dataset you provided:

In [16]: df.groupby(['invoice_number','barcode'])['invoice_number'].describe()
Out[16]:
invoice_number  barcode
OFF1540673      4054673005837  count              5
                               unique             1
                               top       OFF1540673
                               freq               5
                4054673034394  count              5
                               unique             1
                               top       OFF1540673
                               freq               5
                4054673238488  count              5
                               unique             1
                               top       OFF1540673
                               freq               5
                4054673238822  count              5
                               unique             1
                               top       OFF1540673
                               freq               5
Name: invoice_number, dtype: object

In [17]: df.groupby(['invoice_number','barcode'])['invoice_number'].count()
Out[17]:
invoice_number  barcode
OFF1540673      4054673005837    5
                4054673034394    5
                4054673238488    5
                4054673238822    5
Name: invoice_number, dtype: int64
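
One reading of these outputs is that a count of 5 is simply the number of rows in each (invoice_number, barcode) pair: 20 rows spread over 4 barcodes. The sketch below contrasts that row count with the distinct-barcode count, and also broadcasts both back to the rows with transform, since the question title mentions it. It assumes the 20-row sample is the whole dataset; the variable names are illustrative, and it is not a reconstruction of the asker's exact code.

```python
import pandas as pd

# Assumption: same hand-built sample as in the question (4 barcodes x 5 rows).
barcodes = ["4054673238488", "4054673034394", "4054673238822", "4054673005837"]
df = pd.DataFrame(
    {
        "invoice_number": ["OFF1540673"] * 20,
        "barcode": barcodes * 5,
    }
)

# Rows per (invoice_number, barcode) pair: one entry per pair.
pair_sizes = df.groupby(["invoice_number", "barcode"]).size()

# The same row count, repeated on every original row of the pair.
rows_per_pair = df.groupby(["invoice_number", "barcode"])["barcode"].transform("count")

# Distinct barcodes per invoice, repeated on every row of that invoice.
distinct_barcodes = df.groupby("invoice_number")["barcode"].transform("nunique")

print(pair_sizes)
print(rows_per_pair.unique(), distinct_barcodes.unique())
```

Under that assumption, count and nunique answer different questions: count reports how many rows fall in a group, while nunique reports how many different values appear, so 5 and 4 can both be correct for the same data.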
