Try/except block for creating summary counts

Posted 2024-10-03 02:33:28


I want to create a summary DataFrame that shows the number of tracked and untracked boxes. Simply:

                       School - Exams Tracked    School - Exams Not Tracked
All Box Tracked Sites                    5820                             2

We will use this report during drop-off, so at times no boxes will be tracked yet, and after a while all boxes will be tracked.

Right now my code can raise a KeyError (from .get_loc(key)), because it sometimes looks up 'TRACKED' when that value does not exist in the data yet.

This is the best workaround I have come up with, but I find it ugly:

BoxTrackingSummary_df = pd.DataFrame()
BoxTrackingSummary_df_columns = ['School - Exams Tracked', 'School - Exams Not Tracked']

summary_group = pd.DataFrame(BoxTrackingReport_df.groupby('Tracked At A Site?').agg('count')['All Box Tracked Sites'])

# group.loc can only count groups that exist. plan for when there are no 'TRACKED' or no 'NO's, or receive a .get_loc(key) error
try:
    BoxTrackingSummary_df['School - Exams Tracked'] = summary_group.loc['TRACKED']
except:
    BoxTrackingSummary_df['School - Exams Tracked'] = 0
    print('No Tracked yet.')

try:
    BoxTrackingSummary_df['School - Exams Not Tracked'] = summary_group.loc['NO']
except:
    BoxTrackingSummary_df['School - Exams Not Tracked'] = 0
    print('All Tracked.')

Here is the content of the report's 'Tracked At A Site?' column:

>>> BoxTrackingReport_df['Tracked At A Site?']
...
0       TRACKED
1       TRACKED
2       TRACKED
3       TRACKED
4       TRACKED

1 answer
User
Reply #1 · posted 2024-10-03 02:33:28

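There is no need for try/except, for initializing an empty DataFrame, or for assigning columns from a separate groupby DataFrame. Consider working directly from the 'Tracked At A Site?' column (i.e., a Series):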

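BoxTrackingSummary_df = (BoxTrackingReport_df['Tracked At A Site?']
                             .value_counts()                       # count each distinct value
                             .rename('All Box Tracked Sites')      # Series name becomes the row label
                             .to_frame()
                             .transpose()
                             .reindex(columns=['TRACKED', 'NO'])   # guarantee both columns exist
                             .fillna(0)                            # a missing category counts as 0
                             .set_axis(['School - Exams Tracked', 'School - Exams Not Tracked'],
                                        axis='columns')            # inplace= was removed in pandas 2.0
                        )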
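To see what the reindex step contributes, here is a minimal sketch of the same idea applied to a toy Series (the three-row data is made up for illustration). It shows that a label missing from the counts comes back as NaN after reindex, which is why fillna(0) follows it:

```python
import pandas as pd

# Toy data: only 'TRACKED' is present, 'NO' never occurs
s = pd.Series(['TRACKED', 'TRACKED', 'TRACKED'], name='Tracked At A Site?')

counts = s.value_counts()

# Looking up a label that is absent is what raises the KeyError
try:
    counts.loc['NO']
    raised = False
except KeyError:
    raised = True

# reindex aligns to the requested labels and inserts NaN for absent ones
aligned = counts.reindex(['TRACKED', 'NO'])

# fillna(0) then turns the missing count into a zero
filled = aligned.fillna(0)
print(filled)
```

Because reindex never raises for an unknown label, no exception handling is needed around it.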

Demonstration with seeded random data:

import numpy as np
import pandas as pd

np.random.seed(882019)
BoxTrackingReport_df = pd.DataFrame({'Tracked At A Site?': np.random.choice(['TRACKED', 'NO'], 500)})
...

print(BoxTrackingSummary_df)

#                        School - Exams Tracked  School - Exams Not Tracked
# All Box Tracked Sites                     251                         249

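The reindex above ensures that both columns always appear, whether or not they occur in the data (with .fillna(0) added to replace the resulting NaN with zero):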

BoxTrackingReport_df = pd.DataFrame({'Tracked At A Site?': np.repeat(['TRACKED'], 500)})
...

print(BoxTrackingSummary_df)

#                        School - Exams Tracked  School - Exams Not Tracked
# All Box Tracked Sites                     500                         0.0
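The 0.0 in the output above appears because reindex first inserts NaN, which forces that column to float. If integer counts are preferred, one possible variation (a sketch, not part of the original answer) is to reindex the counts Series with fill_value=0 before building the frame. The names report, counts, and summary are illustrative:

```python
import numpy as np
import pandas as pd

# Same all-TRACKED data as in the second demonstration
report = pd.DataFrame({'Tracked At A Site?': np.repeat(['TRACKED'], 500)})

counts = (report['Tracked At A Site?']
          .value_counts()
          .reindex(['TRACKED', 'NO'], fill_value=0)   # absent label filled with 0, no NaN
          .rename('All Box Tracked Sites'))           # becomes the row label

# One-row frame with the two counts as columns
summary = counts.to_frame().transpose()
summary.columns = ['School - Exams Tracked', 'School - Exams Not Tracked']

print(summary)
print(summary.dtypes)
```

Filling at reindex time avoids the intermediate NaN, so both columns should stay integer-typed.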
