Calculating differences between values in a DataFrame by group

Posted 2024-09-29 23:17:27


I have a DataFrame in the following format:

  station   num_bikes   Rush hour? num_racks hour
  Botanic   3           yes-am     9         9
  Botanic   2           no         10        14
  Botanic   10          no         2         20
  Queens    6           no         10        5
  Queens    10          yes-pm     6         18
  Queens    12          yes-pm     4         19
  Queens    1           no         15        7

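num_bikes is the number of bikes available at the station, and num_racks is the number of free racks. I am trying to calculate the total arrivals and departures of bikes at each station in order to determine the total number of transactions. The code I am using raises this error: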

ValueError: Wrong number of items passed 0, placement implies 1

The code is:

df_filtered['diff'] = df_filtered.groupby(['Rush hour?', 'station'])[['num_bikes']].diff()

Expected output:

  station     Rush hour?  arrivals  departures
  Botanic     yes-am      0         0
  Botanic     no          8         0
  Queens      no          0         5
  Queens      yes-pm      0         2

What is wrong with my code?


1 answer
User
#1 · Posted 2024-09-29 23:17:27

Here is what I tried; let me know if it does not fit your case:

import pandas as pd

# Rebuild the sample data from the question.
df_filtered = pd.DataFrame([
    ('Botanic', 3,  'yes-am', 9,  9),
    ('Botanic', 2,  'no',     10, 14),
    ('Botanic', 10, 'no',     2,  20),
    ('Queens',  6,  'no',     10, 5),
    ('Queens',  10, 'yes-pm', 6,  18),
    ('Queens',  12, 'yes-pm', 4,  19),
    ('Queens',  1,  'no',     15, 7)
])

df_filtered.columns = ['station', 'num_bikes', 'Rush hour?', 'num_racks', 'hour']

# Row-to-row change in num_bikes over the whole column (not restarted per station).
df_filtered['diff'] = df_filtered['num_bikes'].diff().fillna(0)
# Positive changes count as arrivals, negative ones as departures (kept negative);
# rows that do not match the mask become NaN.
df_filtered['arrivals'] = df_filtered['diff'][df_filtered['diff'] > 0]
df_filtered['departures'] = df_filtered['diff'][df_filtered['diff'] < 0]
df_filtered.drop(columns='diff', inplace=True)
# Replace the NaN placeholders with 0 before summing.
df_filtered[['departures','arrivals']] = df_filtered[['departures','arrivals']].astype(float).fillna(0)
# Sum arrivals, departures and num_bikes for each (Rush hour?, station) group.
df_filtered.groupby(['Rush hour?', 'station'])[['arrivals','departures','num_bikes']].sum()
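Note that `df_filtered['num_bikes'].diff()` runs over the whole column, so the first Queens row is compared with the last Botanic row. If the differences should restart at each station, a variant along the following lines may be closer to what the question asks for. This is only a sketch under two assumptions that the question does not confirm: that rows should be ordered by `hour` within each station, and that departures should be reported as positive counts. The names `df` and `per_station` are made up for this example.

```python
import pandas as pd

df = pd.DataFrame(
    [
        ("Botanic", 3, "yes-am", 9, 9),
        ("Botanic", 2, "no", 10, 14),
        ("Botanic", 10, "no", 2, 20),
        ("Queens", 6, "no", 10, 5),
        ("Queens", 10, "yes-pm", 6, 18),
        ("Queens", 12, "yes-pm", 4, 19),
        ("Queens", 1, "no", 15, 7),
    ],
    columns=["station", "num_bikes", "Rush hour?", "num_racks", "hour"],
)

# Assumption: order the snapshots by hour within each station before differencing.
df = df.sort_values(["station", "hour"]).reset_index(drop=True)

# Single brackets select a Series, so diff() returns a Series that fits one column.
# The first row of each station has no predecessor, so its diff is set to 0.
df["diff"] = df.groupby("station")["num_bikes"].diff().fillna(0)

# Split the signed change into two non-negative columns.
df["arrivals"] = df["diff"].clip(lower=0)
df["departures"] = df["diff"].clip(upper=0).abs()

# sort=False keeps groups in order of first appearance; reset_index flattens the result.
per_station = (
    df.groupby(["station", "Rush hour?"], sort=False)[["arrivals", "departures"]]
    .sum()
    .reset_index()
)
print(per_station)
```

With these assumptions the Botanic/no group gets 8 arrivals and the Queens/no group gets 5 departures; other cells depend on how a change that crosses from one rush-hour period into the next should be attributed, which the question leaves open.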


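The groupby results here may not keep the row order of the input DataFrame, so they can look confusing, but they are the net arrivals/departures for each group of rows, treating every row as a snapshot.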
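On the ordering point: `groupby` sorts the group keys by default, and passing `sort=False` keeps the groups in the order in which they first appear. A minimal sketch on a made-up toy frame (`toy` and its values are hypothetical, not the data from the question):

```python
import pandas as pd

# Hypothetical toy data, only meant to show the ordering behaviour.
toy = pd.DataFrame({
    "station": ["Queens", "Botanic", "Queens", "Botanic"],
    "arrivals": [2, 8, 1, 0],
})

# Default: group keys come back sorted.
default_order = toy.groupby("station")["arrivals"].sum()

# sort=False: group keys come back in order of first appearance.
input_order = toy.groupby("station", sort=False)["arrivals"].sum()

print(list(default_order.index))  # ['Botanic', 'Queens']
print(list(input_order.index))    # ['Queens', 'Botanic']
```

Calling `.reset_index()` on either result turns the group keys back into ordinary columns, which gives a flat table like the one in the expected output.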
