Grouping pandas Period Series values - Q&A - Python中文网

Posted 2024-09-27 09:29:02


Following on from "Reading CSV file in Pandas with historical dates", I have some CSV data of the form:

Object,Earliest Date
Object1,01/01/2000
Object2,01/01/1760
Object3,01/01/1520
...

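I have now read this into pandas (using Periods to handle the historical dates) and created a Series. I am trying to group the Series into decades, but I am having trouble converting the Period values into a form that groupby expects. So far I have tried the following (where s is the Series created from the CSV):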

[code snippet not preserved in the source]

This fails with:

 TypeError: Argument 'labels' has incorrect type (expected numpy.ndarray, got TimeGrouper)

Trying to group it as a Series:

 decades = s2.groupby(pd.Grouper(freq="120M")).count()

This fails with:

 TypeError: Only valid with DatetimeIndex, TimedeltaIndex or PeriodIndex, but got an instance of 'Index'

Trying to group it as a DataFrame:

df = pd.DataFrame(s2)
decades = df.groupby(pd.Grouper(freq="120M", key='Earliest Date')).size()

This fails with:

AttributeError: 'Index' object has no attribute 'to_timestamp'

I am not sure what else to try.


1 answer

Anonymous user

#1 · Posted 2024-09-27 09:29:02

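Error messages and the pandas documentation are your friends.

I don't know whether your date column contains strictly unique dates. If it does, this is straightforward: use the column as the index and you can then use pd.Grouper. Otherwise, define your own grouping function: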

# I'm assuming that df is the DataFrame from pd.read_csv("/path/to/csv"),
# that its index labels are unique (for example the default RangeIndex),
# and that there's a column named "Earliest Date" holding a Period,
# a datetime, or something else with a year attribute.
def grouper(ind):
    # groupby calls this once per index label; return that row's decade
    y = df.loc[ind, 'Earliest Date'].year
    return y - (y % 10)

gb = df.groupby(by=grouper)
print(gb.size())
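
For reference, here is a self-contained sketch of the same decade grouping that computes the decade for the whole column up front instead of looking up each row by index label. It rests on a few assumptions that are not in the original post: the dates are in DD/MM/YYYY order (the sample rows do not settle this), each value is stored as a daily pd.Period, and the to_period helper and the fourth CSV row are made up purely for illustration.

```python
import io

import pandas as pd

# Sample rows copied from the question. Object4 is a hypothetical extra row,
# added so that one decade contains more than one object.
CSV_TEXT = """Object,Earliest Date
Object1,01/01/2000
Object2,01/01/1760
Object3,01/01/1520
Object4,01/01/1765
"""


def to_period(text):
    """Parse an assumed DD/MM/YYYY string into a daily Period.

    Periods can represent dates such as 1520 that fall outside the
    range supported by nanosecond Timestamps.
    """
    day, month, year = (int(part) for part in text.split("/"))
    return pd.Period(year=year, month=month, day=day, freq="D")


df = pd.read_csv(io.StringIO(CSV_TEXT))
df["Earliest Date"] = df["Earliest Date"].map(to_period)

# Extract the year of every Period, then round down to the start of the decade.
years = df["Earliest Date"].map(lambda p: p.year)
decades = (years - years % 10).rename("Decade")

# Group by the decade Series (aligned on the index) and count rows per decade.
counts = df.groupby(decades).size()
print(counts)
```

Because the grouping key is an ordinary integer Series, this version does not depend on the index type at all, so it sidesteps the "Only valid with DatetimeIndex, TimedeltaIndex or PeriodIndex" error from the question.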
