How to convert a groupby DataFrame into a monthly datetime series in Python pandas

Posted 2024-09-27 00:14:03


monthly_dividend
1994  10   NaN
      11   NaN
      12   NaN
      12   NaN
...
2012  4          NaN
      5          NaN
      6          NaN
      7     1.746622
      8     1.607685
      9     1.613936
      10    1.620187
      11    1.626125
      12    1.632375
2013  1     1.667792
      2     1.702897
      3     1.738314
      4     1.773731
      5     1.808835
      6     1.844252
Length: 225

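I have data like the above. It is a grouped DataFrame (the result of a groupby), but I would like to convert it back into a regular TimeSeries. asfreq('M') no longer works on the grouped result, so I am not sure whether there is a simple way to convert it.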


1 answer
Answer 1 · posted 2024-09-27 00:14:03

Create some sample data shaped like yours, indexed by (year, month):

In [172]: df = DataFrame(randn(200,1),columns=['A'],index=pd.date_range('2000',periods=200,freq='M'))

In [173]: df['month'] = df.index.month

In [174]: df['year'] = df.index.year

In [175]: df = df.reset_index(drop=True).set_index(['year','month'])

In [176]: df
Out[176]: 
<class 'pandas.core.frame.DataFrame'>
MultiIndex: 200 entries, (2000, 7) to (2017, 2)
Data columns (total 1 columns):
A    200  non-null values
dtypes: float64(1)

In [177]: df.head()
Out[177]: 
                   A
year month          
2000 7      0.084256
     8      2.507213
     9     -0.642151
     10     1.972307
     11     0.926586
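
The session above relies on names such as DataFrame and randn already being imported. A minimal self-contained sketch of the same setup for a recent pandas might look like the following. The explicit start date, the 'MS' frequency, and the random seed are assumptions made here for reproducibility; they are not part of the original session.

```python
import numpy as np
import pandas as pd

# Assumption: start in July 2000 so the index runs (2000, 7) .. (2017, 2),
# matching the MultiIndex range printed in the session above.
rng = np.random.default_rng(0)
dates = pd.date_range("2000-07-01", periods=200, freq="MS")

df = pd.DataFrame({"A": rng.standard_normal(200)}, index=dates)
df["month"] = df.index.month
df["year"] = df.index.year

# Drop the datetime index and keep only the (year, month) pair,
# which is the shape a groupby on year and month produces.
df = df.reset_index(drop=True).set_index(["year", "month"])
```

After this, df has a two-level (year, month) MultiIndex and a single column A, like the grouped data in the question.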

This creates a PeriodIndex with monthly frequency. Note that iterating over the index yields tuples (of integers).
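
As a rough illustration of that step, the sketch below builds a monthly PeriodIndex from a small (year, month) MultiIndex. The three-entry index is a made-up stand-in for df.index, not data from the session.

```python
import pandas as pd

# Hypothetical stand-in for df.index: a (year, month) MultiIndex.
index = pd.MultiIndex.from_tuples(
    [(2000, 7), (2000, 8), (2000, 9)], names=["year", "month"]
)

# Iterating over a MultiIndex yields one tuple per row, so each
# (year, month) pair can be unpacked into a monthly Period.
periods = pd.PeriodIndex(
    [pd.Period(year=year, month=month, freq="M") for year, month in index]
)
```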


Convert directly to a DatetimeIndex:

In [180]: new_index = pd.PeriodIndex([ pd.Period(year=year,month=month,freq='M') for year, month in df.index ]).to_timestamp()
Out[180]: 
<class 'pandas.tseries.index.DatetimeIndex'>
[2000-07-01 00:00:00, ..., 2017-02-01 00:00:00]
Length: 200, Freq: MS, Timezone: None

Now you can assign it as the new index:

In [182]: df.index = new_index

In [183]: df
Out[183]: 
<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 200 entries, 2000-07-01 00:00:00 to 2017-02-01 00:00:00
Freq: MS
Data columns (total 1 columns):
A    200  non-null values
dtypes: float64(1)

In [184]: df.head()
Out[184]: 
                   A
2000-07-01  0.084256
2000-08-01  2.507213
2000-09-01 -0.642151
2000-10-01  1.972307
2000-11-01  0.926586
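
Once the index is a DatetimeIndex, asfreq works again, which is what the question was after. The sketch below shows this on a small made-up series and uses a different, vectorized route to the index (pd.to_datetime on year/month/day columns) instead of the list comprehension above. The sample values and the variable names are assumptions for illustration only.

```python
import numpy as np
import pandas as pd

# Hypothetical grouped result with a gap between 2012-03 and 2013-01.
idx = pd.MultiIndex.from_product(
    [[2012, 2013], [1, 2, 3]], names=["year", "month"]
)
s = pd.Series(np.arange(6.0), index=idx, name="monthly_dividend")

# Turn the index levels into columns and add a day so that
# pd.to_datetime can assemble full dates from year/month/day.
parts = s.index.to_frame(index=False).assign(day=1)

ts = s.copy()
ts.index = pd.DatetimeIndex(pd.to_datetime(parts))

# With a datetime index, asfreq fills the missing months with NaN.
regular = ts.asfreq("MS")
```

Here regular covers every month start from 2012-01 through 2013-03, with NaN for the months that were absent from the grouped data.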

By default, to_timestamp returns the first day of each month. To get the end of the month instead, pass how='e':

In [1]: pr = pd.period_range('200001',periods=20,freq='M')

In [2]: pr
Out[2]: 
<class 'pandas.tseries.period.PeriodIndex'>
freq: M
[2000-01, ..., 2001-08]
length: 20

In [3]: pr.to_timestamp()
Out[3]: 
<class 'pandas.tseries.index.DatetimeIndex'>
[2000-01-01 00:00:00, ..., 2001-08-01 00:00:00]
Length: 20, Freq: MS, Timezone: None

In [4]: pr.to_timestamp(how='e')
Out[4]: 
<class 'pandas.tseries.index.DatetimeIndex'>
[2000-01-31 00:00:00, ..., 2001-08-31 00:00:00]
Length: 20, Freq: M, Timezone: None
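
The output above comes from an older pandas. In recent versions, to_timestamp with how='e' returns the last nanosecond of each period rather than midnight on the last day, so a sketch like the following, which normalizes the result, may be closer to what is wanted. The three-month range is an arbitrary choice for illustration.

```python
import pandas as pd

pr = pd.period_range("2000-01", periods=3, freq="M")

# Start of each month (the default, how='start').
starts = pr.to_timestamp()

# End of each month. normalize() resets the time of day to midnight,
# which is a no-op on versions that already return midnight.
ends = pr.to_timestamp(how="end").normalize()
```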
