Pandas MultiIndex for time series data with different reference and publication dates

Posted 2024-09-30 01:35:02


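My accounting data has both a reference date (the fiscal quarter end date) and a publication date (when the earnings were actually announced). Here is a sample: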

              item  Reference    Value VALUED  FQTR  FYEARQ
Published                                                  
1986-12-14   CAPXY 1983-12-31   13.820      3     1    1984
1986-12-14   CAPXY 1984-03-31   20.895      3     2    1984
1986-12-14   CAPXY 1984-06-30   26.764      3     3    1984
1986-12-14   CAPXY 1984-09-30   39.614      3     4    1984
1986-12-14   CAPXY 1984-12-31   15.056      3     1    1985
1986-12-14   CAPXY 1985-03-31   33.604      3     2    1985
1986-12-14   CAPXY 1985-06-30   42.719      3     3    1985
1986-12-14   CAPXY 1985-09-30   54.064      3     4    1985
1986-12-14   CAPXY 1985-12-31    6.510      3     1    1986
1986-12-14   CAPXY 1986-03-31   18.503      3     2    1986
1986-12-14   CAPXY 1986-06-30   48.071      3     3    1986
1987-01-31   CAPXY 1986-09-30   66.629      2     4    1986
1987-01-31   CAPXY 1986-09-30   66.629      3     4    1986
1987-03-31   CAPXY 1986-12-31   15.740      2     1    1987
1987-03-31   CAPXY 1986-12-31   15.740      3     1    1987
1987-05-31   CAPXY 1987-03-31   38.699      2     2    1987
1987-05-31   CAPXY 1987-03-31   38.699      3     2    1987
1987-08-31   CAPXY 1987-06-30   61.006      2     3    1987
1987-08-31   CAPXY 1987-06-30   61.006      3     3    1987
1987-12-31   CAPXY 1987-09-30   86.127      2     4    1987
1987-12-31   CAPXY 1987-09-30   86.127      3     4    1987
1988-03-31   CAPXY 1987-12-31   34.140      2     1    1988
1988-03-31   CAPXY 1987-12-31   34.140      3     1    1988
1988-06-09   CAPXY 1988-03-31   68.059      2     2    1988
1988-06-09   CAPXY 1988-03-31   68.059      3     2    1988
1988-09-08   CAPXY 1988-06-30  101.198      2     3    1988
1988-09-08   CAPXY 1988-06-30  101.198      3     3    1988
1988-12-30   CAPXY 1988-09-30  144.001      2     4    1988
1988-12-30   CAPXY 1988-09-30  144.001      3     4    1988
1989-03-09   CAPXY 1988-12-31   73.967      2     1    1989
...            ...        ...      ...    ...   ...     ...
2001-08-16  OANCFY 2001-06-30  -90.000      2     3    2001
2001-08-16  OANCFY 2001-06-30  -90.000      3     3    2001
2002-01-10  OANCFY 2001-09-30  185.000      2     4    2001
2002-01-10  OANCFY 2001-09-30  185.000      3     4    2001
2002-02-14  OANCFY 2001-12-31   42.000      2     1    2002
2002-02-14  OANCFY 2001-12-31   42.000      3     1    2002
2002-05-23  OANCFY 2002-03-31   44.000      2     2    2002
2002-05-23  OANCFY 2002-03-31   44.000      3     2    2002
2002-08-15  OANCFY 2002-06-30    7.000      2     3    2002
2002-08-15  OANCFY 2002-06-30    7.000      3     3    2002
2002-12-31  OANCFY 2002-09-30   89.000      2     4    2002
2002-12-31  OANCFY 2002-09-30   89.000      3     4    2002
2003-02-13  OANCFY 2002-12-31  110.000      2     1    2003
2003-02-13  OANCFY 2002-12-31  110.000      3     1    2003
2003-05-22  OANCFY 2003-03-31  208.000      2     2    2003
2003-05-22  OANCFY 2003-03-31  208.000      3     2    2003
2003-08-21  OANCFY 2003-06-30  216.000      3     3    2003
2003-08-21  OANCFY 2003-06-30  216.000      2     3    2003
2003-12-31  OANCFY 2003-09-30  289.000      2     4    2003
2003-12-31  OANCFY 2003-09-30  289.000      3     4    2003
2004-02-19  OANCFY 2003-12-31  219.000      2     1    2004
2004-02-19  OANCFY 2003-12-31  219.000      3     1    2004
2004-05-20  OANCFY 2004-03-31  280.000      2     2    2004
2004-05-20  OANCFY 2004-03-31  280.000      3     2    2004
2004-08-19  OANCFY 2004-06-30  491.000      2     3    2004
2004-08-19  OANCFY 2004-06-30  491.000      3     3    2004
2004-12-16  OANCFY 2004-09-30  934.000      2     4    2004
2004-12-16  OANCFY 2004-09-30  934.000      3     4    2004
2005-02-10  OANCFY 2004-12-31  775.000      2     1    2005
2005-02-10  OANCFY 2004-12-31  775.000      3     1    2005

[396 rows x 6 columns]

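The data is read into a DataFrame with pandas.io.sql.read_sql. The problem is that the index depends on the use case: the user may request data by reference date or by publication date. I then need to pivot the data so that each item is shown as a column, with a MultiIndex of reference/publication dates. A single publication date can have several duplicate reference dates.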
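As a rough illustration of the loading step, the sketch below reads a tiny table with `pd.read_sql` and chooses the index column per request. The in-memory SQLite database, the table name `fundamentals`, and the helper `load` are assumptions made up for this example; only the column names and the three sample rows come from the question.

```python
import sqlite3

import pandas as pd

# Hypothetical in-memory table standing in for the real database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE fundamentals (Published TEXT, item TEXT, Reference TEXT, Value REAL)"
)
rows = [
    ("1986-12-14", "CAPXY", "1983-12-31", 13.820),
    ("1986-12-14", "CAPXY", "1984-03-31", 20.895),
    ("1987-01-31", "CAPXY", "1986-09-30", 66.629),
]
conn.executemany("INSERT INTO fundamentals VALUES (?, ?, ?, ?)", rows)


def load(conn, index_col):
    """Read the table, parse both date columns, and index on index_col."""
    return pd.read_sql(
        "SELECT * FROM fundamentals",
        conn,
        index_col=index_col,
        parse_dates=["Published", "Reference"],
    )


by_published = load(conn, "Published")  # indexed on publication date
by_reference = load(conn, "Reference")  # indexed on reference date
```

Choosing the index at load time keeps both views available without re-indexing later, at the cost of running the query twice.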

I thought of:

^{pr2}$

But I get the following error when creating the MultiIndex DataFrame:

TypeError: 'NoneType' object is not iterable

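I imagine this is a fairly common problem, with reference dates and publication dates not lining up, but I can't seem to find an elegant solution.

Any ideas?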

Following Alexander's comment, and after a slight adjustment to the index, I am looking for something like:

df.reset_index().set_index(['Reference','Published'])

Then, perhaps this (for illustration purposes):

pd.concat(df[df['item'] == 'CAPXY'], df[df['item'] == 'OANCFY'])

But I get the following error:

TypeError: first argument must be an iterable of pandas objects, you passed an object of type "DataFrame"
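The TypeError above arises because `pd.concat` expects a list (or other iterable) of pandas objects as its first argument, not the frames as separate arguments. Below is a minimal sketch of the corrected call, placing the two items side by side under a top-level column key. The CAPXY values here are invented for illustration; the OANCFY rows are taken from the sample.

```python
import pandas as pd

# Illustrative frame; the CAPXY values are made up, OANCFY rows are from the sample.
df = pd.DataFrame(
    {
        "item": ["CAPXY", "CAPXY", "OANCFY", "OANCFY"],
        "Reference": pd.to_datetime(
            ["2001-06-30", "2001-09-30", "2001-06-30", "2001-09-30"]
        ),
        "Published": pd.to_datetime(
            ["2001-08-16", "2002-01-10", "2001-08-16", "2002-01-10"]
        ),
        "Value": [1.0, 2.0, -90.0, 185.0],
    }
)

indexed = df.set_index(["Reference", "Published"])
capxy = indexed[indexed["item"] == "CAPXY"].drop(columns="item")
oancfy = indexed[indexed["item"] == "OANCFY"].drop(columns="item")

# Pass the frames as a list; keys= adds the item name as the top column level.
wide = pd.concat([capxy, oancfy], axis=1, keys=["CAPXY", "OANCFY"])
```

This sketch assumes each (Reference, Published) pair occurs once per item. In the full sample the pairs repeat (once per VALUED code), and column-wise alignment on a non-unique index may fail, so duplicates would need to be resolved first.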

The set_index call above returns:

                         item    Value VALUED  FQTR  FYEARQ
Reference  Published                                       
1983-12-31 1986-12-14   CAPXY   13.820      3     1    1984
1984-03-31 1986-12-14   CAPXY   20.895      3     2    1984
1984-06-30 1986-12-14   CAPXY   26.764      3     3    1984
1984-09-30 1986-12-14   CAPXY   39.614      3     4    1984
1984-12-31 1986-12-14   CAPXY   15.056      3     1    1985

But I would like to end up with the following:

First:                  CAPXY                        OANCFY
Second:                 Value VALUED  FQTR  FYEARQ   Value   VALUED   FQTR  FYEARQ
Reference  Published                                       
1983-12-31 1986-12-14   13.820      3     1    1984
1984-03-31 1986-12-14   20.895      3     2    1984
1984-06-30 1986-12-14   26.764      3     3    1984
1984-09-30 1986-12-14   39.614      3     4    1984
1984-12-31 1986-12-14   15.056      3     1    1985

where the columns for both items are left-aligned on Reference and Published.


1 Answer

Answer 1 · posted 2024-09-30 01:35:02

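Judging from how the DataFrame prints, it appears to be indexed on Published at the moment. You need to reset the index and then re-index the DataFrame on: a) item, b) Reference, c) Published.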

>>> df.reset_index().set_index(['item', 'Reference', 'Published'])
                       index    Value  VALUED  FQTR  FYEARQ
item  Reference Published                                      
CAPXY 12/31/83  12/14/86       0   13.820       3     1    1984
      3/31/84   12/14/86       1   20.895       3     2    1984
      6/30/84   12/14/86       2   26.764       3     3    1984
      9/30/84   12/14/86       3   39.614       3     4    1984
      12/31/84  12/14/86       4   15.056       3     1    1985
      3/31/85   12/14/86       5   33.604       3     2    1985

Edit:

Based on the revised post, I believe a pivot table will do the job. I also swap the column levels to get the desired format.

Note that if the dates are strings, you need to convert them to datetime objects (or Timestamps).
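One way to do that conversion is `pd.to_datetime`. The sketch below assumes the strings look like the `12/31/83` values printed in the output above and therefore passes an explicit `%m/%d/%y` format; adjust the format if the real strings differ.

```python
import pandas as pd

# Two rows from the sample, with dates stored as month/day/two-digit-year strings.
df = pd.DataFrame(
    {
        "Reference": ["12/31/83", "3/31/84"],
        "Published": ["12/14/86", "12/14/86"],
        "Value": [13.820, 20.895],
    }
)

# Convert both date columns in place; an explicit format avoids ambiguous parsing.
for col in ["Reference", "Published"]:
    df[col] = pd.to_datetime(df[col], format="%m/%d/%y")
```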

^{pr2}$
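The answer's original code is not preserved here, so the following is only a minimal sketch of the pivot-and-swap idea it describes, not the answerer's actual code. It uses four rows from the sample, and `aggfunc="first"` is an assumption chosen to collapse the rows that differ only in VALUED.

```python
import pandas as pd

# Four rows from the sample, indexed on Published as in the question.
df = pd.DataFrame(
    {
        "item": ["CAPXY", "CAPXY", "OANCFY", "OANCFY"],
        "Reference": pd.to_datetime(
            ["1986-09-30", "1986-09-30", "2001-06-30", "2001-06-30"]
        ),
        "Value": [66.629, 66.629, -90.0, -90.0],
        "VALUED": [2, 3, 2, 3],
        "FQTR": [4, 4, 3, 3],
        "FYEARQ": [1986, 1986, 2001, 2001],
    },
    index=pd.DatetimeIndex(
        ["1987-01-31", "1987-01-31", "2001-08-16", "2001-08-16"], name="Published"
    ),
)

wide = (
    df.reset_index()
    .pivot_table(
        index=["Reference", "Published"],
        columns="item",
        values=["Value", "FQTR", "FYEARQ"],
        aggfunc="first",  # assumption: keep the first of the duplicated rows
    )
    .swaplevel(0, 1, axis=1)  # put the item name on the top column level
    .sort_index(axis=1)       # group each item's columns together
)
```

Rows where an item has no data for a given (Reference, Published) pair come out as NaN, which matches the blank OANCFY cells in the desired layout.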

Alternatively, you can try groupby, since the data for each index entry is unique.

^{3}$
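The groupby code is likewise missing from this copy. One possible reading of the suggestion, offered as a sketch rather than the answerer's code, is to group on the two dates plus the item, take the first row of each group, and unstack the item level into columns. The rows are from the sample; taking `.first()` is an assumption.

```python
import pandas as pd

# Four rows from the sample, indexed on Published as in the question.
df = pd.DataFrame(
    {
        "item": ["CAPXY", "CAPXY", "OANCFY", "OANCFY"],
        "Reference": pd.to_datetime(
            ["1986-09-30", "1986-09-30", "2001-06-30", "2001-06-30"]
        ),
        "Value": [66.629, 66.629, -90.0, -90.0],
        "VALUED": [2, 3, 2, 3],
        "FQTR": [4, 4, 3, 3],
        "FYEARQ": [1986, 1986, 2001, 2001],
    },
    index=pd.DatetimeIndex(
        ["1987-01-31", "1987-01-31", "2001-08-16", "2001-08-16"], name="Published"
    ),
)

wide = (
    df.reset_index()
    .groupby(["Reference", "Published", "item"])[["Value", "FQTR", "FYEARQ"]]
    .first()                  # one row per (Reference, Published, item)
    .unstack("item")          # move the item level from rows to columns
    .swaplevel(0, 1, axis=1)  # item name on the top column level
    .sort_index(axis=1)
)
```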
