Pythonic way to lag a datetime-indexed column

rng = pd.date_range('2012-01-01', '2013-1-01', freq="M")
toy2 = pd.DataFrame(pd.Series(np.random.randint(0, 50, len(rng)), index=rng, name="foo"))

            foo
2012-01-31    4
2012-02-29    2
2012-03-31   27
2012-04-30    7
2012-05-31   44
2012-06-30   22
2012-07-31   16
2012-08-31   18
2012-09-30   35
2012-10-31   35
2012-11-30   16
2012-12-31   32

toy2['lag_foo']= toy2['foo'].shift(1,'m')

            foo  lag_foo
2012-01-31    4      NaN
2012-02-29    2      4.0
2012-03-31   27      2.0
2012-04-30    7     27.0
2012-05-31   44      7.0
2012-06-30   22     44.0
2012-07-31   16     22.0
2012-08-31   18     16.0
2012-09-30   35     18.0
2012-10-31   35     35.0
2012-11-30   16     35.0
2012-12-31   32     16.0

ValueError                                Traceback (most recent call last)
<ipython-input-170-9cb57a2ed681> in <module>()
----> 1 toy['prev_1m']= toy['IPE m2'].shift(1,'m')

C:\Users\mds\Anaconda2\lib\site-packages\pandas\core\frame.pyc in __setitem__(self, key, value)
   2355         else:
   2356             # set column
-> 2357             self._set_item(key, value)
   2358
   2359     def _setitem_slice(self, key, value):

C:\Users\mds\Anaconda2\lib\site-packages\pandas\core\frame.pyc in _set_item(self, key, value)
   2421
   2422         self._ensure_valid_index(value)
-> 2423         value = self._sanitize_column(key, value)
   2424         NDFrame._set_item(self, key, value)
   2425

C:\Users\mds\Anaconda2\lib\site-packages\pandas\core\frame.pyc in _sanitize_column(self, key, value)
   2555
   2556         if isinstance(value, Series):
-> 2557             value = reindexer(value)
   2558
   2559         elif isinstance(value, DataFrame):

C:\Users\mds\Anaconda2\lib\site-packages\pandas\core\frame.pyc in reindexer(value)
   2547                     # duplicate axis
   2548                     if not value.index.is_unique:
-> 2549                         raise e
   2550
   2551                     # other

ValueError: cannot reindex from a duplicate axis

toy.index
DatetimeIndex(['2016-04-30', '2016-03-31', '2016-02-29', '2016-01-31',
               '2015-12-31', '2015-11-30', '2015-10-31', '2015-09-30',
               '2015-08-31', '2015-07-31',
               ...
               'NaT', 'NaT', 'NaT', 'NaT', 'NaT', 'NaT', 'NaT', 'NaT',
               'NaT', 'NaT'],
              dtype='datetime64[ns]', name=u'Date', length=142, freq=None)

toy2.index
DatetimeIndex(['2012-01-31', '2012-02-29', '2012-03-31', '2012-04-30',
               '2012-05-31', '2012-06-30', '2012-07-31', '2012-08-31',
               '2012-09-30', '2012-10-31', '2012-11-30', '2012-12-31'],
              dtype='datetime64[ns]', freq='M')

C:\Users\mds\Anaconda2\lib\site-packages\ipykernel\__main__.py:1: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  if __name__ == '__main__':

1 Answer

User

#1 · Posted 2024-10-03 19:28:04

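There is another problem: the index of the toy DataFrame contains many NaT values, so the index has duplicate values. (Some of the datetimes may be duplicated as well.)

Sample: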

import pandas as pd
import numpy as np

rng = pd.date_range('2012-01-01', '2013-1-01', freq="M")
toy2 = pd.DataFrame(pd.Series(np.random.randint(0,  50, len(rng)), index=rng, name="foo"))

df = pd.DataFrame({'foo': [10,30,19]}, index=[np.nan, np.nan, np.nan])
print (df)
     foo
NaN   10
NaN   30
NaN   19

toy2 = pd.concat([toy2, df])
print (toy2)
            foo
2012-01-31   18
2012-02-29   34
2012-03-31   43
2012-04-30   17
2012-05-31   45
2012-06-30    8
2012-07-31   36
2012-08-31   26
2012-09-30    5
2012-10-31   18
2012-11-30   39
2012-12-31    3
NaT          10
NaT          30
NaT          19

toy2['lag_foo']= toy2['foo'].shift(1,'m')
print (toy2)

ValueError: cannot reindex from a duplicate axis
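Before shifting, it may help to confirm that the index really is the problem. The following is a minimal sketch on a small stand-in frame (the name `toy` and its values are made up for illustration, not the asker's data); it counts the missing labels and lists the labels that repeat.

```python
import pandas as pd

# Hypothetical stand-in for the frame in the question:
# three month-end dates plus two missing (NaT) index labels.
idx = pd.to_datetime(['2012-01-31', '2012-02-29', '2012-03-31', None, None])
toy = pd.DataFrame({'foo': [18, 34, 43, 10, 30]}, index=idx)

n_missing = int(toy.index.isna().sum())               # how many labels are NaT
is_unique = bool(toy.index.is_unique)                 # False as soon as any label repeats
dupes = toy.index[toy.index.duplicated(keep=False)]   # every label that occurs more than once

print('NaT labels:', n_missing)
print('index is unique:', is_unique)
print(dupes)
```

Repeated NaT labels count as duplicates, so a single check of `is_unique` is usually enough to tell whether an assignment that has to realign on the index can succeed.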

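One possible solution is to omit the freq argument ('m') and call shift(1) on its own, which moves the values down by one row and leaves the index untouched.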
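To make the difference concrete, here is a small sketch that contrasts the two calls on a stand-in frame (the data is invented, and `pd.offsets.MonthEnd(1)` is used in place of the 'm' alias only because frequency aliases differ between pandas versions). It assumes, as in the question, that the index contains repeated NaT labels.

```python
import pandas as pd

# Hypothetical stand-in: three month-end dates plus two NaT labels.
idx = pd.to_datetime(['2012-01-31', '2012-02-29', '2012-03-31', None, None])
toy = pd.DataFrame({'foo': [18, 34, 43, 10, 30]}, index=idx)

# Positional shift: values move down one row and the index is unchanged,
# so the assignment needs no realignment and duplicate labels do no harm.
toy['lag_foo'] = toy['foo'].shift(1)
print(toy)

# Shift with a frequency: the values stay put and the index labels move.
# Assigning the result back has to align on the index, which is the step
# that raised ValueError in the question when labels are duplicated.
shifted = toy['foo'].shift(1, freq=pd.offsets.MonthEnd(1))
try:
    toy['lag_by_freq'] = shifted
    print('assignment succeeded')
except ValueError as exc:
    print('ValueError:', exc)
```

Note that the positional version treats "previous row" as "previous month", which only holds if the rows are sorted and no month is missing.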

If you need to remove all records whose index is NaN (NaT), use boolean indexing with pd.notnull:

print (toy2)
            foo
2012-01-31   41
2012-02-29   15
2012-03-31    8
2012-04-30    2
2012-05-31   16
2012-06-30   43
2012-07-31    2
2012-08-31   15
2012-09-30    3
2012-10-31   46
2012-11-30   34
2012-12-31   36
NaT          10
NaT          30
NaT          19

toy2 = toy2[pd.notnull(toy2.index)]

toy2['lag_foo']= toy2['foo'].shift(1, 'm')
print (toy2)
            foo  lag_foo
2012-01-31   41      NaN
2012-02-29   15     41.0
2012-03-31    8     15.0
2012-04-30    2      8.0
2012-05-31   16      2.0
2012-06-30   43     16.0
2012-07-31    2     43.0
2012-08-31   15      2.0
2012-09-30    3     15.0
2012-10-31   46      3.0
2012-11-30   34     46.0
2012-12-31   36     34.0
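A variation on the same idea, sketched under the assumption that the SettingWithCopyWarning in the question comes from assigning a column to a filtered frame: taking an explicit `.copy()` after filtering gives an independent DataFrame. The frame and the name `clean` are illustrative only.

```python
import pandas as pd

# Hypothetical stand-in: three month-end dates plus two NaT labels.
idx = pd.to_datetime(['2012-01-31', '2012-02-29', '2012-03-31', None, None])
toy = pd.DataFrame({'foo': [41, 15, 8, 10, 30]}, index=idx)

# Keep only rows with a real timestamp; .copy() makes the result an
# independent frame, so adding a column to it is unambiguous.
clean = toy[toy.index.notna()].copy()

# With a unique index, the frequency-based shift aligns cleanly on assignment.
clean['lag_foo'] = clean['foo'].shift(1, freq=pd.offsets.MonthEnd(1))
print(clean)
```

If real timestamps are duplicated too, something like `clean[~clean.index.duplicated(keep='first')]` is one option, though which of the repeated rows to keep is a decision about the data rather than about pandas.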
