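<p>Adding <code>np.timedelta64(1, 'Y')</code> to an array of dtype <code>datetime64[ns]</code> does not work because a year does not correspond to a fixed number of nanoseconds. Sometimes a year is 365 days, sometimes 366 days, and sometimes there is even an extra leap second. (Note that extra leap seconds, such as the one that occurred at 2015-06-30 23:59:60, are not representable as NumPy datetime64s.)</p>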
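<p>A minimal sketch of the problem described above. The sample array <code>dates_ns</code> and the other variable names are made up for illustration; the point is only that calendar years differ in length and that NumPy rejects the mixed-unit addition:</p>

```python
import numpy as np

# Calendar years differ in length, so "one year" is not a fixed number of days.
days_in_2011 = (np.datetime64('2012-01-01') - np.datetime64('2011-01-01')).astype('i8')
days_in_2012 = (np.datetime64('2013-01-01') - np.datetime64('2012-01-01')).astype('i8')
print(days_in_2011, days_in_2012)  # 365 366

# Hypothetical sample data with nanosecond resolution.
dates_ns = np.array(['2011-01-01', '2012-01-01'], dtype='datetime64[ns]')

# Mixing a year-sized timedelta with nanosecond datetimes is rejected.
try:
    dates_ns + np.timedelta64(1, 'Y')
    mixed_units_ok = True
except TypeError as exc:
    mixed_units_ok = False
    print('TypeError:', exc)

# With year resolution on both sides, the addition is well defined,
# but everything below the year (month, day, time) has been discarded.
years_only = dates_ns.astype('datetime64[Y]') + np.timedelta64(1, 'Y')
print(years_only)
```

<p>Casting to <code>datetime64[Y]</code> makes the addition legal only by throwing away the month and day, which is why the components have to be carried separately.</p>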
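<p>The easiest way I know to add a year to a NumPy <code>datetime64[ns]</code> array is to break it into constituent parts, such as years, months and days, do the computation on arrays of integers, and then recompose the datetime64 array:</p>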
<pre><code>import numpy as np

def year(dates):
    "Return an array of the years given an array of datetime64s"
    return dates.astype('M8[Y]').astype('i8') + 1970

def month(dates):
    "Return an array of the months given an array of datetime64s"
    return dates.astype('M8[M]').astype('i8') % 12 + 1

def day(dates):
    "Return an array of the days of the month given an array of datetime64s"
    return (dates - dates.astype('M8[M]')) / np.timedelta64(1, 'D') + 1

def combine64(years, months=1, days=1, weeks=None, hours=None, minutes=None,
              seconds=None, milliseconds=None, microseconds=None, nanoseconds=None):
    # shift to zero-based offsets from the epoch (1970-01-01)
    years = np.asarray(years) - 1970
    months = np.asarray(months) - 1
    days = np.asarray(days) - 1
    # years become a datetime64; every other part is a timedelta64 added to it
    types = ('<M8[Y]', '<m8[M]', '<m8[D]', '<m8[W]', '<m8[h]',
             '<m8[m]', '<m8[s]', '<m8[ms]', '<m8[us]', '<m8[ns]')
    vals = (years, months, days, weeks, hours, minutes, seconds,
            milliseconds, microseconds, nanoseconds)
    return sum(np.asarray(v, dtype=t) for t, v in zip(types, vals)
               if v is not None)

# break the datetime64 array into constituent parts
years, months, days = [f(dates_np) for f in (year, month, day)]

# recompose the datetime64 array after adding 1 to the years
dates3 = combine64(years+1, months, days)
</code></pre>
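<p>This yields <code>dates3</code>, an array of datetime64 values one year later than those in <code>dates_np</code>.</p>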
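<p>To make the decomposition concrete, here is a compact, self-contained sketch of the same idea applied to two made-up dates. The sample values and the name <code>shifted</code> are assumptions for illustration, and the day-of-month is computed with integer arithmetic rather than the division used above. It also shows an edge case worth knowing about: a February 29 input rolls over to March 1 when the target year is not a leap year.</p>

```python
import numpy as np

# Hypothetical sample input; not taken from the original answer.
dates_np = np.array(['2011-03-15', '2012-02-29'], dtype='M8[ns]')

# Split into integer year, month, and day-of-month components.
years = dates_np.astype('M8[Y]').astype('i8') + 1970
months = dates_np.astype('M8[M]').astype('i8') % 12 + 1
days = (dates_np.astype('M8[D]') - dates_np.astype('M8[M]')).astype('i8') + 1

# Recombine with the year incremented by one: a year-resolution datetime64
# plus month and day offsets expressed as timedelta64 values.
shifted = ((years + 1 - 1970).astype('M8[Y]')
           + (months - 1).astype('m8[M]')
           + (days - 1).astype('m8[D]'))

print(years, months, days)  # [2011 2012] [3 2] [15 29]
print(shifted)              # ['2012-03-15' '2013-03-01']
```

<p>Because the day is added as an offset from the first of the month, 2012-02-29 becomes 2013-03-01 instead of being clipped to 2013-02-28. If clipping is the behavior you need, the day component has to be capped before recombining.</p>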
<p>Although the year/month/day decomposition looks like a lot of code, it is actually quicker than adding a DateOffset of 1 year:</p>
<pre><code>In [206]: %timeit dates + DateOffset(years=1)
1 loops, best of 3: 285 ms per loop
In [207]: %%timeit
.....: years, months, days = [f(dates_np) for f in (year, month, day)]
.....: combine64(years+1, months, days)
.....:
100 loops, best of 3: 2.65 ms per loop
</code></pre>
<p>Of course, <a href="http://pandas.pydata.org/pandas-docs/stable/timeseries.html#dateoffset-objects" rel="noreferrer">pd.tseries.offsets</a> offers a whole range of offsets that have no simple counterpart when working with NumPy datetime64s.</p>
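<p>As a rough illustration of what those offsets provide, the sketch below uses two pandas offsets on made-up timestamps. It assumes pandas is installed; the sample dates and variable names are chosen only for this example.</p>

```python
import pandas as pd

# DateOffset(years=1) is calendar-aware: a Feb 29 input is clipped to the
# last valid day of February in the target year.
leap_day = pd.Timestamp('2012-02-29')
one_year_later = leap_day + pd.DateOffset(years=1)
print(one_year_later)  # 2013-02-28 00:00:00

# MonthEnd rolls a date forward to the end of its month, something that
# takes several manual steps with plain datetime64 arithmetic.
mid_month = pd.Timestamp('2011-03-15')
month_end = mid_month + pd.offsets.MonthEnd(1)
print(month_end)  # 2011-03-31 00:00:00
```

<p>The trade-off is the one measured above: the pandas offsets handle calendar rules for you, while raw NumPy arithmetic is faster but leaves those rules to the caller.</p>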