<p>Use <code>to_datetime</code> and pass <code>unit='s'</code> so the values are parsed as Unix timestamps (seconds since the epoch); this is considerably faster:</p>
<pre><code>In [7]:
pd.to_datetime(df.index, unit='s')
Out[7]:
DatetimeIndex(['2015-12-02 11:02:16.830000', '2015-12-02 11:02:17.430000',
               '2015-12-02 11:02:18.040000', '2015-12-02 11:02:18.650000',
               '2015-12-02 11:02:19.250000'],
              dtype='datetime64[ns]', name=0, freq=None)
</code></pre>
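<p>To make the meaning of <code>unit</code> concrete, here is a minimal sketch (the integer values are just the sample timestamp truncated to whole seconds, and the variable names are illustrative). It shows that the same instant can be passed in seconds or milliseconds, and that epoch values are interpreted as UTC, unlike <code>time.ctime</code>, which formats in local time:</p>

```python
import pandas as pd

# The same instant expressed in seconds and in milliseconds since the epoch.
as_seconds = pd.to_datetime(1449054136, unit="s")
as_millis = pd.to_datetime(1449054136000, unit="ms")
print(as_seconds, as_seconds == as_millis)

# Epoch values are interpreted as UTC; utc=True makes the result timezone-aware.
aware = pd.to_datetime([1449054136, 1449054137], unit="s", utc=True)
print(aware)
```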
<p><strong>Timings</strong>:</p>
<pre><code>In [9]:
%%timeit
import time
def date_parser(string_list):
    return [time.ctime(float(x)) for x in string_list]

df = pd.read_csv(io.StringIO(t), parse_dates=[0], sep=';',
                 date_parser=date_parser,
                 index_col='DateTime',
                 names=['DateTime', 'X'], header=None)
100 loops, best of 3: 4.07 ms per loop
</code></pre>
<p>compared with:</p>
<pre><code>In [12]:
%%timeit
t="""1449054136.83;15.31
1449054137.43;16.19
1449054138.04;19.22
1449054138.65;15.12
1449054139.25;13.12"""
df = pd.read_csv(io.StringIO(t), header=None, sep=';', index_col=[0])
df.index = pd.to_datetime(df.index, unit='s')
100 loops, best of 3: 1.69 ms per loop
</code></pre>
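<p>So even on this small dataset, using <code>to_datetime</code> is more than 2x faster (1.69 ms versus 4.07 ms), and I expect it to scale much better than the <code>date_parser</code> approach.</p>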
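<p>If you want to check the scaling claim yourself, a rough sketch along these lines may help. It is not the benchmark from the timings in this answer: the data is synthetic (100,000 evenly spaced timestamps, an arbitrary choice), and the per-element function uses <code>datetime.fromtimestamp</code> as a stand-in for any parser that converts one value at a time in Python.</p>

```python
import timeit
from datetime import datetime, timezone

import numpy as np
import pandas as pd

# Synthetic epoch seconds (assumption: 100,000 values spaced 0.6 s apart).
stamps = 1449054136.83 + 0.6 * np.arange(100_000)


def per_element(values):
    # One Python-level conversion per value, similar in spirit to a custom parser.
    return pd.DatetimeIndex(
        [datetime.fromtimestamp(v, tz=timezone.utc) for v in values]
    )


def vectorized(values):
    # A single vectorized call over the whole array.
    return pd.to_datetime(values, unit="s", utc=True)


for func in (per_element, vectorized):
    total = timeit.timeit(lambda: func(stamps), number=3)
    print(f"{func.__name__}: {total / 3:.4f} s per run")
```

<p>The absolute numbers depend on your machine and pandas version, so compare the ratio between the two rather than the raw times.</p>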