Specifying dtype float32 with pandas read_csv on pandas 0.10.1

>>> cat test.out
a b
0.76398 0.81394
0.32136 0.91063
>>> import pandas
>>> import numpy
>>> x = pandas.read_csv('test.out', dtype={'a': numpy.float32}, delim_whitespace=True)
>>> x
         a        b
0  0.76398  0.81394
1  0.32136  0.91063
>>> x.a.dtype
dtype('float64')

>>> !uname -a
Linux ubuntu 3.0.0-13-generic #22-Ubuntu SMP Wed Nov 2 13:25:36 UTC 2011 i686 i686 i386 GNU/Linux
>>> import platform
>>> platform.architecture()
('32bit', 'ELF')
>>> pandas.__version__
'0.10.1'

2 answers

User

Answer 1 · edited 2024-09-29 00:18:15

In [22]: df.a.dtype = pd.np.float32

In [23]: df.a.dtype
Out[23]: dtype('float32')

With pandas 0.10.1, the approach above works fine for me.
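A caveat worth checking before relying on this: on a plain NumPy array, assigning to `.dtype` reinterprets the existing bytes rather than converting the values, and in recent pandas releases `Series.dtype` is read-only and `pd.np` is no longer available. The sketch below uses NumPy only (the sample values are taken from the question) to contrast reinterpretation with an actual conversion via `astype`. It is an illustration under those assumptions, not a statement about what pandas 0.10.1 did internally.

```python
import numpy as np

# Sample values from column "a" in the question.
values = np.array([0.76398, 0.32136], dtype=np.float64)

# astype allocates a new array and converts each value to float32.
converted = values.astype(np.float32)

# view reinterprets the same 16 bytes as four float32 numbers,
# which is what assigning to .dtype amounts to on a NumPy array.
# The original values are not preserved.
reinterpreted = values.view(np.float32)

print(converted.dtype, converted.shape)
print(reinterpreted.dtype, reinterpreted.shape)
```

If the goal is a float32 column holding the same numbers, `astype` is the safer choice.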

User

Answer 2 · edited 2024-09-29 00:18:15

pandas 0.10.1 does not really support float32.

See http://pandas.pydata.org/pandas-docs/dev/whatsnew.html#dtype-specification

In 0.11 you can do this:

# don't use dtype converters explicitly for the columns you care about
# they will be converted to float64 if possible, or object if they cannot
df = pd.read_csv('test.csv'.....)

#### this is optional and related to the issue you posted ####
# force anything that is not numeric to NaN
# columns is the list of columns that you are interested in
df[columns] = df[columns].convert_objects(convert_numeric=True)

# astype
df[columns] = df[columns].astype('float32')

See http://pandas.pydata.org/pandas-docs/dev/basics.html#object-conversion

It's not as efficient as doing it directly in read_csv (but that requires some low-level changes).
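For readers on a current pandas release, here is a minimal sketch of the same load-then-convert idea. It assumes `pd.to_numeric(..., errors="coerce")` as a stand-in for `convert_objects(convert_numeric=True)`, which is not available in newer versions. The inline data and the `"oops"` cell are hypothetical, added only to show the coercion to NaN.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical sample data; "oops" stands in for a non-numeric cell.
data = "a b\n0.76398 0.81394\noops 0.91063\n"

df = pd.read_csv(io.StringIO(data), sep=r"\s+")
columns = ["a"]  # the columns you are interested in

for col in columns:
    # Coerce anything non-numeric to NaN, then downcast to float32.
    df[col] = pd.to_numeric(df[col], errors="coerce").astype("float32")

print(df.dtypes)
```

Column `b` is left untouched and keeps the default float64 that read_csv inferred.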

I have confirmed that passing dtype to read_csv does work with 0.11-dev (same result on both 32-bit and 64-bit):

In [5]: x = pd.read_csv(StringIO.StringIO(data), dtype={'a': np.float32}, delim_whitespace=True)

In [6]: x
Out[6]: 
         a        b
0  0.76398  0.81394
1  0.32136  0.91063

In [7]: x.dtypes
Out[7]: 
a    float32
b    float64
dtype: object

In [8]: pd.__version__
Out[8]: '0.11.0.dev-385ff82'

In [9]: quit()
vagrant@precise32:~/pandas$ uname -a
Linux precise32 3.2.0-23-generic-pae #36-Ubuntu SMP Tue Apr 10 22:19:09 UTC 2012 i686 i686 i386 GNU/Linux
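The session above is Python 2 (`StringIO.StringIO`) and does not show how `data` was defined. A rough Python 3 equivalent is sketched below, assuming `data` holds the contents of `test.out` from the question, and using `sep=r"\s+"` in place of `delim_whitespace=True`, which newer pandas versions deprecate.

```python
import io

import numpy as np
import pandas as pd

# Assumed to match test.out from the question.
data = "a b\n0.76398 0.81394\n0.32136 0.91063\n"

# Request float32 for column "a" only; "b" falls back to the default float64.
x = pd.read_csv(io.StringIO(data), dtype={"a": np.float32}, sep=r"\s+")

print(x.dtypes)
```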
