numpy's loadtxt fails when converting to int if the string contains a decimal point

import numpy as np
from io import StringIO

in1 = StringIO("123 456 789\n231 543 876")
a = np.loadtxt(in1, dtype=[('x', "int"), ('y', "int"), ('z', "int")])

#### output
array([(123, 456, 789), (231, 543, 876)],
      dtype=[('x', '<i8'), ('y', '<i8'), ('z', '<i8')])

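2 answers

User

Answer 1 · edited 2024-10-01 13:44:21

No need to edit anything by hand. Read the column that contains the decimal as float, then cast the whole array to the integer dtype with astype: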

>>> in2 = StringIO("123 456 789\n231 543.0 876")
>>> dt_temp = np.dtype([('x', "int"), ('y', "float"), ('z', "int")])
>>> a = np.loadtxt(in2, dtype=dt_temp)
>>> 
>>> dt = np.dtype([('x', "int"), ('y', "int"), ('z', "int")])
>>> b = a.astype(dt)
>>> b
array([(123, 456, 789), (231, 543, 876)], 
      dtype=[('x', '<i8'), ('y', '<i8'), ('z', '<i8')])
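
If you don't know in advance which columns contain a decimal point, the same idea can be generalized by reading every integer field as float first. A minimal sketch, assuming Python 3; the helper name `load_as_int_fields` is made up for illustration and is not part of numpy:

```python
from io import StringIO

import numpy as np


def load_as_int_fields(source, target_dtype):
    """Load text with loadtxt, tolerating values like '543.0' in integer fields.

    Hypothetical helper: every integer field of target_dtype is first read
    as float64, then the whole array is cast to target_dtype.
    """
    target = np.dtype(target_dtype)
    # Temporary dtype: same field names, integer fields widened to float64.
    temp = np.dtype([
        (name, np.float64 if np.issubdtype(target[name], np.integer) else target[name])
        for name in target.names
    ])
    raw = np.loadtxt(source, dtype=temp)
    # astype uses casting='unsafe' by default, so float -> int truncates.
    return raw.astype(target)


in2 = StringIO("123 456 789\n231 543.0 876")
b = load_as_int_fields(in2, [('x', "int"), ('y', "int"), ('z', "int")])
```

Note that this goes through float64, so integers larger than 2**53 would lose precision, and a value such as 543.7 is silently truncated to 543.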

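User

Answer 2 · edited 2024-10-01 13:44:21

For fields that should be integers, you can use a converter of the form int(float(fieldval)). The session below shows one way to build the converters argument of loadtxt programmatically from the dtype: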

In [77]: in3 = StringIO("123.0 456 789 0.95\n231 543.0 876 0.87")

In [78]: dt = np.dtype([('x', "int"), ('y', "int"), ('z', "int"), ('r', "float")])

In [79]: converters = dict((k, lambda s: int(float(s))) for k in range(len(dt)) if np.issubdtype(dt[k], np.integer))

In [80]: converters
Out[80]: 
{0: <function __main__.<lambda>>,
 1: <function __main__.<lambda>>,
 2: <function __main__.<lambda>>}

In [81]: a = np.loadtxt(in3, dtype=dt, converters=converters)

In [82]: a
Out[82]: 
array([(123, 456, 789, 0.95), (231, 543, 876, 0.87)], 
      dtype=[('x', '<i8'), ('y', '<i8'), ('z', '<i8'), ('r', '<f8')])

Even so, you may still run into performance or memory problems when using loadtxt on a 2 GB file. Have you looked at pandas? Its CSV reader is much faster than numpy's.
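
For reference, a rough sketch of the pandas route, assuming whitespace-separated data and the same column names as in the example above (the names and the choice of int64 are assumptions, adjust them to your file):

```python
from io import StringIO

import numpy as np
import pandas as pd

text = "123.0 456 789 0.95\n231 543.0 876 0.87"

# sep=r"\s+" splits on runs of whitespace; header=None because the data has no header row.
df = pd.read_csv(StringIO(text), sep=r"\s+", header=None, names=["x", "y", "z", "r"])

# Columns containing values like "543.0" are parsed as float64; cast them back to integers.
int_cols = ["x", "y", "z"]
df = df.astype({col: np.int64 for col in int_cols})

# Optional: convert to a numpy record array comparable to the loadtxt result.
records = df.to_records(index=False)
```

For a file that does not fit comfortably in memory, read_csv also accepts a chunksize argument so the data can be processed piece by piece.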
