genfromtxt behaves strangely when dtypes are specified individually

Posted 2024-10-05 14:25:36


I am currently loading the following file:

1400,,,20011011000,1,07,08332,8,2,,,,1,9,21,36,39,53,68,95,,,,,0,8,,,
1400,,,20011011000,2,07,08222,11,1,,,1,1,2,12,13,21,48,112,,,,,0,11,,,
1400,,,20011011001,1,07,08,24,0,0,,,0,1,3,7,2,3,3,5,,,,,,0,0,,,
1400,,,20011011001,2,07,08,14,0,0,,,,0,0,0,3,1,4,0,6,,,,,0,0,,,
1400,,,20011011002,1,07,08,0,0,0,,,,0,0,0,0,0,0,0,0,0,0,,,,,,0,0,,,
1402,,,2001101,I25,1,07,08,0,0,0,,,,0,0,0,0,0,0,0,0,,,,,,0,0,,,
1401,,,2001101,I26,2,07,08,0,0,0,,,,0,0,0,0,0,0,0,0,,,,,,0,0

All columns should be int, except column 6 (values such as 1000, I25), which I set to a string. I load the file as follows:

data = np.genfromtxt(sys.argv[1], dtype=(int,int,int,int,int,"|S25",int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int), skip_header=1, delimiter=",")

I have to do this because otherwise it treats everything as int and sets column 6 to -1.
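The -1 values can be reproduced with a small sketch. The three-column sample below is hypothetical and only stands in for the real file; it assumes genfromtxt's default behaviour of substituting its integer fill value for fields it cannot parse as int.

```python
import io
import numpy as np

# Hypothetical sample: the middle column mixes numbers and codes such as I25.
sample = "1400,I25,7\n1402,I26,8\n1400,1000,9\n"

# A single dtype for every column gives an ordinary 2-D integer array.
flat = np.genfromtxt(io.StringIO(sample), dtype=int, delimiter=",")

print(flat.shape)   # two dimensions: (rows, columns)
print(flat[:, 1])   # entries that are not valid integers come back as the fill value
```

Because the result is a plain 2-D array here, column slicing such as flat[:, 0] is valid.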

Then I set up a mask so that only the rows whose first column is 1400 are printed:

mask_country = (data[:,0] == 1400)

However, this produces an error:

Traceback (most recent call last):
  File "Python/iw2.py", line 14, in <module>
    mask_country = (data[:,0] == 1400)
IndexError: too many indices

This is strange, because if I remove dtype=() from the genfromtxt line, or specify a single type for all columns, as in dtype=int, it runs fine.

Why does specifying the column dtypes individually cause this error?

If I don't set the mask, I can print data, and it appears to be set up correctly; the last row looks like this:

(1401, -1, -1, 2001, 101, 'I26', 2, 7, 8, 0, 0, 0, -1, -1, -1, 0, -1, 0, -1, 0, -1, 0, -1, 0, -1, 0, -1, 0, -1, 0, -1, -1, -1, -1, 0, 0, -1, -1, -1)]
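
One likely explanation, sketched under assumptions: when dtype is a sequence of types, genfromtxt builds a 1-D structured array with one record per row, so there is no second axis for data[:,0] to index. The sample data below is hypothetical, "U25" stands in for "|S25" so that the strings print as text, and f0 is the field name NumPy assigns by default when no names are given.

```python
import io
import numpy as np

# Hypothetical three-column sample standing in for the real file.
sample = "1400,I25,7\n1402,I26,8\n1400,1000,9\n"

# One type per column -> 1-D structured array with fields f0, f1, f2.
data = np.genfromtxt(io.StringIO(sample), dtype=(int, "U25", int), delimiter=",")

print(data.ndim, data.shape)   # a single axis: one record per row
print(data.dtype.names)        # the automatically generated field names

# data[:, 0] would raise IndexError here; select the first field by name instead.
mask_country = data["f0"] == 1400
print(data[mask_country])
```

Passing names= to genfromtxt should give the fields more descriptive labels than the default f0, f1, and so on.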
