Python: fast looping over an np.array

3 answers

User

Answer 1 · edited 2024-09-24 00:35:23

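When you use with open, don't close the file yourself; the with context does that automatically. I also changed the generic name array, so there is less risk of it shadowing something else (np.array, say).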

with open("file.dat", "rb") as f:
    data = np.fromfile(f, dtype=np.float32)

First, there's no need to wrap np.zeros in np.array; it already is an array. {} is fine if data is 1d, but I prefer to work with the shape tuple.

^{pr2}$
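A minimal sketch of that allocation, assuming data is the array read from the file above. The sample values are made up for illustration:

```python
import numpy as np

# Stand-in for the array read from file.dat (values are illustrative only)
data = np.array([250.0, 260.5, -9.99e8], dtype=np.float32)

# np.zeros already returns an ndarray, so no np.array(...) wrapper is needed.
# Passing data.shape works for any number of dimensions, unlike len(data).
T_SLR = np.zeros(data.shape)
```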

Boolean indexing/masking lets you operate on the whole array at once:

mask = data != -9.99e8   # don't need `float` here
                         # using != test with floats is poor idea
data[mask] -= 273.15

That != test needs refining. It works for integers, but not reliably for floats. Something like np.abs(data+9.99e8)>1 is better.
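As a rough sketch of the tolerance-based mask, with invented sample readings and a hypothetical constant MISSING standing in for the -9.99e8 sentinel:

```python
import numpy as np

MISSING = -9.99e8  # hypothetical name for the sentinel used in the original code

# Illustrative readings in Kelvin, with one missing value
data = np.array([273.15, 300.0, MISSING, 250.0], dtype=np.float32)

# True wherever the value is more than 1 away from the sentinel
mask = np.abs(data - MISSING) > 1

# Convert only the real readings from Kelvin to Celsius
data[mask] -= 273.15
```

The threshold of 1 is arbitrary here; anything far smaller than the gap between the sentinel and real temperatures would do.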

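Similarly, in is not a good test with floats. And with integers, in and {} do redundant work.

Assuming temps is 1d, np.where(...) returns a 1-element tuple. [0] selects that element, returning an array. The , in index, = is then redundant. index, = np.where() without the [0] should have worked.

T_SLR[i] is already 0 by virtue of how the array was initialized. There's no need to set it again.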

for i in range(0,len(array)):
    if array[i] in temps:
        index, = np.where(temps==array[i])[0]
        T_SLR = slr[index]
    else:
        T_SLR[i] = 0.00
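To make the tuple point concrete, here is a small sketch with a made-up temps array:

```python
import numpy as np

temps = np.array([-0.03, -0.02, -0.01, 0.0], dtype=np.float32)

result = np.where(temps == temps[2])   # a 1-tuple holding one index array
index, = result                        # unpack the tuple to get the array
same = result[0]                       # equivalent: select element 0 instead

# Combining both forms, index, = np.where(...)[0], unpacks the array itself,
# which only succeeds when there is exactly one match.
```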

But I think we can get rid of this iteration as well. I'll leave that discussion for later.

In [461]: temps=np.arange(-30.00,0.01,0.01, dtype='float32')
In [462]: temps
Out[462]: 
array([ -3.00000000e+01,  -2.99899998e+01,  -2.99799995e+01, ...,
        -1.93138123e-02,  -9.31358337e-03,   6.86645508e-04], dtype=float32)
In [463]: temps.shape
Out[463]: (3001,)

No wonder doing array[i] in temps and {} is slow.

We can drop the in by looking at what where returns:

In [464]: np.where(temps==12.34)
Out[464]: (array([], dtype=int32),)
In [465]: np.where(temps==temps[3])
Out[465]: (array([3], dtype=int32),)

If there is no match, where returns an empty array.

In [466]: idx,=np.where(temps==temps[3])
In [467]: idx.shape
Out[467]: (1,)
In [468]: idx,=np.where(temps==123.34)
In [469]: idx.shape
Out[469]: (0,)
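Put together, the loop body could test the size of the where result instead of calling in first. This is only a sketch with invented temps, slr and data values, not the original poster's data:

```python
import numpy as np

temps = np.array([-0.03, -0.02, -0.01, 0.0], dtype=np.float32)
slr = np.array([3.0, 2.0, 1.0, 0.5])                 # illustrative values
data = np.array([-0.02, 5.0, 0.0], dtype=np.float32)

T_SLR = np.zeros(data.shape)
for i in range(data.shape[0]):
    idx, = np.where(temps == data[i])   # one scan; no separate `in` test
    if idx.size:                        # non-empty means a match was found
        T_SLR[i] = slr[idx[0]]
    # otherwise T_SLR[i] keeps its initial 0
```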

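in can be faster than where if the match is early in the list, but it is just as slow, if not slower, when the match is at the end or there is no match at all.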

In [478]: timeit np.where(temps==temps[-1])[0].shape[0]>0
10000 loops, best of 3: 35.6 µs per loop
In [479]: timeit temps[-1] in temps
10000 loops, best of 3: 39.9 µs per loop

The rounding approach:

In [487]: (np.round(temps,2)/.01).astype(int)
Out[487]: array([-3000, -2999, -2998, ...,    -2,    -1,     0])

I'd suggest tweaking it to:

T_SLR = -(np.round(data, 2)/.01).astype(int)
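One way the rounded integers could serve as a direct index into slr is sketched below. It assumes slr holds one entry per 0.01 step from -30.00 to 0.00 (3001 entries, matching temps); the slr values and the offset of 3000 are illustrative assumptions, not part of the original answer. Unlike the exact-match loop, this rounds every temperature to the nearest hundredth.

```python
import numpy as np

# Assumption: slr holds one value per 0.01-degree step from -30.00 to 0.00
slr = np.linspace(30.0, 0.0, 3001)                  # illustrative values only

data = np.array([-30.0, -12.34, 0.0, 5.0, -45.0])   # temperatures in Celsius

# Hundredths of a degree as integers: -3000 ... 0 inside the table's range
hundredths = np.round(data * 100).astype(int)
index = hundredths + 3000                           # shift so -30.00 maps to 0

# Only in-range temperatures get a table value; the rest stay 0
valid = (index >= 0) & (index < slr.shape[0])
T_SLR = np.zeros(data.shape)
T_SLR[valid] = slr[index[valid]]
```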

User

Answer 2 · edited 2024-09-24 00:35:23

Since temps is sorted, you can use np.searchsorted and avoid all explicit loops:

array[array != float(-9.99e+08)] -= 273.15
indices = np.searchsorted(temps, array)
# Remove indices out of bounds (searchsorted returns len(temps) past the end)
mask = indices < temps.shape[0]
# Remove in-bounds indices not matching exactly
mask[mask] &= temps[indices[mask]] == array[mask]
# Matching entries take their slr value; everything else stays 0
T_SLR = np.zeros(array.shape)
T_SLR[mask] = slr[indices[mask]]
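To see how the pieces behave, here is a small self-contained run of the searchsorted idea. The temps, slr and array values are invented for illustration, and the helper name in_bounds is not from the original answer:

```python
import numpy as np

temps = np.array([-3, -2, -1, 0], dtype=np.float32)   # sorted lookup keys
slr = np.array([30, 20, 10, 5], dtype=np.float32)     # illustrative values
array = np.array([-2, -1.5, 0, 7], dtype=np.float32)  # values to look up

# Insertion positions that keep temps sorted; len(temps) means "past the end"
indices = np.searchsorted(temps, array)

# Keep positions that are in bounds and whose key matches exactly
in_bounds = indices < temps.shape[0]
mask = in_bounds.copy()
mask[in_bounds] = temps[indices[in_bounds]] == array[in_bounds]

# Matched entries take their slr value; unmatched ones stay 0
T_SLR = np.zeros(array.shape, dtype=slr.dtype)
T_SLR[mask] = slr[indices[mask]]
```

Here -1.5 falls between two keys and 7 lies past the end, so both end up as 0.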

User

Answer 3 · edited 2024-09-24 00:35:23

The slowest part of the code is the O(n) traversal of the list:

if array[i] in temps:
    index, = np.where(temps==array[i])[0]

Since temps is not large, you can convert it to a dict:

^{pr2}$

which turns the lookup into O(1):

if array[i] in temps2:
    index = temps2[array[i]]
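A hedged sketch of how such a dict might be built and queried. The dict comprehension is one possible construction, not necessarily the one the answer used, and value is a hypothetical stand-in for array[i]:

```python
import numpy as np

temps = np.arange(-30.00, 0.01, 0.01, dtype='float32')

# Map each temperature to its position in temps (built once, O(n)).
# tolist() converts the float32 values to plain Python floats.
temps2 = {t: i for i, t in enumerate(temps.tolist())}

value = float(temps[3])        # a value known to be in the table
if value in temps2:            # hash lookup, O(1) on average
    index = temps2[value]
```

The keys are still exact floats, so this lookup only matches values that are bit-for-bit equal to an entry of temps.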

You can also try to avoid the for loops to speed things up. For example, the following code:

for i in range(0,len(array)):
    if array[i] != float(-9.99e+08):
        array[i] = array[i] - 273.15

can be written as:

array[array!=float(-9.99e+08)] -= 273.15

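Another problem in the code is the float comparison. You should not use the exact-equality operators == or {}; try numpy.isclose with a tolerance, or convert the floats to int by multiplying by 100.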
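The sketch below illustrates both suggestions on a float32 table queried with a float64 value. The query value and the tolerance of 1e-4 are assumptions chosen for the example:

```python
import numpy as np

temps = np.arange(-30.00, 0.01, 0.01, dtype='float32')
query = np.array([-29.99])   # float64, e.g. the result of float64 arithmetic

# Exact equality: the float32 entry is not bit-identical to the float64 value
exact = np.flatnonzero(temps == query)

# Comparison with an absolute tolerance finds the intended entry
close = np.flatnonzero(np.isclose(temps, query, rtol=0, atol=1e-4))

# Integer hundredths compare exactly
temps_int = np.round(temps * 100).astype(int)
query_int = np.round(query * 100).astype(int)
by_int = np.flatnonzero(temps_int == query_int)
```

With the tolerance or the integer conversion, the query lands on index 1; the exact comparison finds nothing.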
