How can I efficiently iterate over a NumPy array to find patterns under certain conditions?

import numpy as np

arr = np.array([0,0,1,0,1,0,0,0,0,1,1,1,0,0,0,0,0,1,0,0,0,1,0,0,1,1,1,1,0,1,0,1,0,0,0,0,0,1,0,1,1,0,1,1,0,1,1,0,0,0])
patternFound = False
threshold = 3
nonzerosCount = 0
zerosCount = 0
split_indexes = []

for i in range(len(arr)):
    if patternFound:
        # After a run of ones, count consecutive zeros.
        if arr[i] <= 0:
            zerosCount += 1
        else:
            zerosCount = 0

        # Split after `threshold` zeros, unless that is the end of the array.
        if zerosCount >= threshold and i+1 != len(arr):
            zerosCount = 0
            patternFound = False
            split_indexes.append(i+1)
    else:
        # Look for `threshold` consecutive nonzero values.
        if arr[i] >= 1:
            nonzerosCount += 1
        else:
            nonzerosCount = 0

        if nonzerosCount >= threshold:
            nonzerosCount = 0
            patternFound = True

print("Indexes:", split_indexes)
print("Split:", end=" ")
# Use a separate loop variable so that `arr` is not overwritten.
for part in np.split(arr, split_indexes):
    print(part, ",", end=" ")

2 Answers

User

#1 · Edited 2024-10-04 01:30:33

I'm not sure whether this helps with speed, but you could try:

np.logical_or(np.logical_or(arr[:-2], arr[1:-1]), arr[2:])

to detect 3 consecutive zeros (look for False in the result),

and

np.logical_and(np.logical_and(arr[:-2], arr[1:-1]), arr[2:])

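to detect 3 consecutive ones (look for True in the result). In both cases, element i of the result describes the window arr[i:i+3].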
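As a rough sketch of how these two window masks could be turned into the split indexes from the question: the function name `find_split_indexes` is hypothetical, the threshold is fixed at 3 to match the expressions above, and the input is assumed to hold only 0/1 (or boolean) values. `np.flatnonzero` and `np.searchsorted` are used to jump between run positions, so the remaining Python loop runs once per split rather than once per element.

```python
import numpy as np


def find_split_indexes(arr):
    """Split indexes for the question's pattern, with threshold fixed at 3.

    Assumes `arr` contains only 0/1 (or boolean) values.
    """
    arr = np.asarray(arr) != 0

    # Element i of each mask describes the window arr[i:i+3].
    any_nonzero = np.logical_or(np.logical_or(arr[:-2], arr[1:-1]), arr[2:])
    all_nonzero = np.logical_and(np.logical_and(arr[:-2], arr[1:-1]), arr[2:])

    one_starts = np.flatnonzero(all_nonzero)    # windows of three ones
    zero_starts = np.flatnonzero(~any_nonzero)  # windows of three zeros

    split_indexes = []
    pos = 0  # counting restarts here after each split
    while True:
        # First window of three ones that starts at or after `pos`.
        k = np.searchsorted(one_starts, pos)
        if k == len(one_starts):
            break
        after_ones = one_starts[k] + 3

        # First window of three zeros that starts after those ones.
        m = np.searchsorted(zero_starts, after_ones)
        if m == len(zero_starts):
            break
        pos = zero_starts[m] + 3

        # Like the original loop, do not split at the very end of the array.
        if pos == len(arr):
            break
        split_indexes.append(int(pos))
    return split_indexes


arr = np.array([0,0,1,0,1,0,0,0,0,1,1,1,0,0,0,0,0,1,0,0,0,1,0,0,1,
                1,1,1,0,1,0,1,0,0,0,0,0,1,0,1,1,0,1,1,0,1,1,0,0,0])
print(find_split_indexes(arr))  # [15, 35]
```

The result can be passed to `np.split(arr, split_indexes)` in the same way as in the question. Whether this is actually faster than the plain loop depends on the data, so it is worth timing on a representative array.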

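User

#2 · Edited 2024-10-04 01:30:33

You can use Pythran to automatically convert the code into an efficient native version (explicitly iterating over the elements of a NumPy array is a performance bottleneck).

For example: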

#pythran export pattern(bool [])
import numpy as np
def pattern(arr):
    patternFound = False
    threshold = 3
    nonzerosCount = 0
    zerosCount = 0
    split_indexes=[]

    for i in range(len(arr)):
        if patternFound:
            if arr[i] <= 0:
                zerosCount += 1
            else:
                zerosCount = 0

            if zerosCount >= threshold and i+1 != len(arr):
                zerosCount = 0
                patternFound=False
                split_indexes.append(i+1)
        else:
            if arr[i] >= 1:
                nonzerosCount += 1
            else:
                nonzerosCount = 0

            if nonzerosCount >= threshold:
                nonzerosCount = 0
                patternFound = True
    split_indexes = np.asarray(split_indexes)
    return split_indexes, np.split(arr, split_indexes)

Compile it with pythran pattern.py. It works well.

Without Pythran:

% python -m timeit -s 'import pattern, numpy; arr = numpy.asarray(numpy.random.choice([0, 1], size=1000000), dtype=bool)' 'pattern.pattern(arr)'
10 loops, best of 3: 3.11 sec per loop

With Pythran (note that the setup below uses size=100000, ten times smaller than in the run above):

% python -m timeit -s 'import pattern, numpy; arr = numpy.asarray(numpy.random.choice([0, 1], size=100000), dtype=bool)' 'pattern.pattern(arr)'
1000 loops, best of 3: 880 usec per loop
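For anyone who prefers to measure from inside a script rather than from the shell, here is a minimal sketch of the uncompiled baseline using the standard `timeit` module. The function below is a condensed stand-in for the loop above that returns only the split indexes; the array size, the seed, and the `threshold` parameter are assumptions chosen for illustration, so the numbers will not match the timings quoted here.

```python
import timeit

import numpy as np


def pattern(arr, threshold=3):
    """Plain-Python version of the loop; returns only the split indexes."""
    pattern_found = False
    nonzeros_count = 0
    zeros_count = 0
    split_indexes = []
    for i in range(len(arr)):
        if pattern_found:
            zeros_count = zeros_count + 1 if arr[i] <= 0 else 0
            if zeros_count >= threshold and i + 1 != len(arr):
                zeros_count = 0
                pattern_found = False
                split_indexes.append(i + 1)
        else:
            nonzeros_count = nonzeros_count + 1 if arr[i] >= 1 else 0
            if nonzeros_count >= threshold:
                nonzeros_count = 0
                pattern_found = True
    return split_indexes


# Assumed test data: 100000 random booleans with a fixed seed.
rng = np.random.default_rng(0)
data = rng.integers(0, 2, size=100_000).astype(bool)

# Best of 3 single calls, similar in spirit to `python -m timeit`.
best = min(timeit.repeat(lambda: pattern(data), number=1, repeat=3))
print(f"best of 3: {best:.4f} s per call")
```

Running the same harness against the compiled module, with the same array size in both cases, gives a like-for-like comparison.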
