Count of each element in a 2D NumPy array

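User

#1 · edited 2024-10-06 12:25:54

One approach is to use numpy.unique to extract the value counts.

Then convert the result to a dictionary and use numpy.vectorize to apply that dictionary as a mapping over the array.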

import numpy as np

A = np.array([[2,2,3,3],
              [2,3,3,3],
              [3,3,4,4]])

d = dict(zip(*np.unique(A.ravel(), return_counts=True)))

res = np.vectorize(d.get)(A)

array([[3, 3, 7, 7],
       [3, 7, 7, 7],
       [7, 7, 2, 2]], dtype=int64)
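To make the two steps easier to follow, here is a minimal sketch that splits the one-liner into its intermediate values. The names `values` and `counts` are chosen here for illustration and are not part of the original answer.

```python
import numpy as np

A = np.array([[2, 2, 3, 3],
              [2, 3, 3, 3],
              [3, 3, 4, 4]])

# Step 1: sorted unique values and how often each one occurs
values, counts = np.unique(A.ravel(), return_counts=True)
print(values)  # [2 3 4]
print(counts)  # [3 7 2]

# Step 2: build a value -> count mapping and apply it element by element
d = dict(zip(values, counts))
res = np.vectorize(d.get)(A)
print(res)
```

Note that np.vectorize is essentially a Python-level loop, so this is convenient rather than fast.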

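Performance

I see the approach above taking ~2 s for a 2000x2000 array, while a dictionary-based approach takes ~3 s. However, the pure NumPy solutions from PaulPanzer and {a2} are still faster.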
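A rough way to reproduce such a comparison is sketched below. This is not the original benchmark: the function names, the random test data, and the 500x500 size are assumptions made here to keep the run short, and absolute timings will vary by machine and NumPy version.

```python
import timeit
import numpy as np

def count_vectorize(a):
    # np.unique counts mapped back through a dict with np.vectorize
    d = dict(zip(*np.unique(a.ravel(), return_counts=True)))
    return np.vectorize(d.get)(a)

def count_inverse(a):
    # np.unique with return_inverse, indexing the counts directly
    _, inv, cts = np.unique(a, return_inverse=True, return_counts=True)
    return cts[inv].reshape(a.shape)

def count_bincount(a):
    # np.bincount lookup table for integer data, shifted to start at 0
    mn = a.min()
    shifted = a - mn
    table = np.bincount(shifted.ravel(), minlength=int(a.max() - mn) + 1)
    return table[shifted]

# Hypothetical test data: small integers in a 500x500 array
rng = np.random.default_rng(0)
A = rng.integers(0, 10, size=(500, 500))

results = {}
for fn in (count_vectorize, count_inverse, count_bincount):
    results[fn.__name__] = fn(A)
    per_call = timeit.timeit(lambda: fn(A), number=3) / 3
    print(f"{fn.__name__}: {per_call:.4f} s per call")
```

All three functions should return identical arrays, which is worth checking before comparing their speed.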


User

#2 · edited 2024-10-06 12:25:54

I believe you should be able to do this with the return_inverse option of np.unique():

If True, also return the indices of the unique array (for the specified axis, if provided) that can be used to reconstruct ar.

>>> import numpy as np

>>> a = np.array([[2,2,3,3],
...               [2,3,3,3],
...               [3,3,4,4]])

>>> _, inv, cts = np.unique(a, return_inverse=True, return_counts=True)
>>> cts[inv].reshape(a.shape)

array([[3, 3, 7, 7],
       [3, 7, 7, 7],
       [7, 7, 2, 2]])

This also works when the flattened array is not sorted, for example b = np.array([[1, 2, 4], [4, 4, 1]]).
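As a quick check of the unsorted case, here is the same technique applied to b. The names `uniq` and `res_b` are introduced here for illustration only.

```python
import numpy as np

b = np.array([[1, 2, 4],
              [4, 4, 1]])

# inv holds, for every element of b, the index of its value in uniq
uniq, inv, cts = np.unique(b, return_inverse=True, return_counts=True)

# Indexing the counts with inv gives each element's count; the reshape
# restores the 2D layout in case inv comes back flattened
res_b = cts[inv].reshape(b.shape)
print(res_b)
# [[2 1 3]
#  [3 3 2]]
```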

User

#3 · edited 2024-10-06 12:25:54

Here is an approach that takes advantage of the values being ints:

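import numpy as np

MAX_LOOKUP = 2**24  # upper bound on the size of the lookup table

def f_pp(a):
    mn, mx = a.min(), a.max()
    span = mx - mn + 1  # number of slots needed to cover [mn, mx]
    if span > MAX_LOOKUP:
        raise RuntimeError('values spread too wide')
    a = a - mn  # shift so the smallest value maps to index 0
    # bincount builds the count table; indexing it with a maps counts back
    return np.bincount(a.ravel(), minlength=span)[a]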
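A short usage sketch follows, with f_pp repeated so the snippet runs on its own. The input array `x` is made up here to show that subtracting the minimum lets the lookup handle negative integers as well.

```python
import numpy as np

MAX_LOOKUP = 2**24

def f_pp(a):
    mn, mx = a.min(), a.max()
    span = mx - mn + 1
    if span > MAX_LOOKUP:
        raise RuntimeError('values spread too wide')
    a = a - mn
    return np.bincount(a.ravel(), minlength=span)[a]

# Hypothetical input containing a negative value
x = np.array([[-1, -1, 5],
              [ 5,  5, 0]])

out = f_pp(x)
print(out)
# [[2 2 3]
#  [3 3 1]]
```

The lookup table has one slot per integer between the minimum and maximum, so memory grows with the value range rather than with the number of distinct values; that is what the MAX_LOOKUP guard protects against.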

The timings are largely based on @jpp's work.
