Avoiding a Python for loop to improve performance when counting with numpy unique

Posted 2024-09-28 12:14:22


I have two numpy arrays: A with shape (N, 3) and B with shape (N,). From A I generate an array containing only its unique rows, for example:

import numpy as np

A = np.array([[1., 2., 3.],
              [4., 5., 6.],
              [1., 2., 3.],
              [7., 8., 9.]])

B = np.array([10., 33., 15., 17.])

AUnique, directInd, inverseInd, counts = np.unique(A,
                                                   return_index=True,
                                                   return_inverse=True,
                                                   return_counts=True,
                                                   axis=0)

So AUnique will be array([[1., 2., 3.], [4., 5., 6.], [7., 8., 9.]]).
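To make the other three return values concrete, here is a small sketch that prints them for the same example. The `np.ravel` call is a defensive assumption on my part and not something from the original post: as far as I know, at least one NumPy release returned the inverse with an extra dimension when `axis` is given, and flattening is harmless otherwise.

```python
import numpy as np

A = np.array([[1., 2., 3.],
              [4., 5., 6.],
              [1., 2., 3.],
              [7., 8., 9.]])

AUnique, directInd, inverseInd, counts = np.unique(
    A, return_index=True, return_inverse=True, return_counts=True, axis=0)

# Defensive flatten (assumption: the inverse may carry an extra axis in some versions).
inverseInd = np.ravel(inverseInd)

print(directInd)   # [0 1 3]   index in A of the first occurrence of each unique row
print(inverseInd)  # [0 1 0 2] for each row of A, its position in AUnique
print(counts)      # [2 1 1]   how many times each unique row occurs in A

# Indexing AUnique with inverseInd rebuilds the original A.
assert np.array_equal(AUnique[inverseInd], A)
```

The key piece for the rest of the question is `inverseInd`: it is a group label for every row of A, so any per-group reduction over B can be driven by it.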

Then I build the analogous vector B associated with AUnique: for each row of A that occurs more than once, I sum the corresponding values of B, i.e.:

BNew = B[directInd]

# here BNew is [10., 33., 17.]

# for every unique row that occurs more than once, sum all matching entries of B
for Id in np.asarray(counts > 1).nonzero()[0]:
    BNew[Id] = np.sum(B[inverseInd == Id])

# here BNew is [25., 33., 17.]

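The problem is that for large N (millions or tens of millions of rows) the for loop becomes very slow. Is there a way to avoid the loop and/or make the code faster?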

Thanks in advance.


1 answer

User

#1 · Posted 2024-09-28 12:14:22

I think np.bincount does what you want:

BNew = np.bincount(inverseInd, weights = B)
BNew

Out[]: array([25., 33., 17.])
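The reason this works: with `weights`, np.bincount adds `B[i]` into output slot `inverseInd[i]`, which is the same per-group sum the loop computes, but done in one pass in compiled code. The sketch below checks the two approaches against each other on random data. The helper names `group_sum_loop` and `group_sum_bincount`, the array size, and the seed are illustrative assumptions, not part of the original post.

```python
import numpy as np

def group_sum_loop(B, directInd, inverseInd, counts):
    """Loop-based version from the question (hypothetical helper name)."""
    BNew = B[directInd]  # fancy indexing returns a copy, so B is not modified
    for Id in np.nonzero(counts > 1)[0]:
        BNew[Id] = np.sum(B[inverseInd == Id])
    return BNew

def group_sum_bincount(B, inverseInd, n_groups):
    """Vectorized version: sum B within each group labelled by inverseInd."""
    return np.bincount(inverseInd, weights=B, minlength=n_groups)

rng = np.random.default_rng(0)
N = 10_000
# Small integer range so that many rows repeat.
A = rng.integers(0, 5, size=(N, 3)).astype(float)
B = rng.random(N)

AUnique, directInd, inverseInd, counts = np.unique(
    A, return_index=True, return_inverse=True, return_counts=True, axis=0)
inverseInd = np.ravel(inverseInd)  # defensive flatten (assumption, see lead-in)

loop_result = group_sum_loop(B, directInd, inverseInd, counts)
fast_result = group_sum_bincount(B, inverseInd, len(AUnique))

# Same values up to floating-point summation order.
assert np.allclose(loop_result, fast_result)
```

Two things to keep in mind: when `weights` is given, np.bincount returns a float64 array, and its first argument must be a 1-D array of non-negative integers, which `inverseInd` already is once flattened.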
