<p><strong>方法1:使用NumPy ufunc reduceat</strong></p>
<p>我们有<a href="https://docs.scipy.org/doc/numpy/reference/ufuncs.html" rel="nofollow noreferrer">^{<cd1>}</a>来进行这三个约化操作,幸运的是,我们还有{a2}来执行这些限制在轴上的特定间隔的缩减。所以,使用这些,我们可以像这样计算这三个运算-</p>
<pre><code># Gives us sorted array based on input indices I and indices at which the
# sorted array should be interval-limited for reduceat operations to be
# applied later on using those results
def sorted_array_intervals(A, I):
# Compute sort indices for I. To be later used for sorting A based on it.
sidx = I.argsort()
sI = I[sidx]
sortedA = A[sidx]
# Get indices at which intervals change. Also, get count in each interval
idx = np.r_[ 0, np.flatnonzero(sI[1:] != sI[:-1])+1 ]
return sortedA, idx
# Groupby sum reduction using the interval indices
# to perform interval-limited ufunc reductions
def groupby_sum(A, I):
sortedA, idx = sorted_array_intervals(A,I)
return np.add.reduceat(sortedA, idx, axis=0)
# Groupby mean reduction
def groupby_mean(A, I):
sortedA, idx = sorted_array_intervals(A,I)
sums = np.add.reduceat(sortedA, idx, axis=0)
count = np.r_[idx[1:] - idx[:-1], A.shape[0] - idx[-1]]
return sums/count.astype(float)[:,None]
# Groupby max reduction
def groupby_max(A, I):
sortedA, idx = sorted_array_intervals(A,I)
return np.maximum.reduceat(sortedA, idx, axis=0)
</code></pre>
<p>因此,如果我们需要所有这些操作,我们可以对<code>sorted_array_intervals</code>的一个实例进行重用,如下-</p>
^{pr2}$
<p><strong>方法1-B:混合版本(排序+切片+减少)</strong></p>
<p>这是一个混合版本,它确实需要<code>sorted_array_intervals</code>的帮助来获得排序的数组和间隔变为下一个组的索引,但在最后一个阶段使用切片来求每个间隔的和,并对每个组进行迭代。在我们使用<code>views</code>时,切片在这里有帮助。在</p>
<p>实现应该是这样的-</p>
<pre><code>def reduce_hybrid(A, I, op="avg"):
sidx = I.argsort()
sI = I[sidx]
sortedA = A[sidx]
# Get indices at which intervals change. Also, get count in each interval
idx = np.r_[ 0, np.flatnonzero(sI[1:] != sI[:-1])+1 ]
unq_sI = sI[idx]
m = I.max()+1
N = A.shape[1]
target_dtype = (np.float if op=="avg" else A.dtype)
out = np.zeros((m,N),dtype=target_dtype)
ss_idx = np.r_[idx,A.shape[0]]
npop = {"avg": np.mean, "sum": np.sum, "max": np.max}.get(op)
for i in range(len(idx)):
out[unq_sI[i]] = npop(sortedA[ss_idx[i]:ss_idx[i+1]], axis=0)
return out
</code></pre>
<p>运行时测试(使用问题中发布的基准测试的设置)</p>
<pre><code>In [432]: d = 500000
...: f = 500
...: n = 500
...: I = np.hstack((np.arange(n), np.random.randint(n, size=(d - n,))))
...: np.random.shuffle(I)
...: A = np.random.rand(d, f)
...:
In [433]: %timeit reduce_naive(A, I, op="sum")
...: %timeit reduce_hybrid(A, I, op="sum")
...:
1 loops, best of 3: 1.03 s per loop
1 loops, best of 3: 549 ms per loop
In [434]: %timeit reduce_naive(A, I, op="avg")
...: %timeit reduce_hybrid(A, I, op="avg")
...:
1 loops, best of 3: 1.04 s per loop
1 loops, best of 3: 550 ms per loop
In [435]: %timeit reduce_naive(A, I, op="max")
...: %timeit reduce_hybrid(A, I, op="max")
...:
1 loops, best of 3: 1.14 s per loop
1 loops, best of 3: 631 ms per loop
</code></pre>
<hr/>
<p><strong>方法2:使用NumPy bincount</strong></p>
<p>这是另一种使用<a href="https://docs.scipy.org/doc/numpy/reference/generated/numpy.bincount.html" rel="nofollow noreferrer">^{<cd6>}</a>进行基于bin的求和的方法。所以,有了它,我们可以计算出总和和平均值,也可以避免在这个过程中排序,就像这样-</p>
<pre><code>ids = (I[:,None] + (I.max()+1)*np.arange(A.shape[1])).ravel()
sums = np.bincount(ids, A.ravel()).reshape(A.shape[1],-1).T
avgs = sums/np.bincount(ids).reshape(A.shape[1],-1).T
</code></pre>