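<p>I figured it out myself. numba could not determine the type of the result of <code>np.max(accmap)</code>, even when the type of accmap was set to int. That somehow slowed everything down, but the fix is easy: declare <code>reslen</code> as an unsigned integer through the <code>locals</code> argument:</p>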
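<pre><code>import numpy as np
from numba import autojit, uint

# Declare reslen explicitly, because numba cannot infer the type of np.max(accmap)
@autojit(locals=dict(reslen=uint))
def sum_accum(accmap, a):
    reslen = np.max(accmap) + 1
    res = np.zeros(reslen, dtype=a.dtype)
    # Add each a[i] into the output slot selected by accmap[i]
    for i in range(len(accmap)):
        res[accmap[i]] += a[i]
    return res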
</code></pre>
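<p>To check the compiled function's output, it can help to have a reference for what the loop computes. The following is a minimal pure-Python sketch (no NumPy or numba); the name <code>sum_accum_reference</code> and the sample data are made up for illustration and are not part of the original code:</p>

```python
def sum_accum_reference(accmap, a):
    """Hypothetical reference: res[k] is the sum of a[i] over all i with accmap[i] == k."""
    # The output needs one slot per index value, from 0 up to the largest index
    reslen = max(accmap) + 1
    res = [0] * reslen
    # Same accumulation loop as the numba version, on plain Python lists
    for i in range(len(accmap)):
        res[accmap[i]] += a[i]
    return res


# Illustrative data: slots 0 and 2 of `values` both map to output index 0
accmap = [0, 1, 0, 2]
values = [1.0, 2.0, 3.0, 4.0]
print(sum_accum_reference(accmap, values))  # prints [4.0, 2.0, 4.0]
```

<p>Comparing this against <code>sum_accum</code> on a small input is a cheap sanity check before timing anything.</p>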
<p>The result is quite impressive: about 2/3 of the C version.</p>