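<h2>Possible solution</h2>
<p>In principle, you could use <code>numba</code>, since the <code>multinomial</code> distribution is supported.</p>
<p>Numba lets you decorate numpy-based (and, more importantly, plain Python) functions with the <code>numba.njit</code> decorator to get a significant performance boost.</p>
<p><a href="https://numba.pydata.org/numba-doc/dev/reference/numpysupported.html" rel="nofollow noreferrer">Check their documentation</a> for a more detailed look at this approach, especially section <code>2.7.4</code>, since it covers support for <code>np.random</code> (the multinomial distribution is supported as well).</p>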
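<p><strong>Downside</strong>: the <code>size</code> argument is currently not supported. You can instead call <code>np.random.multinomial</code> multiple times in a nested loop, and it should still be faster if the function is decorated with <code>numba.njit</code>.</p>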
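<p>A minimal sketch of that workaround, assuming a hypothetical helper named <code>multinomial_rows</code> (not part of numba or numpy) that emulates <code>size</code> by drawing one sample per loop iteration. The <code>try/except</code> fallback is only an assumption so that the snippet still runs as plain Python where numba is not installed:</p>

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Assumption: without numba, run the same code as plain Python.
    def njit(func):
        return func


@njit
def multinomial_rows(n, pvals, size):
    """Draw `size` multinomial samples, one call per row (hypothetical helper)."""
    out = np.empty((size, pvals.shape[0]), dtype=np.int64)
    for i in range(size):
        # Only the two-argument form is used, so no `size` parameter is needed.
        out[i] = np.random.multinomial(n, pvals)
    return out


samples = multinomial_rows(100, np.array([0.2, 0.3, 0.5]), 4)
print(samples.shape)  # one row per draw, one column per category
```

<p>Each row of <code>samples</code> is an independent draw, so its entries add up to <code>n</code>.</p>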
<p>Last but not least: you can parallelize the outer loop by using <code>numba.prange</code> together with the <code>parallel</code> argument of the decorator mentioned above.</p>
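<p>A rough sketch of what that could look like, assuming each row of an integer array is normalized into a probability vector and sampled once. The function name <code>row_multinomials</code> and the plain-Python fallback are illustrative assumptions, not part of numba:</p>

```python
import numpy as np

try:
    from numba import njit, prange
except ImportError:
    # Assumption: without numba, fall back to a serial plain-Python loop.
    prange = range

    def njit(*args, **kwargs):
        return lambda func: func


@njit(parallel=True)
def row_multinomials(counts, experiments):
    """Draw one multinomial sample per row of `counts` (hypothetical helper)."""
    n_rows, n_cols = counts.shape
    out = np.empty((n_rows, n_cols), dtype=np.int64)
    # prange marks the outer loop as safe to split across threads;
    # every iteration writes only to its own row of `out`.
    for i in prange(n_rows):
        pvals = counts[i] / np.sum(counts[i])
        out[i] = np.random.multinomial(experiments, pvals)
    return out


counts = np.random.randint(1, 100, size=(8, 5))
out = row_multinomials(counts, 5000)
print(out.shape)
```

<p>Writing into a preallocated array keeps the iterations independent of each other, which is what makes the loop a candidate for <code>prange</code>.</p>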
<h2>Performance test</h2>
<h2>First test:</h2>
<ul>
<li>Unparallelized numba with a type signature</li>
<li>No numba at all</li>
</ul>
<p>Test code:</p>
<pre><code>import sys
from functools import wraps
from time import time

import numba
import numpy as np


def timing(function):
    """Print the wall-clock time of each call to stderr."""
    @wraps(function)
    def wrap(*args, **kwargs):
        start = time()
        result = function(*args, **kwargs)
        end = time()
        print(f"Time elapsed: {end - start}", file=sys.stderr)
        return result

    return wrap


# Explicit type signature and no parallel=True, so numba.prange below
# behaves like a plain range.
@timing
@numba.njit(numba.int64(numba.int64[:, :], numba.int64))
def my_multinomial(probabilities, output):
    experiments: int = 5000
    output_array = []
    for i in numba.prange(probabilities.shape[0]):
        # Normalize the row so that it sums to 1.
        probability = probabilities[i] / np.sum(probabilities[i])
        result = np.random.multinomial(experiments, pvals=probability)
        # Keep only every `output`-th result.
        if i % output == 0:
            output_array.append(result)

    return output_array[-1][-1]


if __name__ == "__main__":
    np.random.seed(0)
    probabilities = np.random.randint(low=1, high=100, size=(10000, 1000))
    for _ in range(5):
        output = my_multinomial(probabilities, np.random.randint(low=3000, high=10000))
</code></pre>
<h3>Results:</h3>
<p>Unparallelized numba with a type signature:</p>
^{pr2}$
<p>No numba at all:</p>
<pre><code>Time elapsed: 0.9460861682891846
Time elapsed: 0.9581060409545898
Time elapsed: 0.9654934406280518
Time elapsed: 0.9708254337310791
Time elapsed: 0.9757359027862549
</code></pre>
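<p>As can be seen, <code>numba</code> is of no help in this case (it actually degrades performance). The results are consistent across different input array sizes.</p>
<h2>Second test</h2>
<ul>
<li>Parallelized numba without a type signature</li>
<li>No numba at all</li>
</ul>
<p>Test code:</p>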
<pre><code>import sys
from functools import wraps
from time import time

import numba
import numpy as np


def timing(function):
    """Print the wall-clock time of each call to stderr."""
    @wraps(function)
    def wrap(*args, **kwargs):
        start = time()
        result = function(*args, **kwargs)
        end = time()
        print(f"Time elapsed: {end - start}", file=sys.stderr)
        return result

    return wrap


# No type signature, so compilation happens on the first call.
@timing
@numba.njit(parallel=True)
def my_multinomial(probabilities, output):
    experiments: int = 5000
    # The loop uses a plain range, as in the test that produced the timings;
    # numba.prange would be needed to parallelize the loop itself explicitly.
    for i in range(probabilities.shape[0]):
        # Normalize the row so that it sums to 1.
        probability = probabilities[i] / np.sum(probabilities[i])
        result = np.random.multinomial(experiments, pvals=probability)
        # Print only every `output`-th result.
        if i % output == 0:
            print(result)


if __name__ == "__main__":
    np.random.seed(0)
    probabilities = np.random.randint(low=1, high=100, size=(10000, 1000))
    for _ in range(5):
        my_multinomial(probabilities, np.random.randint(low=3000, high=10000))
</code></pre>
<h3>Results:</h3>
<p>Parallelized numba without a type signature:</p>
<pre><code>Time elapsed: 1.0705969333648682
Time elapsed: 0.18749785423278809
Time elapsed: 0.1877145767211914
Time elapsed: 0.18813610076904297
Time elapsed: 0.18747472763061523
</code></pre>
<p>No numba at all:</p>
<pre><code>Time elapsed: 1.0142333507537842
Time elapsed: 1.0311956405639648
Time elapsed: 1.022024154663086
Time elapsed: 1.0191617012023926
Time elapsed: 1.0144879817962646
</code></pre>
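<h2>Partial conclusions</h2>
<p>As <a href="https://stackoverflow.com/users/4045774/max9111">max9111</a> correctly pointed out in the comments, I jumped to conclusions too early. It seems that parallelization (if possible) would be the biggest help in your case, while <code>numba</code> on its own (at least in this still simple and not very comprehensive test) does not bring a big improvement.</p>
<p>All in all, you should check your exact case. As a rule of thumb, the more plain Python code you have, the better the results you can expect from <code>numba</code>. If the code is mostly <code>numpy</code>-based, you will not see much benefit (if any).</p>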
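<p>When checking your own case, note that the first call of the parallel run above is markedly slower than the rest (about 1.07 s versus 0.19 s), most likely because a function without a type signature is compiled on its first call. A small sketch of a timing helper that keeps such one-off costs out of the measurement; <code>best_time</code> is a hypothetical name and uses only the standard library:</p>

```python
from time import perf_counter


def best_time(function, *args, repeats=5):
    """Return the fastest of `repeats` timed calls, after one untimed warm-up call."""
    function(*args)  # warm-up: absorbs one-off costs such as JIT compilation
    timings = []
    for _ in range(repeats):
        start = perf_counter()
        function(*args)
        timings.append(perf_counter() - start)
    return min(timings)


print(f"Best of 5: {best_time(sum, range(1_000_000)):.6f} s")
```

<p>Timing the decorated and undecorated versions of your own function with the same helper gives a fairer comparison than a single call.</p>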