I defined the following recursive array generator and am using Numba's jit to try to speed up the processing (based on this SO answer).
    import numpy as np
    from numba import jit

    @jit("float32[:](float32,float32,intp)", nopython=False, nogil=True)
    def calc_func(a, b, n):
        res = np.empty(n, dtype="float32")
        res[0] = 0
        for i in range(1, n):
            res[i] = a * res[i - 1] + (1 - a) * (b ** (i - 1))
        return res

    a = calc_func(0.988, 0.9988, 5000)
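I get a bunch of warnings/errors that I don't really understand. I would appreciate help explaining them and making them go away so that (I assume) the calculation gets faster.
They are shown below: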
NumbaWarning: Compilation is falling back to object mode WITH looplifting enabled because Function "calc_func" failed type inference due to: Invalid use of Function() with argument(s) of type(s): (int64, dtype=Literalstr) * parameterized
In definition 0: All templates rejected with literals.
In definition 1: All templates rejected without literals. This error is usually caused by passing an argument of a type that is unsupported by the named function.
[1] During: resolving callee type: Function()
[2] During: typing of call at
res = np.empty(n, dtype="float32")
File "thenameofmyscript.py", line 71:
@jit("float32[:](float32,float32,intp)", nopython=False, nogil=True)
thenameofmyscript.py:69: NumbaWarning: Compilation is falling back to object mode WITHOUT looplifting enabled because Function "calc_func" failed type inference due to: cannot determine Numba type of
<class 'numba.dispatcher.LiftedLoop'>
File "thenameofmyscript.py", line 73:
def calc_func(a, b, n):
<source elided>
res[0] = 0
for i in range(1, n):
^
@jit("float32[:](float32,float32,intp)", nopython=False, nogil=True)
H:\projects\decay-optimizer\venv\lib\site-packages\numba\compiler.py:742: NumbaWarning: Function "calc_func" was compiled in object mode without forceobj=True, but has lifted loops.
File "thenameofmyscript.py", line 70:
@jit("float32[:](float32,float32,intp)", nopython=False, nogil=True)
def calc_func(a, b, n):
^
self.func_ir.loc))
H:\projects\decay-optimizer\venv\lib\site-packages\numba\compiler.py:751: NumbaDeprecationWarning: Fall-back from the nopython compilation path to the object mode compilation path has been detected, this is deprecated behaviour.
File "thenameofmyscript.py", line 70:
@jit("float32[:](float32,float32,intp)", nopython=False, nogil=True)
def calc_func(a, b, n):
^
warnings.warn(errors.NumbaDeprecationWarning(msg, self.func_ir.loc))
thenameofmyscript.py:69: NumbaWarning: Code running in object mode won't allow parallel execution despite nogil=True. @jit("float32[:](float32,float32,intp)", nopython=False, nogil=True)
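1. Optimize the function (algebraic simplification)
Modern CPUs are very fast at additions and multiplications. Exponentiation should be avoided where possible.
Example
In this example I replaced the costly exponentiation with a simple multiplication. Simplifications like this can lead to quite high speedups, but they may also change the result slightly, because floating-point rounding differs.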
First, your implementation (float64) without any signature. I will come back to signatures later with another simple example.
It is also a good idea to use scalars wherever possible.
Timings
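No measured numbers are reproduced here. As a hedged sketch of how such a comparison could be timed, the hypothetical helper `time_call` below makes one warm-up call first (so that JIT compilation is not included) and then reports the best average time per call using the standard `timeit` module.

```python
import timeit

def time_call(func, *args, repeat=5, number=100):
    """Return the best average wall time per call, in seconds."""
    func(*args)  # warm-up call: triggers JIT compilation for a jitted function
    totals = timeit.repeat(lambda: func(*args), repeat=repeat, number=number)
    return min(totals) / number  # each total covers `number` calls

# Usage with a placeholder function; pass a jitted function and its
# arguments in the same way, e.g. time_call(calc_func, 0.988, 0.9988, 5000)
seconds = time_call(sum, range(1000))
print(f"{seconds * 1e6:.2f} microseconds per call")
```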
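2. Are signatures recommended?
In ahead-of-time (AOT) mode signatures are necessary, but in the usual JIT mode they are not. The example above is not SIMD-vectorizable, because each element depends on the previous one, so you won't see much of a positive or negative effect from a possibly non-optimal declaration of inputs and outputs. Let's look at another example.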
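Why is the version with signatures slower?
Let's take a closer look at the signatures.
If the memory layout is unknown at compile time, it is often impossible to SIMD-vectorize the algorithm. Of course you can explicitly declare C-contiguous arrays, but then the function will no longer work for non-contiguous inputs, which is normally not intended.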