为什么Fortran中的一元Horner比NumPy的更快，而二元Horner是n问题的回答

为什么Fortran中的一元Horner比NumPy的更快，而二元Horner是n

回答此问题可获得 20 贡献值，回答如果被采纳可获得 50 分。

我想用Python执行多项式演算。<code>numpy</code>中的<code>polynomial</code>包对我来说不够快。因此，我决定重写Fortran中的几个函数，并使用<code>f2py</code>创建可轻松导入Python的共享库。目前，我正在对照<code>numpy</code>对应的单变量和二元多项式求值例程。在 在单变量例程中，我使用<a href="http://en.wikipedia.org/wiki/Horner%27s_method" rel="nofollow noreferrer">Horner's method</a>，就像<code>numpy.polynomial.polynomial.polyval</code>一样。我已经观察到Fortran例程比<code>numpy</code>对应程序快的因素随着多项式阶数的增加而增加。在 在双变量程序中，我使用霍纳方法两次。首先在y中，然后在x中。不幸的是，我注意到对于增加多项式阶，<code>numpy</code>对应的函数会赶上并最终超过我的Fortran例程。由于<code>numpy.polynomial.polynomial.polyval2d</code>使用了与我类似的方法，我认为第二个观察结果很奇怪。在 我希望这个结果源于我对Fortran和<code>f2py</code>缺乏经验。有人知道为什么一元程序总是显得优越，而二元程序只对低阶多项式优越？在 编辑 以下是我最新更新的代码、基准脚本和性能图： 多项式 <pre><code>subroutine polyval(p, x, pval, nx) implicit none real(8), dimension(nx), intent(in) :: p real(8), intent(in) :: x real(8), intent(out) :: pval integer, intent(in) :: nx integer :: i pval = 0.0d0 do i = nx, 1, -1 pval = pval*x + p(i) end do end subroutine polyval subroutine polyval2(p, x, y, pval, nx, ny) implicit none real(8), dimension(nx, ny), intent(in) :: p real(8), intent(in) :: x, y real(8), intent(out) :: pval integer, intent(in) :: nx, ny real(8) :: tmp integer :: i, j pval = 0.0d0 do j = ny, 1, -1 tmp = 0.0d0 do i = nx, 1, -1 tmp = tmp*x + p(i, j) end do pval = pval*y + tmp end do end subroutine polyval2 subroutine polyval3(p, x, y, z, pval, nx, ny, nz) implicit none real(8), dimension(nx, ny, nz), intent(in) :: p real(8), intent(in) :: x, y, z real(8), intent(out) :: pval integer, intent(in) :: nx, ny, nz real(8) :: tmp, tmp2 integer :: i, j, k pval = 0.0d0 do k = nz, 1, -1 tmp2 = 0.0d0 do j = ny, 1, -1 tmp = 0.0d0 do i = nx, 1, -1 tmp = tmp*x + p(i, j, k) end do tmp2 = tmp2*y + tmp end do pval = pval*z + tmp2 end do end subroutine polyval3 </code></pre> 基准.py（使用此脚本生成绘图） ^{pr2}$ 结果 <img src="https://i.stack.imgur.com/fLcVj.png" alt="enter image description here"/> <img src="https://i.stack.imgur.com/nP60i.png" alt="enter image description here"/> <img src="https://i.stack.imgur.com/15AJh.png" alt="enter image description here"/> 编辑对steabert提案的修正 <pre><code>subroutine polyval(p, x, pval, nx) implicit none real*8, dimension(nx), intent(in) :: p real*8, intent(in) :: x real*8, intent(out) :: pval integer, intent(in) :: nx integer, parameter :: simd = 8 real*8 :: tmp(simd), xpower(simd), maxpower integer :: i, j, k xpower(1) = x do i = 2, simd xpower(i) = xpower(i-1)*x end do maxpower = xpower(simd) tmp = 0.0d0 do i = nx+1, simd+2, -simd do j = 1, simd tmp(j) = tmp(j)*maxpower + p(i-j)*xpower(simd-j+1) end do end do k = mod(nx-1, simd) if (k == 0) then pval = sum(tmp) + p(1) else pval = sum(tmp) + p(k+1) do i = k, 1, -1 pval = pval*x + p(i) end do end if end subroutine polyval </code></pre> 编辑测试代码，以验证上面的代码对x&gt；1的结果不佳 <pre><code>import polynomial as P import numpy.polynomial.polynomial as PP import numpy as np for n in xrange(2,100): poly1n = np.random.rand(n) poly1f = np.asfortranarray(poly1n) x = 2 print np.linalg.norm(P.polyval(poly1f, x) - PP.polyval(x, poly1n)), '\n' </code></pre>

0 条评论
分类：Python问答

默认排序时间排序

1 个回答

匿名 1天前

　擅长：python、mysql、java

为什么Fortran中的一元Horner比NumPy的更快，而二元Horner是n

1 个回答

相关Python问题