回答此问题可获得 20 贡献值,回答如果被采纳可获得 50 分。
<p>我想用Python执行多项式演算。<code>numpy</code>中的<code>polynomial</code>包对我来说不够快。因此,我决定重写Fortran中的几个函数,并使用<code>f2py</code>创建可轻松导入Python的共享库。目前,我正在对照<code>numpy</code>对应的单变量和二元多项式求值例程。在</p>
<p>在单变量例程中,我使用<a href="http://en.wikipedia.org/wiki/Horner%27s_method" rel="nofollow noreferrer">Horner's method</a>,就像<code>numpy.polynomial.polynomial.polyval</code>一样。我已经观察到Fortran例程比<code>numpy</code>对应程序快的因素随着多项式阶数的增加而增加。在</p>
<p>在双变量程序中,我使用霍纳方法两次。首先在y中,然后在x中。不幸的是,我注意到对于增加多项式阶,<code>numpy</code>对应的函数会赶上并最终超过我的Fortran例程。由于<code>numpy.polynomial.polynomial.polyval2d</code>使用了与我类似的方法,我认为第二个观察结果很奇怪。在</p>
<p>我希望这个结果源于我对Fortran和<code>f2py</code>缺乏经验。有人知道为什么一元程序总是显得优越,而二元程序只对低阶多项式优越?在</p>
<p><strong>编辑</strong>
以下是我最新更新的代码、基准脚本和性能图:</p>
<p>多项式</strong></p>
<pre><code>subroutine polyval(p, x, pval, nx)
implicit none
real(8), dimension(nx), intent(in) :: p
real(8), intent(in) :: x
real(8), intent(out) :: pval
integer, intent(in) :: nx
integer :: i
pval = 0.0d0
do i = nx, 1, -1
pval = pval*x + p(i)
end do
end subroutine polyval
subroutine polyval2(p, x, y, pval, nx, ny)
implicit none
real(8), dimension(nx, ny), intent(in) :: p
real(8), intent(in) :: x, y
real(8), intent(out) :: pval
integer, intent(in) :: nx, ny
real(8) :: tmp
integer :: i, j
pval = 0.0d0
do j = ny, 1, -1
tmp = 0.0d0
do i = nx, 1, -1
tmp = tmp*x + p(i, j)
end do
pval = pval*y + tmp
end do
end subroutine polyval2
subroutine polyval3(p, x, y, z, pval, nx, ny, nz)
implicit none
real(8), dimension(nx, ny, nz), intent(in) :: p
real(8), intent(in) :: x, y, z
real(8), intent(out) :: pval
integer, intent(in) :: nx, ny, nz
real(8) :: tmp, tmp2
integer :: i, j, k
pval = 0.0d0
do k = nz, 1, -1
tmp2 = 0.0d0
do j = ny, 1, -1
tmp = 0.0d0
do i = nx, 1, -1
tmp = tmp*x + p(i, j, k)
end do
tmp2 = tmp2*y + tmp
end do
pval = pval*z + tmp2
end do
end subroutine polyval3
</code></pre>
<p><strong>基准.py</strong>(使用此脚本生成绘图)</p>
^{pr2}$
<p><strong>结果</strong>
<img src="https://i.stack.imgur.com/fLcVj.png" alt="enter image description here"/>
<img src="https://i.stack.imgur.com/nP60i.png" alt="enter image description here"/>
<img src="https://i.stack.imgur.com/15AJh.png" alt="enter image description here"/></p>
<p><strong>编辑</strong>对steabert提案的修正</p>
<pre><code>subroutine polyval(p, x, pval, nx)
implicit none
real*8, dimension(nx), intent(in) :: p
real*8, intent(in) :: x
real*8, intent(out) :: pval
integer, intent(in) :: nx
integer, parameter :: simd = 8
real*8 :: tmp(simd), xpower(simd), maxpower
integer :: i, j, k
xpower(1) = x
do i = 2, simd
xpower(i) = xpower(i-1)*x
end do
maxpower = xpower(simd)
tmp = 0.0d0
do i = nx+1, simd+2, -simd
do j = 1, simd
tmp(j) = tmp(j)*maxpower + p(i-j)*xpower(simd-j+1)
end do
end do
k = mod(nx-1, simd)
if (k == 0) then
pval = sum(tmp) + p(1)
else
pval = sum(tmp) + p(k+1)
do i = k, 1, -1
pval = pval*x + p(i)
end do
end if
end subroutine polyval
</code></pre>
<p><strong>编辑</strong>测试代码,以验证上面的代码对x>;1的结果不佳</p>
<pre><code>import polynomial as P
import numpy.polynomial.polynomial as PP
import numpy as np
for n in xrange(2,100):
poly1n = np.random.rand(n)
poly1f = np.asfortranarray(poly1n)
x = 2
print np.linalg.norm(P.polyval(poly1f, x) - PP.polyval(x, poly1n)), '\n'
</code></pre>