<p>你可以,下面是方法:</p>
<pre><code>def eval_s_vectorized(self, stiff):
assert len(self._qs) == self._q_count, "Run 'populate_qs' first!"
mats = np.stack([self._vfunc(*k) for k in self._qs], axis=0)
evs = np.linalg.eigvalsh(mats)
result = np.sum(np.divide(1., (stiff + evs)))
return result.real - 4 * self._q_count
</code></pre>
<p>这使得对Sympy表达式的评估仍然是未知的。这部分向量化有点棘手,主要是因为输入矩阵中的<code>1</code>。您可以通过修改<code>Solver</code>使其用<code>vmat</code>中的数组常量替换标量常量,从而生成完全矢量化的代码版本:</p>
<pre><code>import itertools as it
import numpy as np
import sympy as sp
from sympy.abc import x, y, z
from sympy.core.numbers import Number
from sympy.utilities.lambdify import implemented_function
xones = implemented_function('xones', lambda x: np.ones(len(x)))
lfuncs = {'xones': xones}
def vectorizemat(mat):
ret = mat.copy()
# get the first element of the set of symbols that mat uses
for x in mat.free_symbols: break
for i,j in it.product(*(range(s) for s in mat.shape)):
if isinstance(mat[i,j], Number):
ret[i,j] = xones(x) * mat[i,j]
return ret
class Solver:
def __init__(self, vmat):
self._vfunc = sp.lambdify((x, y, z),
expr=vectorizemat(vmat),
modules=[lfuncs, 'numpy'])
self._q_count, self._qs = None, [] # these depend on ksep!
def eval_s_vectorized_completely(self, stiff):
assert len(self._qs) == self._q_count, "Run 'populate_qs' first!"
evs = np.linalg.eigvalsh(self._vfunc(*self._qs.T).T)
result = np.sum(np.divide(1., (stiff + evs)))
return result.real - 4 * self._q_count
def populate_qs(self, ksep: float = 1.7):
self._qs = np.array([(kx, ky, kz) for kx, ky, kz
in it.product(np.arange(-3*np.pi, 3.01*np.pi, ksep),
np.arange(-3*np.pi, 3.01*np.pi, ksep),
np.arange(-3*np.pi, 3.01*np.pi, ksep))])
self._q_count = len(self._qs)
</code></pre>
<h2>测试/计时</h2>
<p>对于小型<code>ksep</code>,矢量化版本比原始版本快约2倍,完全矢量化版本快约20倍:</p>
<pre><code># old version for ksep=.3
import timeit
print(timeit.timeit("test()", setup="from __main__ import test", number=10))
-85240.46154500882
-85240.46154500882
-85240.46154500882
-85240.46154500882
-85240.46154500882
-85240.46154500882
-85240.46154500882
-85240.46154500882
-85240.46154500882
-85240.46154500882
118.42847006605007
# vectorized version for ksep=.3
import timeit
print(timeit.timeit("test()", setup="from __main__ import test", number=10))
-85240.46154498367
-85240.46154498367
-85240.46154498367
-85240.46154498367
-85240.46154498367
-85240.46154498367
-85240.46154498367
-85240.46154498367
-85240.46154498367
-85240.46154498367
64.95763925800566
# completely vectorized version for ksep=.3
import timeit
print(timeit.timeit("test()", setup="from __main__ import test", number=10))
-85240.46154498367
-85240.46154498367
-85240.46154498367
-85240.46154498367
-85240.46154498367
-85240.46154498367
-85240.46154498367
-85240.46154498367
-85240.46154498367
-85240.46154498367
5.648927717003971
</code></pre>
<p>矢量化版本的结果中的舍入误差与原始结果略有不同。这可能是由于<code>result</code>中的和的计算方式不同造成的。你知道吗</p>