计算交点时避免numpy循环

2024-06-28 20:48:11 发布

您现在位置:Python中文网/ 问答频道 /正文

我想加速以下处理r光线和n球体的计算。到目前为止,我得到的是:

# shape of mu1 and mu2 is (r, n)
# shape of rays is (r, 3)
# note that intersections has 2n columns because for every sphere one can
# get up to two intersections (secant, tangent, no intersection)
intersections = np.empty((r, 2*n, 3))
for col in range(n):
    intersections[:, col, :] = rays * mu1[:, col][:, np.newaxis]
    intersections[:, col + n, :] = rays * mu2[:, col][:, np.newaxis]

# [...]

# calculate euclidean distance from the center of gravity (0,0,0)
distances = np.empty((r, 2 * n))
for col in range(n):
    distances[:, col] = np.linalg.norm(intersections[:, col], axis=1)
    distances[:, col + n] = np.linalg.norm(intersections[:, col + n], axis=1)

我试图通过避免for循环来加快速度,但却不知道如何正确地广播数组,所以我只需要一个函数调用。非常感谢您的帮助。你知道吗


Tags: ofinforisnprangecolempty
1条回答
网友
1楼 · 发布于 2024-06-28 20:48:11

下面是使用^{}的矢量化方法-

intersections = np.hstack((mu1,mu2))[...,None]*rays[:,None,:]
distances = np.sqrt((intersections**2).sum(2))

最后一步可以用^{}来代替,就像这样-

distances = np.sqrt(np.einsum('ijk,ijk->ij',intersections,intersections))

或者用np.einsum替换几乎所有的东西,换成另一种矢量化的方式,就像这样-

mu = np.hstack((mu1,mu2))
distances = np.sqrt(np.einsum('ij,ij,ik,ik->ij',mu,mu,rays,rays))

运行时测试和验证输出-

def original_app(mu1,mu2,rays):
    intersections = np.empty((r, 2*n, 3))
    for col in range(n):
        intersections[:, col, :] = rays * mu1[:, col][:, np.newaxis]
        intersections[:, col + n, :] = rays * mu2[:, col][:, np.newaxis]

    distances = np.empty((r, 2 * n))
    for col in range(n):
        distances[:, col] = np.linalg.norm(intersections[:, col], axis=1)
        distances[:, col + n] = np.linalg.norm(intersections[:, col + n], axis=1)
    return distances                    

def vectorized_app1(mu1,mu2,rays):
    intersections = np.hstack((mu1,mu2))[...,None]*rays[:,None,:]
    return np.sqrt((intersections**2).sum(2))

def vectorized_app2(mu1,mu2,rays):
    intersections = np.hstack((mu1,mu2))[...,None]*rays[:,None,:]
    return np.sqrt(np.einsum('ijk,ijk->ij',intersections,intersections))

def vectorized_app3(mu1,mu2,rays):
    mu = np.hstack((mu1,mu2))
    return np.sqrt(np.einsum('ij,ij,ik,ik->ij',mu,mu,rays,rays))

计时-

In [101]: # Inputs
     ...: r = 1000
     ...: n = 1000
     ...: mu1 = np.random.rand(r, n)
     ...: mu2 = np.random.rand(r, n)
     ...: rays = np.random.rand(r, 3)


In [102]: np.allclose(original_app(mu1,mu2,rays),vectorized_app1(mu1,mu2,rays))
Out[102]: True

In [103]: np.allclose(original_app(mu1,mu2,rays),vectorized_app2(mu1,mu2,rays))
Out[103]: True

In [104]: np.allclose(original_app(mu1,mu2,rays),vectorized_app3(mu1,mu2,rays))
Out[104]: True

In [105]: %timeit original_app(mu1,mu2,rays)
     ...: %timeit vectorized_app1(mu1,mu2,rays)
     ...: %timeit vectorized_app2(mu1,mu2,rays)
     ...: %timeit vectorized_app3(mu1,mu2,rays)
     ...: 
1 loops, best of 3: 306 ms per loop
1 loops, best of 3: 215 ms per loop
10 loops, best of 3: 140 ms per loop
10 loops, best of 3: 136 ms per loop

相关问题 更多 >