<p>We can be smarter about the indexing and save roughly a factor of 4.</p>
<p>First, let's build some data of the right shape:</p>
<pre><code>import numpy as np
import scipy.spatial.distance as spd

seed = np.random.randint(0, 100, (200, 206))
data = np.random.randint(0, 100, (400000, 206))
seed[:, 0] = np.arange(200)
data[:, 0] = np.random.randint(0, 200, 400000)
diam = np.empty(200)
</code></pre>
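<p>Column 0 of both arrays carries the seed id: each data row is assigned to one of the 200 seeds, and for every seed we want the maximum Euclidean distance from the seed's coordinates (columns 1:) to its assigned rows. A minimal brute-force sketch of that goal on toy-sized arrays (the <code>toy_*</code> names and sizes are my own, not from the answers below):</p>

```python
import numpy as np

rng = np.random.default_rng(0)
n_seeds, n_points, n_dims = 5, 50, 7

# Column 0 carries the seed id; the remaining columns are coordinates.
toy_seed = rng.integers(0, 100, (n_seeds, n_dims + 1)).astype(float)
toy_seed[:, 0] = np.arange(n_seeds)
toy_data = rng.integers(0, 100, (n_points, n_dims + 1)).astype(float)
toy_data[:, 0] = rng.integers(0, n_seeds, n_points)
toy_data[:n_seeds, 0] = np.arange(n_seeds)  # make sure no seed is empty

# Brute force: per seed, the max distance from the seed to its points.
toy_diam = np.empty(n_seeds)
for i in range(n_seeds):
    pts = toy_data[toy_data[:, 0] == i, 1:]
    toy_diam[i] = np.sqrt(((pts - toy_seed[i, 1:]) ** 2).sum(axis=1)).max()
```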
<p>Timing of the original answer:</p>
<p>莫宁森's answer:</p>
<pre><code>%%timeit
seed_repeated = seed[data[:,0]]
dist_to_center = np.sqrt(np.sum((data[:,1:]-seed_repeated[:,1:])**2, axis=1))
diam = np.zeros(len(seed))
np.maximum.at(diam, data[:,0], dist_to_center)
1 loops, best of 3: 1.33 s per loop
</code></pre>
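<p>The key trick above is <code>np.maximum.at</code>, the unbuffered ufunc method: it applies <code>maximum</code> in place at the given indices, so repeated indices accumulate correctly, whereas plain fancy-index assignment would keep only the last write per index. A small self-contained sketch:</p>

```python
import numpy as np

out = np.zeros(3)
idx = np.array([0, 1, 0, 2, 0])
vals = np.array([5.0, 2.0, 7.0, 1.0, 3.0])

# Unbuffered in-place maximum: index 0 sees 5, 7 and 3 and keeps 7.
np.maximum.at(out, idx, vals)
# out is now [7., 2., 1.]
```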
<p>分析者's answer:</p>
<pre><code>%%timeit
data_sorted = data[data[:, 0].argsort()]
seed_ext = np.repeat(seed,np.bincount(data_sorted[:,0]),axis=0)
dists = np.sqrt(((data_sorted[:,1:] - seed_ext[:,1:])**2).sum(1))
shift_idx = np.append(0,np.nonzero(np.diff(data_sorted[:,0]))[0]+1)
diam_out = np.maximum.reduceat(dists,shift_idx)
1 loops, best of 3: 1.65 s per loop
</code></pre>
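<p><code>np.maximum.reduceat</code> reduces over contiguous segments, which is why this answer must sort by seed id first; <code>shift_idx</code> marks where each new id starts. A small sketch of that mechanism on toy values:</p>

```python
import numpy as np

ids = np.array([0, 0, 1, 1, 1, 2])      # already sorted by id
vals = np.array([3.0, 5.0, 1.0, 4.0, 2.0, 9.0])

# Segment starts: position 0 plus every place the id changes.
shift_idx = np.append(0, np.nonzero(np.diff(ids))[0] + 1)
seg_max = np.maximum.reduceat(vals, shift_idx)
# seg_max is [5., 4., 9.]
```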
<p>As we can see, the vectorized solutions gain nothing except a larger memory footprint. To avoid this, we go back to the original answer, which is the right way to do this sort of thing, and try to reduce the amount of indexing:</p>
<pre><code>%%timeit
idx = data[:,0].argsort()
bins = np.bincount(data[:,0])
counter = 0
for i in range(200):
    data_slice = idx[counter: counter+bins[i]]
    diam[i] = spd.cdist(seed[None, i, 1:], data[data_slice, 1:]).max()
    counter += bins[i]
1 loops, best of 3: 281 ms per loop
</code></pre>
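<p>The loop wins because <code>argsort</code> plus <code>bincount</code> turns 200 boolean masks into cheap contiguous slices, and <code>cdist</code> does the distance arithmetic in C. A self-contained sketch of the pattern on small made-up arrays (names and sizes are my own), checked against a per-group boolean mask:</p>

```python
import numpy as np
from scipy.spatial import distance as spd

rng = np.random.default_rng(1)
n_groups, n_pts = 4, 30
labels = rng.integers(0, n_groups, n_pts)
labels[:n_groups] = np.arange(n_groups)   # keep every group non-empty
pts = rng.random((n_pts, 3))
centers = rng.random((n_groups, 3))

# Sort once; each group then occupies one contiguous slice of idx,
# so the loop body is a cheap slice plus one C-level cdist call.
idx = labels.argsort()
bins = np.bincount(labels, minlength=n_groups)
out = np.empty(n_groups)
counter = 0
for i in range(n_groups):
    group = idx[counter:counter + bins[i]]
    out[i] = spd.cdist(centers[None, i], pts[group]).max()
    counter += bins[i]

# Reference: the same quantity with a boolean mask per group.
ref = np.array([spd.cdist(centers[None, i], pts[labels == i]).max()
                for i in range(n_groups)])
```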
<p>Double-checking the answer:</p>
<pre><code>np.allclose(diam, diam_out)
True
</code></pre>
<p>That is the problem with assuming Python loops are bad. They usually are, but not in every case.</p>