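<p>The <a href="https://github.com/EelcoHoogendoorn/Numpy_arraysetops_EP" rel="nofollow noreferrer">numpy_indexed</a> package (disclaimer: I am its author) contains functionality for doing this kind of thing efficiently and elegantly. This approach has linear memory requirements and O(N log N) computational cost. For arrays as large as the ones you are considering, the speedup over the currently accepted brute-force approach can easily reach orders of magnitude:</p>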
<pre><code>import numpy as np
import numpy_indexed as npi
A = np.asarray([400.5, 100, 700, 200, 15, 900])
B = np.asarray([500.5, 200, 500, 600.5, 8, 999])
X = np.asarray([400.5, 700, 100, 300, 15, 555, 900])
Y = np.asarray([500.5, 500, 600.5, 100, 8, 555, 999])
AB = np.stack([A, B], axis=-1)
XY = np.stack([X, Y], axis=-1)
# Casting the AB and XY arrays to an npi index first is not required, but it is
# a performance optimization; without it, each call to npi.indices would have to
# re-index the arrays, which is the expensive part.
AB = npi.as_index(AB)
XY = npi.as_index(XY)
# npi.indices(list, items) is a vectorized nd-equivalent of list.index(item)
indAB = npi.indices(AB, XY, missing='mask').compressed()
indXY = npi.indices(XY, AB, missing='mask').compressed()
</code></pre>
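<p>Note that you can also choose how missing values are handled. Also take a look at the set operations, such as <code>npi.intersection(XY, AB)</code>; they may offer a simpler route to what you are trying to achieve at a higher level.</p>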
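<p>To give a feel for where the linear memory and O(N log N) cost come from, here is a minimal NumPy-only sketch of a sort-based row lookup. It is not how numpy_indexed is implemented internally; the helper name <code>row_indices</code> and the use of -1 as the "missing" marker are assumptions made for this illustration, and it assumes a non-empty first argument.</p>

```python
import numpy as np

def row_indices(haystack, needles):
    """For each row of `needles`, return its index in `haystack`, or -1 if absent.

    Hypothetical helper for illustration only; assumes `haystack` is non-empty.
    """
    haystack = np.asarray(haystack)
    needles = np.asarray(needles)
    # Give every distinct row across both arrays an integer id (sort-based).
    combined = np.concatenate([haystack, needles], axis=0)
    _, inverse = np.unique(combined, axis=0, return_inverse=True)
    inverse = inverse.reshape(-1)  # flatten, in case the NumPy version returns a 2-D inverse
    hay_ids = inverse[:len(haystack)]
    needle_ids = inverse[len(haystack):]
    # Sort the haystack ids once, then binary-search every needle id.
    order = np.argsort(hay_ids, kind="stable")
    sorted_ids = hay_ids[order]
    pos = np.searchsorted(sorted_ids, needle_ids)
    pos = np.minimum(pos, len(sorted_ids) - 1)  # keep positions in bounds
    found = sorted_ids[pos] == needle_ids
    return np.where(found, order[pos], -1)

A = np.asarray([400.5, 100, 700, 200, 15, 900])
B = np.asarray([500.5, 200, 500, 600.5, 8, 999])
X = np.asarray([400.5, 700, 100, 300, 15, 555, 900])
Y = np.asarray([500.5, 500, 600.5, 100, 8, 555, 999])
AB = np.stack([A, B], axis=-1)
XY = np.stack([X, Y], axis=-1)

idx = row_indices(AB, XY)
print(idx[idx >= 0])  # positions in AB of the rows that also occur in XY
```

<p>Dropping the -1 entries plays the same role as <code>.compressed()</code> on the masked result above. The sort and binary search dominate the cost, which is why this kind of approach scales so much better than comparing every pair of rows.</p>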