<p>Here is a working example that runs in seconds (&lt;10).</p>
<p>Import the libraries:</p>
<pre><code>import pandas as pd
import numpy as np
from sklearn.neighbors import BallTree
import uuid
</code></pre>
<p>I generate some random data. This also takes about a second, but at least we have some realistically sized data to work with.</p>
<pre><code>np_rand_post = 5 * np.random.random((72000,2))
np_rand_post = np_rand_post + np.array((53.577653, -2.434136))
</code></pre>
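<p>If you want the fake data to be identical on every run, one option is a seeded generator. This is only a sketch: the seed value <code>42</code> is an arbitrary assumption and not part of the example above.</p>

```python
import numpy as np

# Assumption: the seed 42 is arbitrary; any fixed integer gives repeatable data.
rng = np.random.default_rng(42)

# 72,000 fake postcode coordinates in a 5-degree square; columns are (lat, long).
np_rand_post = 5 * rng.random((72000, 2)) + np.array((53.577653, -2.434136))
```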
<p>Use UUIDs as fake postcodes:</p>
<pre><code>postcode_df = pd.DataFrame( np_rand_post , columns=['lat', 'long'])
postcode_df['postcode'] = [uuid.uuid4().hex[:6] for _ in range(72000)]
postcode_df.head()
</code></pre>
<p>Do the same for the air data:</p>
<pre><code>np_rand = 5 * np.random.random((500,2))
np_rand = np_rand + np.array((53.55108, -2.396236))
</code></pre>
<p>Again, use UUIDs as fake references:</p>
<pre><code>tube_df = pd.DataFrame( np_rand , columns=['lat', 'long'])
tube_df['ref'] = [uuid.uuid4().hex[:5] for _ in range(500)]
tube_df.head()
</code></pre>
<p>Extract the GPS values as NumPy arrays:</p>
<pre><code>postcode_gps = postcode_df[["lat", "long"]].values
air_gps = tube_df[["lat", "long"]].values
</code></pre>
<p>Create a BallTree. The haversine metric expects coordinates as (lat, long) in radians:</p>
<pre><code>postal_radians = np.radians(postcode_gps)
air_radians = np.radians(air_gps)
tree = BallTree(air_radians, leaf_size=15, metric='haversine')
</code></pre>
<p>Query for the single nearest neighbour:</p>
<pre><code>distance, index = tree.query(postal_radians, k=1)
</code></pre>
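<p>To see what <code>query</code> returns, here is a minimal sketch on toy data. The arrays <code>air_demo</code> and <code>query_demo</code> are hypothetical and exist only for illustration.</p>

```python
import numpy as np
from sklearn.neighbors import BallTree

# Hypothetical toy data: three "air" points and two query points,
# given as (lat, long) in degrees.
air_demo = np.array([[53.0, -2.0], [54.0, -2.0], [55.0, -2.0]])
query_demo = np.array([[53.1, -2.0], [54.9, -2.0]])

# The haversine metric needs radians, so convert both sides.
tree_demo = BallTree(np.radians(air_demo), metric='haversine')
dist_demo, idx_demo = tree_demo.query(np.radians(query_demo), k=1)

print(dist_demo.shape, idx_demo.shape)  # both have shape (n_queries, k)
print(idx_demo[:, 0])                   # row positions into air_demo
```

<p>Both arrays have one row per query point and one column per neighbour, which is why the first column is selected with <code>[:, 0]</code>.</p>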
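<p>Note that the returned distance is not in kilometres. It is an angle in radians and must first be converted by multiplying by the Earth's radius (given here in metres):</p>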
<pre><code>earth_radius = 6371000  # mean Earth radius in metres
distance_in_meters = distance * earth_radius  # radians to metres
distance_in_meters
</code></pre>
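<p>As a sanity check on the conversion, the haversine formula can be written out by hand. This is a sketch; the helper name <code>haversine_m</code> is hypothetical, and it uses the same radius value as above.</p>

```python
import math

EARTH_RADIUS_M = 6371000  # same mean-radius value used above, in metres

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    central_angle = 2 * math.asin(math.sqrt(a))  # angle in radians
    return central_angle * EARTH_RADIUS_M        # radians times radius gives metres

# A quarter of the way around the equator is an angle of pi/2 radians.
quarter = haversine_m(0.0, 0.0, 0.0, 90.0)
```

<p>The central angle computed here is the quantity the haversine metric reports, so multiplying it by the radius gives metres.</p>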
<p>To get the matching reference, use, for example, <code>tube_df.ref[ index[:,0] ]</code>.</p>
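<p>To store the result next to each postcode, one possible approach is to assign the looked-up values by position. This is a sketch on hypothetical stand-in data (<code>tube_demo</code>, <code>postcode_demo</code>, <code>index_demo</code>, <code>dist_m_demo</code>); the column names <code>nearest_ref</code> and <code>distance_m</code> are also assumptions.</p>

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for the data frames and the tree.query output.
tube_demo = pd.DataFrame({'ref': ['a1', 'b2', 'c3']})
postcode_demo = pd.DataFrame({'postcode': ['p0', 'p1', 'p2', 'p3']})
index_demo = np.array([[2], [0], [2], [1]])               # shape (n_postcodes, 1)
dist_m_demo = np.array([[120.0], [45.5], [300.0], [10.0]])

# .to_numpy() drops the tube index, so values are assigned by row order
# instead of being aligned on index labels.
postcode_demo['nearest_ref'] = tube_demo['ref'].iloc[index_demo[:, 0]].to_numpy()
postcode_demo['distance_m'] = dist_m_demo[:, 0]
```

<p>Using <code>.iloc</code> makes the lookup positional, which matches what the tree returns: row positions in the array it was built from.</p>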