<p>This approach is a bit more streamlined if your data are in grid coordinates, but it comes with one catch.</p>
<p>Building on <a href="https://stackoverflow.com/a/63636266/9249533">sutan's answer</a> and streamlining the code block from the University of Helsinki.</p>
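<p>To get multiple neighbors, you edit the <code>k_neighbors</code> argument, and you must also hard-code the extra variables in the body of the function (see my additions below <code>closest</code> and <code>closest_dist</code>) and add them to the return statement.</p>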
<p>So, if you want the two closest points, it looks like this:</p>
<pre><code>from sklearn.neighbors import BallTree
import numpy as np

def get_nearest(src_points, candidates, k_neighbors=2):
    """
    Find nearest neighbors for all source points from a set of candidate points
    modified from: https://automating-gis-processes.github.io/site/notebooks/L3/nearest-neighbor-faster.html
    """
    # Create tree from the candidate points
    tree = BallTree(np.asarray(candidates), leaf_size=15, metric='euclidean')
    # Find closest points and distances
    distances, indices = tree.query(np.asarray(src_points), k=k_neighbors)
    # Transpose so that row i holds the i-th nearest neighbor of every source point
    distances = distances.transpose()
    indices = indices.transpose()
    # Get closest indices and distances (i.e. array at index 0)
    # note: for the second closest points, you would take index 1, etc.
    closest = indices[0]
    closest_dist = distances[0]
    closest_second = indices[1]  # *manually add per comment above*
    closest_second_dist = distances[1]  # *manually add per comment above*
    # Return indices and distances
    return (closest, closest_dist, closest_second, closest_second_dist)
</code></pre>
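<p>As an alternative to hard-coding one pair of variables per neighbor, here is a minimal sketch that returns the full index and distance arrays and lets the caller pick columns. The name <code>get_k_nearest</code> and the sample coordinates are my own assumptions for illustration, not part of the linked sources.</p>

```python
import numpy as np
from sklearn.neighbors import BallTree


def get_k_nearest(src_points, candidates, k_neighbors=2):
    """Return (indices, distances), each of shape (len(src_points), k_neighbors)."""
    tree = BallTree(np.asarray(candidates), leaf_size=15, metric="euclidean")
    # query() returns distances first, then indices; each row is sorted nearest-first
    distances, indices = tree.query(np.asarray(src_points), k=k_neighbors)
    return indices, distances


# Made-up sample coordinates, for illustration only
src = [(0.0, 0.0), (10.0, 10.0)]
cand = [(1.0, 0.0), (0.0, 3.0), (9.0, 10.0), (10.0, 14.0)]

idx, dist = get_k_nearest(src, cand, k_neighbors=2)
idx_nearest = idx[:, 0]      # column 0: nearest candidate for each source point
idx_2nd_nearest = idx[:, 1]  # column 1: second-nearest candidate
```

<p>With this shape, asking for a third neighbor only means raising <code>k_neighbors</code> and reading column 2, with no change to the function body.</p>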
<p>The inputs to <code>get_nearest</code> are lists of (x, y) tuples. So, since your data is (per the question title) in a GeoDataFrame:</p>
<pre><code># easier to read
in_pts = [(row.geometry.x, row.geometry.y) for idx, row in gdf1.iterrows()]
qry_pts = [(row.geometry.x, row.geometry.y) for idx, row in gdf2.iterrows()]
# faster (by about 7X)
in_pts = [(x,y) for x,y in zip(gdf1.geometry.x , gdf1.geometry.y)]
qry_pts = [(x,y) for x,y in zip(gdf2.geometry.x , gdf2.geometry.y)]
</code></pre>
<p>I'm not interested in the distances, so instead of commenting them out of the function, I discard them when unpacking and run:</p>
<pre><code>idx_nearest, _, idx_2ndnearest, _ = get_nearest(in_pts, qry_pts)
</code></pre>
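<p>and get two arrays, each the same length as <code>in_pts</code>, that contain the index values of the closest and second-closest points, respectively, from the original GeoDataFrame behind <code>qry_pts</code>.</p>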
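<p>To sanity-check those indices on a small sample, a brute-force comparison can be sketched in plain Python. <code>brute_force_two_nearest</code> and the sample points are hypothetical names and data introduced here for illustration, and the approach is only practical for small inputs.</p>

```python
import math


def brute_force_two_nearest(src_points, candidates):
    """For each source point, return the positions of its two closest candidates."""
    nearest, second = [], []
    for p in src_points:
        # Rank candidate positions by Euclidean distance to p
        order = sorted(range(len(candidates)),
                       key=lambda i: math.dist(p, candidates[i]))
        nearest.append(order[0])
        second.append(order[1])
    return nearest, second


# Made-up sample coordinates, for illustration only
in_pts_demo = [(0.0, 0.0), (10.0, 10.0)]
qry_pts_demo = [(1.0, 0.0), (0.0, 3.0), (9.0, 10.0), (10.0, 14.0)]

check_nearest, check_second = brute_force_two_nearest(in_pts_demo, qry_pts_demo)
```

<p>Note that the values are positions within <code>qry_pts</code>, so they line up with <code>gdf2.iloc[...]</code>; they match the index labels only if <code>gdf2</code> has a default 0-based index.</p>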