<p>You can use <a href="https://scikit-learn.org/stable/modules/generated/sklearn.neighbors.NearestNeighbors.html#sklearn.neighbors.NearestNeighbors" rel="nofollow noreferrer">sklearn.neighbors.NearestNeighbors</a> with the haversine distance.</p>
<pre class="lang-py prettyprint-override"><code>import pandas as pd
dfstat = pd.DataFrame({'STOP_ID': ['19970', '19971', '19972', '19973', '19974'],
'STOP_NAME': ['Royal Park Railway Station (Parkville)', 'Flemington Bridge Railway Station (North Melbo...', 'Macaulay Railway Station (North Melbourne)', 'North Melbourne Railway Station (West Melbourne)', 'Clifton Hill Railway Station (Clifton Hill)'],
'LATITUDE': ['-37.781193', '-37.788140', '-37.794267', '-37.807419', '-37.788657'],
'LONGITUDE': ['144.952301', '144.939323', '144.936166', '144.942570', '144.995417'],
'TICKETZONE': ['1', '1', '1', '1', '1'],
'ROUTEUSSP': ['Upfield', 'Upfield', 'Upfield', 'Flemington,Sunbury,Upfield,Werribee,Williamsto...', 'Mernda,Hurstbridge'],
'geometry': ['POINT (144.95230 -37.78119)', 'POINT (144.93932 -37.78814)', 'POINT (144.93617 -37.79427)', 'POINT (144.94257 -37.80742)', 'POINT (144.99542 -37.78866)']})
dfsub = pd.DataFrame({'id': ['4901', '4902', '4903', '4904', '4905'],
'postcode': ['3000', '3002', '3003', '3005', '3006'],
'suburb': ['MELBOURNE', 'EAST MELBOURNE', 'WEST MELBOURNE', 'WORLD TRADE CENTRE', 'SOUTHBANK'],
'state': ['VIC', 'VIC', 'VIC', 'VIC', 'VIC'],
'lat': ['-37.814563', '-37.816640', '-37.806255', '-37.822262', '-37.823258'],
'lon': ['144.970267', '144.987811', '144.941123', '144.954856', '144.965926']})
</code></pre>
<p>Let's first find the point in the data frame that is nearest to a single query point, e.g. <code>-37.814563, 144.970267</code>.</p>
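<pre class="lang-py prettyprint-override"><code>import numpy as np
from sklearn.neighbors import NearestNeighbors

# sklearn's 'haversine' metric expects [latitude, longitude] in radians,
# so cast the string columns to float and convert from degrees first
NN = NearestNeighbors(n_neighbors=1, metric='haversine')
NN.fit(np.radians(dfstat[['LATITUDE', 'LONGITUDE']].astype(float).to_numpy()))
NN.kneighbors(np.radians([[-37.814563, 144.970267]]))
</code></pre>
<p>The output is a tuple <code>(distances, indices)</code>: the distance to the nearest point in the data frame and that point's index, here index <code>3</code>. With sklearn's haversine metric the distance is returned in <a href="https://scikit-learn.org/stable/modules/generated/sklearn.neighbors.DistanceMetric.html#sklearn.neighbors.DistanceMetric" rel="nofollow noreferrer">radians</a> (roughly 0.0004 here), so multiply it by the Earth's radius (about 6371 km) to convert. If you would rather compute in km directly, you can use the <a href="https://pypi.org/project/haversine/" rel="nofollow noreferrer">haversine</a> package as the metric:</p>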
<pre class="lang-py prettyprint-override"><code>from haversine import haversine
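# Passed as a callable metric: takes (lat, lon) pairs in degrees and returns km
NN = NearestNeighbors(n_neighbors=1, metric=haversine)
# Cast the string columns to float explicitly before fitting
NN.fit(dfstat[['LATITUDE', 'LONGITUDE']].astype(float).to_numpy())
NN.kneighbors([[-37.814563, 144.970267]])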
</code></pre>
<p>The output is <code>(array([[2.55952637]]), array([[3]]))</code>, with the distance in km.</p>
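<p>For reference, the quantity being computed can be sketched in plain Python. This is a minimal illustration of the standard haversine formula, not the package's actual source; the function name <code>haversine_km</code> and the Earth radius constant are assumptions made for this sketch.</p>

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius; an assumed constant for this sketch


def haversine_km(point1, point2):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    lat1, lon1 = map(math.radians, point1)
    lat2, lon2 = map(math.radians, point2)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    # Haversine of the central angle between the two points
    a = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    # Central angle in radians, scaled by the radius to get km
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


query = (-37.814563, 144.970267)            # the query point used above
north_melbourne = (-37.807419, 144.942570)  # station at index 3 in dfstat
distance_km = haversine_km(query, north_melbourne)
print(round(distance_km, 2))
```

<p>Dividing the result by the Earth's radius gives the central angle in radians, which is the unit sklearn's built-in <code>'haversine'</code> metric works in.</p>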
<p>Now you can apply this to all points in the data frame and use the returned indices to look up the nearest stations:</p>
<pre class="lang-py prettyprint-override"><code># Query once: kneighbors returns (distances, indices), each of shape (n_samples, 1)
distances, indices = NN.kneighbors(dfsub[['lat', 'lon']].astype(float).to_numpy())
# Look up station names by position and flatten the single-neighbor column
dfsub['closest_station'] = dfstat['STOP_NAME'].to_numpy()[indices[:, 0]]
dfsub['closest_station_distances'] = distances[:, 0]
print(dfsub)

     id  postcode              suburb  state         lat         lon                                   closest_station  closest_station_distances
0  4901      3000           MELBOURNE    VIC  -37.814563  144.970267  North Melbourne Railway Station (West Melbourne)                   2.559526
1  4902      3002      EAST MELBOURNE    VIC  -37.816640  144.987811       Clifton Hill Railway Station (Clifton Hill)                   3.182521
2  4903      3003      WEST MELBOURNE    VIC  -37.806255  144.941123  North Melbourne Railway Station (West Melbourne)                   0.181419
3  4904      3005  WORLD TRADE CENTRE    VIC  -37.822262  144.954856  North Melbourne Railway Station (West Melbourne)                   1.972010
4  4905      3006           SOUTHBANK    VIC  -37.823258  144.965926  North Melbourne Railway Station (West Melbourne)                   2.703926
</code></pre>
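<p>On a data set this small, the result can be cross-checked by brute force. The sketch below assumes only NumPy: it builds the full suburb-by-station distance matrix with a hypothetical helper, <code>haversine_matrix_km</code>, and takes the row-wise minimum. The coordinates are copied from the two data frames as floats, and the Earth radius constant is an assumption of this sketch.</p>

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0088  # assumed mean Earth radius


def haversine_matrix_km(points_a, points_b):
    """Pairwise great-circle distances in km.

    Both inputs are (n, 2) sequences of [lat, lon] in degrees;
    the result has shape (len(points_a), len(points_b)).
    """
    a = np.radians(np.asarray(points_a, dtype=float))[:, None, :]  # (n, 1, 2)
    b = np.radians(np.asarray(points_b, dtype=float))[None, :, :]  # (1, m, 2)
    dlat = b[..., 0] - a[..., 0]
    dlon = b[..., 1] - a[..., 1]
    h = np.sin(dlat / 2) ** 2 + np.cos(a[..., 0]) * np.cos(b[..., 0]) * np.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(h))


stations = [[-37.781193, 144.952301], [-37.788140, 144.939323],
            [-37.794267, 144.936166], [-37.807419, 144.942570],
            [-37.788657, 144.995417]]
suburbs = [[-37.814563, 144.970267], [-37.816640, 144.987811],
           [-37.806255, 144.941123], [-37.822262, 144.954856],
           [-37.823258, 144.965926]]

dist = haversine_matrix_km(suburbs, stations)  # one row per suburb, one column per station
nearest = dist.argmin(axis=1)                  # index of the closest station per suburb
nearest_km = dist.min(axis=1)                  # distance to that station in km
print(nearest, nearest_km.round(3))
```

<p>The matrix grows with the product of the two table sizes, so this is only practical for modest inputs; the tree-based <code>NearestNeighbors</code> search is the better fit for larger ones.</p>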