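<p>Here is a solution that uses <code>numpy</code> to compute the Euclidean distance from each data item to every <code>x,y</code> point, and then joins the item with the data from the <code>x,y</code> tuple at the smallest distance.</p>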
<pre><code>import numpy
import operator

# read the data into numpy structured arrays
testdata = numpy.genfromtxt('TestData.csv', delimiter=';', names=True)
nyDK = numpy.genfromtxt('nyDK_OVER_50M.csv', names=True, delimiter='\t',
                        dtype=[('species', '|S64'),
                               ('decimalLongitude', 'float32'),
                               ('decimalLatitude', 'float32')])

# extract the x,y tuples into a numpy array of [(lat, lon), ...]
# (this treats x as latitude and y as longitude, to match getlatlon below)
getxy = operator.itemgetter('x', 'y')
xy = numpy.array([getxy(row) for row in testdata])

# this is a function which returns a function which computes the distance
# from an arbitrary point to an origin
distance = lambda origin: lambda point: numpy.linalg.norm(point - origin)

# methods to extract the (lat, lon) from a nyDK entry
latlon = operator.itemgetter('decimalLatitude', 'decimalLongitude')
getlatlon = lambda item: numpy.array(latlon(item))

# this will transform a single element of the nyDK array into
# a union of it with its closest climate data
def transform(item):
    # compute distance from each x,y point to this item's location
    # and find the position of the minimum
    to_item = distance(getlatlon(item))
    idx = numpy.argmin([to_item(point) for point in xy])
    # return the union of the item and the closest climate data
    return tuple(list(item) + list(testdata[idx]))

# transform all the entries in the input data set
# (list comprehensions and print() behave the same on Python 2 and 3)
result = [transform(item) for item in nyDK]
print(result[0:3])
</code></pre>
<p>Output:</p>
<pre><code>[('Rubus idaeus', 10.0, 56.0, 15.0, 51.0, 14.0),
('Neckera crispa', 9.8785, 56.803001, 15.300000000000001, 51.299999999999997, 2.0),
('Dicranum polysetum', 9.1919003, 56.045601, 14.6, 50.600000000000001, 10.0)]
</code></pre>
<p>Note: the matched points are not very close, but this is probably because the <code>.csv</code> file does not contain a complete grid of <code>x,y</code> points.</p>
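<p>To check how far away each match really is, one option is to compute all the distances at once with numpy broadcasting and return the minimum distance together with the index. The following is only a minimal sketch: <code>nearest_indices</code> is a hypothetical helper name, and the coordinates are made-up toy values, not data from the <code>.csv</code> files.</p>
<pre><code>import numpy

def nearest_indices(points, grid):
    """For each row of points (n, 2), find the closest row of grid (m, 2).

    Returns the index of the closest grid row and the distance to it.
    """
    points = numpy.asarray(points, dtype=float)
    grid = numpy.asarray(grid, dtype=float)
    # diff[i, j] is the coordinate difference between point i and grid row j
    diff = points[:, numpy.newaxis, :] - grid[numpy.newaxis, :, :]
    # Euclidean distance for every (point, grid row) pair, shape (n, m)
    dist = numpy.sqrt((diff ** 2).sum(axis=2))
    idx = dist.argmin(axis=1)
    return idx, dist[numpy.arange(len(points)), idx]

# hypothetical toy coordinates, for illustration only
grid = numpy.array([[51.0, 15.0], [51.3, 15.3], [50.6, 14.6]])
points = numpy.array([[56.0, 10.0], [50.5, 14.5]])
idx, dmin = nearest_indices(points, grid)
print(idx)   # index of the nearest grid row for each point
print(dmin)  # distance to that grid row
</code></pre>
<p>A large value in <code>dmin</code> would show that no grid point lies near that item. This approach builds an n-by-m distance array in memory, so for very large inputs it may need to be applied in chunks.</p>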