<p>You can use <a href="http://pandas.pydata.org/pandas-docs/stable/generated/pandas.concat.html" rel="nofollow"><code>concat</code></a> together with <a href="http://pandas.pydata.org/pandas-docs/stable/generated/pandas.Series.map.html" rel="nofollow"><code>map</code></a>:</p>
<pre><code>print(pd.concat([data2.x, data2.y,
                 data2.Cluster,
                 data2.Cluster.map(centers2.x.to_dict()),
                 data2.Cluster.map(centers2.y.to_dict())],
                axis=1,
                keys=['x','y','Cluster','Centers.x','Centers.y']))

          x         y Cluster  Centers.x  Centers.y
0 -0.247322 -0.699005       A          6          5
1 -0.026692  0.551841       B          1          4
2 -1.730480 -0.170510       A          6          5
3  0.814357 -0.204729       B          1          4
4  2.387925 -0.503993       C          1          0
</code></pre>
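<p>For readers who want to run the <code>concat</code> + <code>map</code> approach end to end, here is a minimal sketch. The contents of <code>data2</code> and <code>centers2</code> are hypothetical stand-ins (rounded values modelled on the output above); the original frames are not shown in this answer.</p>

```python
import pandas as pd

# Hypothetical stand-ins for the data2 and centers2 frames used above.
data2 = pd.DataFrame({
    "x": [-0.25, -0.03, -1.73, 0.81, 2.39],
    "y": [-0.70, 0.55, -0.17, -0.20, -0.50],
    "Cluster": ["A", "B", "A", "B", "C"],
})
centers2 = pd.DataFrame({"x": [6, 1, 1], "y": [5, 4, 0]},
                        index=["A", "B", "C"])

# Series.map looks up every Cluster label in a {label: coordinate} dict.
centers_x = data2["Cluster"].map(centers2["x"].to_dict())
centers_y = data2["Cluster"].map(centers2["y"].to_dict())

# axis=1 places the Series side by side; keys become the column names.
result = pd.concat(
    [data2["x"], data2["y"], data2["Cluster"], centers_x, centers_y],
    axis=1,
    keys=["x", "y", "Cluster", "Centers.x", "Centers.y"],
)
print(result)
```

<p>The dict lookup only works cleanly when every label in <code>Cluster</code> exists in the index of <code>centers2</code>; labels without a match come back as <code>NaN</code>.</p>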
<p>Solution with <a href="http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.join.html" rel="nofollow"><code>join</code></a> (see the <a href="http://pandas.pydata.org/pandas-docs/stable/merging.html#joining-key-columns-on-an-index" rel="nofollow">docs</a>):</p>
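<pre><code>print(data2.join(centers2,
                 on='Cluster',
                 rsuffix='_centers',
                 sort=False,
                 how='left'))
</code></pre>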
<p>Another solution uses <a href="http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.merge.html" rel="nofollow"><code>merge</code></a>; it works the same as <code>join</code>, but needs 2 extra parameters:</p>
<pre><code>print(data2.merge(centers2,
                  left_on='Cluster',
                  right_index=True,
                  suffixes=['', '_centers'],
                  sort=False,
                  how='left'))
</code></pre>
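<p>To check that the <code>merge</code> call gives the same frame as a <code>join</code> on the <code>Cluster</code> column, the following sketch runs both on small hypothetical frames (the values of <code>data2</code> and <code>centers2</code> here are assumptions for illustration, not the original data) and compares the results.</p>

```python
import pandas as pd

# Hypothetical stand-ins for data2 and centers2.
data2 = pd.DataFrame({
    "x": [-0.25, -0.03, -1.73, 0.81, 2.39],
    "y": [-0.70, 0.55, -0.17, -0.20, -0.50],
    "Cluster": ["A", "B", "A", "B", "C"],
})
centers2 = pd.DataFrame({"x": [6, 1, 1], "y": [5, 4, 0]},
                        index=["A", "B", "C"])

# join: match the Cluster column of data2 against the index of centers2.
joined = data2.join(centers2, on="Cluster", rsuffix="_centers",
                    sort=False, how="left")

# merge: the same lookup, with the key column and the right index spelled out.
merged = data2.merge(centers2, left_on="Cluster", right_index=True,
                     suffixes=("", "_centers"), sort=False, how="left")

print(merged)
print(joined.equals(merged))
```

<p>Both calls keep the row order and index of <code>data2</code> because of <code>how='left'</code>; the overlapping column names from <code>centers2</code> receive the <code>_centers</code> suffix.</p>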
<p><strong>Timings</strong>:</p>
<p><code>len(df)=5k</code>:</p>
<pre><code>data2 = pd.concat([data2]*1000).reset_index(drop=True)
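
# Row-wise baseline. Note: DataFrame.get_value was removed in pandas 1.0;
# on current versions use centers2.at[row['Cluster'], 'x'] instead.
def root(data2, centers2):
    data2['Centers.x'] = data2.apply(lambda row: centers2.get_value(row['Cluster'], 'x'), axis=1)
    data2['Centers.y'] = data2.apply(lambda row: centers2.get_value(row['Cluster'], 'y'), axis=1)
    return data2
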
In [117]: %timeit root(data2, centers2)
1 loops, best of 3: 267 ms per loop
In [118]: %timeit data2.merge(centers2, left_on='Cluster', right_index=True, suffixes=['', '_centers'], sort=False, how='left')
1000 loops, best of 3: 1.71 ms per loop
In [119]: %timeit data2.join(centers2, on='Cluster', rsuffix ='_centers', sort=False, how='left')
1000 loops, best of 3: 1.71 ms per loop
In [120]: %timeit pd.concat([data2.x, data2.y, data2.Cluster, data2.Cluster.map(centers2.x.to_dict()), data2.Cluster.map(centers2.y.to_dict())], axis=1, keys=['x','y','Cluster','Centers.x','Centers.y'])
100 loops, best of 3: 2.15 ms per loop
In [121]: %timeit data2.merge(centers2, left_on='Cluster', right_index=True, suffixes=['', '_centers']).sort_index()
100 loops, best of 3: 2.68 ms per loop
</code></pre>
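<p>The comparison can also be reproduced outside IPython with the standard <code>timeit</code> module. This is a rough sketch under stated assumptions: the frame contents are hypothetical, the helper names (<code>row_wise</code>, <code>with_merge</code>, <code>with_join</code>, <code>with_map</code>) are made up for illustration, and <code>.at</code> stands in for <code>get_value</code>, which no longer exists in current pandas. Absolute numbers depend on the machine and the pandas version, so none are shown.</p>

```python
import timeit

import pandas as pd

# Hypothetical stand-ins for the original frames.
small = pd.DataFrame({
    "x": [-0.25, -0.03, -1.73, 0.81, 2.39],
    "y": [-0.70, 0.55, -0.17, -0.20, -0.50],
    "Cluster": ["A", "B", "A", "B", "C"],
})
centers2 = pd.DataFrame({"x": [6, 1, 1], "y": [5, 4, 0]},
                        index=["A", "B", "C"])
data2 = pd.concat([small] * 1000).reset_index(drop=True)  # 5,000 rows


def row_wise(df, centers):
    # Baseline: one scalar lookup per row (.at replaces the removed get_value).
    out = df.copy()
    out["Centers.x"] = out.apply(lambda row: centers.at[row["Cluster"], "x"], axis=1)
    out["Centers.y"] = out.apply(lambda row: centers.at[row["Cluster"], "y"], axis=1)
    return out


def with_merge(df, centers):
    return df.merge(centers, left_on="Cluster", right_index=True,
                    suffixes=("", "_centers"), sort=False, how="left")


def with_join(df, centers):
    return df.join(centers, on="Cluster", rsuffix="_centers",
                   sort=False, how="left")


def with_map(df, centers):
    return pd.concat(
        [df["x"], df["y"], df["Cluster"],
         df["Cluster"].map(centers["x"].to_dict()),
         df["Cluster"].map(centers["y"].to_dict())],
        axis=1,
        keys=["x", "y", "Cluster", "Centers.x", "Centers.y"],
    )


repeats = 5
for func in (row_wise, with_merge, with_join, with_map):
    total = timeit.timeit(lambda: func(data2, centers2), number=repeats)
    print(f"{func.__name__}: {total / repeats * 1000:.2f} ms per call")
```

<p>Unlike <code>root</code> above, <code>row_wise</code> works on a copy so that repeated timing runs do not mutate <code>data2</code>.</p>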