Answers to: Mapping a DataFrame instead of a Series in pandas



1 answer

Anonymous, 1 day ago

You can use <a href="http://pandas.pydata.org/pandas-docs/stable/generated/pandas.concat.html" rel="nofollow"><code>concat</code></a> together with <a href="http://pandas.pydata.org/pandas-docs/stable/generated/pandas.Series.map.html" rel="nofollow"><code>map</code></a>:
<pre><code>print(pd.concat([data2.x,
                 data2.y,
                 data2.Cluster,
                 data2.Cluster.map(centers2.x.to_dict()),
                 data2.Cluster.map(centers2.y.to_dict())],
                axis=1,
                keys=['x','y','Cluster','Centers.x','Centers.y']))

          x         y Cluster  Centers.x  Centers.y
0 -0.247322 -0.699005       A          6          5
1 -0.026692  0.551841       B          1          4
2 -1.730480 -0.170510       A          6          5
3  0.814357 -0.204729       B          1          4
4  2.387925 -0.503993       C          1          0
</code></pre>
A solution with <a href="http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.join.html" rel="nofollow"><code>join</code></a> (<a href="http://pandas.pydata.org/pandas-docs/stable/merging.html#joining-key-columns-on-an-index" rel="nofollow">docs</a>), in the form timed below: <code>data2.join(centers2, on='Cluster', rsuffix='_centers', sort=False, how='left')</code>.

Another solution, with <a href="http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.merge.html" rel="nofollow"><code>merge</code></a>, is the same as the <code>join</code> one but takes more parameters:
<pre><code>print(data2.merge(centers2,
                  left_on='Cluster',
                  right_index=True,
                  suffixes=['', '_centers'],
                  sort=False,
                  how='left'))
</code></pre>
Timings, <code>len(df)=5k</code>:
<pre><code>data2 = pd.concat([data2]*1000).reset_index(drop=True)

# Row-wise baseline: one scalar lookup in centers2 per row.
# DataFrame.get_value was removed in pandas 1.0;
# centers2.at[row['Cluster'], 'x'] is the current equivalent.
def root(data2, centers2):
    data2['Centers.x'] = data2.apply(lambda row: centers2.get_value(row['Cluster'], 'x'), axis=1)
    data2['Centers.y'] = data2.apply(lambda row: centers2.get_value(row['Cluster'], 'y'), axis=1)
    return data2

In [117]: %timeit root(data2, centers2)
1 loops, best of 3: 267 ms per loop

In [118]: %timeit data2.merge(centers2, left_on='Cluster', right_index=True, suffixes=['', '_centers'], sort=False, how='left')
1000 loops, best of 3: 1.71 ms per loop

In [119]: %timeit data2.join(centers2, on='Cluster', rsuffix ='_centers', sort=False, how='left')
1000 loops, best of 3: 1.71 ms per loop

In [120]: %timeit pd.concat([data2.x, data2.y, data2.Cluster, data2.Cluster.map(centers2.x.to_dict()), data2.Cluster.map(centers2.y.to_dict())], axis=1, keys=['x','y','Cluster','Centers.x','Centers.y'])
100 loops, best of 3: 2.15 ms per loop

In [121]: %timeit data2.merge(centers2, left_on='Cluster', right_index=True, suffixes=['', '_centers']).sort_index()
100 loops, best of 3: 2.68 ms per loop
</code></pre>
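For anyone trying this on a current pandas version, the three vectorized approaches from the answer can be sketched side by side as follows. The two frames are hypothetical reconstructions: the values are copied from the printed output in the answer, and the assumption that <code>centers2</code> is indexed by cluster label is inferred from the calls rather than stated in the question. Treat it as a minimal illustration, not the original poster's data.

```python
import pandas as pd

# Hypothetical stand-ins for the question's frames (values taken from the
# output shown in the answer; centers2 is ASSUMED to be indexed by cluster label).
data2 = pd.DataFrame({
    "x": [-0.247322, -0.026692, -1.730480, 0.814357, 2.387925],
    "y": [-0.699005, 0.551841, -0.170510, -0.204729, -0.503993],
    "Cluster": ["A", "B", "A", "B", "C"],
})
centers2 = pd.DataFrame({"x": [6, 1, 1], "y": [5, 4, 0]}, index=["A", "B", "C"])

# 1) Series.map once per centre column, assembled with concat.
mapped = pd.concat(
    [
        data2["x"],
        data2["y"],
        data2["Cluster"],
        data2["Cluster"].map(centers2["x"].to_dict()),
        data2["Cluster"].map(centers2["y"].to_dict()),
    ],
    axis=1,
    keys=["x", "y", "Cluster", "Centers.x", "Centers.y"],
)

# 2) join: match the 'Cluster' column of data2 against the index of centers2.
joined = data2.join(centers2, on="Cluster", rsuffix="_centers")

# 3) merge: the same lookup, with the key column and the index spelled out.
merged = data2.merge(
    centers2,
    left_on="Cluster",
    right_index=True,
    how="left",
    suffixes=("", "_centers"),
)

print(mapped)
print(joined)
print(merged)
```

All three look up a whole table of centres in one vectorized step, which is why they beat the row-wise <code>apply</code> version in the timings; <code>join</code> and <code>merge</code> also scale to any number of centre columns without one <code>map</code> call per column.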
