Melting tuples: melting pandas columns that contain tuples

Published 2024-09-27 00:21:25


Consider a pandas DataFrame in which some columns contain tuples of equal length.

import pandas as pd

L1 = [['ID1', ('key1a', 'key1b', 'key1c'), ('value1a', 'value1b', 'value1c')],
      ['ID2', ('key2a', 'key2b', 'key2c'), ('value2a', 'value2b', 'value2c')]]
df1 = pd.DataFrame(L1, columns=['ID', 'Key', 'Value'])

>>> df1
    ID                    Key                        Value
0  ID1  (key1a, key1b, key1c)  (value1a, value1b, value1c)
1  ID2  (key2a, key2b, key2c)  (value2a, value2b, value2c)

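What is the simplest way to expand this vertically, so that each element of the Key and Value tuples gets its own row next to its ID?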

3 Answers

A faster, vectorized approach is to use np.repeat and np.concatenate:

In [2272]: pd.DataFrame({'ID': df1['ID'].values.repeat(df1['Key'].str.len()),
      ...:               'Key': np.concatenate(df1['Key']),
      ...:               'Value': np.concatenate(df1['Value'])})
Out[2272]:
    ID    Key    Value
0  ID1  key1a  value1a
1  ID1  key1b  value1b
2  ID1  key1c  value1c
3  ID2  key2a  value2a
4  ID2  key2b  value2b
5  ID2  key2c  value2c
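
The same idea can be written as a standalone script. This is only a minimal sketch: the helper name melt_tuple_columns and its parameters are hypothetical (they do not come from the answer above), and it assumes that within each row all tuple columns have the same length.

```python
import numpy as np
import pandas as pd


def melt_tuple_columns(df, id_col, tuple_cols):
    """Expand equal-length tuple columns so each tuple element gets its own row.

    Hypothetical helper: the name and signature are illustrative only.
    Assumes that, within each row, all tuple columns have the same length.
    """
    # Number of elements in each row's tuple, taken from the first tuple column.
    lengths = df[tuple_cols[0]].map(len).to_numpy()
    # Repeat every ID once per element of its tuple.
    out = {id_col: np.repeat(df[id_col].to_numpy(), lengths)}
    # Flatten each tuple column into one long 1-D array.
    for col in tuple_cols:
        out[col] = np.concatenate(df[col].tolist())
    return pd.DataFrame(out)


L1 = [['ID1', ('key1a', 'key1b', 'key1c'), ('value1a', 'value1b', 'value1c')],
      ['ID2', ('key2a', 'key2b', 'key2c'), ('value2a', 'value2b', 'value2c')]]
df1 = pd.DataFrame(L1, columns=['ID', 'Key', 'Value'])

result = melt_tuple_columns(df1, 'ID', ['Key', 'Value'])
print(result)
```

Note that np.concatenate raises an error on an empty input, so an empty DataFrame would need separate handling.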

Timings

[timing output not preserved in the source]

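A quick solution:

df1.set_index('ID').stack().apply(lambda x: pd.Series(x)).unstack(0).T.reset_index()

Or build the rows with an explicit loop:

rows = []
for _, row in df1.iterrows():
    # emit one output row per (key, value) pair in this row's tuples
    for key, val in zip(row['Key'], row['Value']):
        rows.append([row['ID'], key, val])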

>>> pd.DataFrame(rows)
     0      1        2
0  ID1  key1a  value1a
1  ID1  key1b  value1b
2  ID1  key1c  value1c
3  ID2  key2a  value2a
4  ID2  key2b  value2b
5  ID2  key2c  value2c
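
The loop result above has the default column labels 0, 1, 2. As a small sketch of one way to get named columns, the variant below passes columns= to the DataFrame constructor. It also swaps iterrows for itertuples, which is a substitution and not part of the answer above; the variable name named is hypothetical.

```python
import pandas as pd

L1 = [['ID1', ('key1a', 'key1b', 'key1c'), ('value1a', 'value1b', 'value1c')],
      ['ID2', ('key2a', 'key2b', 'key2c'), ('value2a', 'value2b', 'value2c')]]
df1 = pd.DataFrame(L1, columns=['ID', 'Key', 'Value'])

rows = []
# itertuples(index=False) yields one namedtuple per row: (ID, Key, Value).
for row_id, keys, values in df1.itertuples(index=False):
    # Pair each key with the value at the same position in the tuples.
    for key, val in zip(keys, values):
        rows.append((row_id, key, val))

# Name the output columns explicitly instead of accepting 0, 1, 2.
named = pd.DataFrame(rows, columns=['ID', 'Key', 'Value'])
print(named)
```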

Timings (10k rows)

[timing output not preserved in the source]
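
To re-measure a comparison like this locally, something along the following lines could be used. It is a sketch under stated assumptions: the 10k-row frame is synthetic (the original benchmark data is not shown), the function names vectorized and looped are hypothetical, and the numbers printed will depend on the machine and the pandas version.

```python
import timeit

import numpy as np
import pandas as pd

# Synthetic 10k-row frame (an assumption; the original benchmark data is not shown).
n_rows = 10_000
big = pd.DataFrame({
    'ID': [f'ID{i}' for i in range(n_rows)],
    'Key': [(f'key{i}a', f'key{i}b', f'key{i}c') for i in range(n_rows)],
    'Value': [(f'value{i}a', f'value{i}b', f'value{i}c') for i in range(n_rows)],
})


def vectorized(df):
    # np.repeat / np.concatenate approach.
    return pd.DataFrame({
        'ID': np.repeat(df['ID'].to_numpy(), df['Key'].map(len).to_numpy()),
        'Key': np.concatenate(df['Key'].tolist()),
        'Value': np.concatenate(df['Value'].tolist()),
    })


def looped(df):
    # Explicit row-by-row loop approach.
    rows = []
    for _, row in df.iterrows():
        for key, val in zip(row['Key'], row['Value']):
            rows.append([row['ID'], key, val])
    return pd.DataFrame(rows, columns=['ID', 'Key', 'Value'])


# Sanity check: both approaches must produce the same rows in the same order.
assert vectorized(big).values.tolist() == looped(big).values.tolist()

# Report the best of three single runs for each approach.
for fn in (vectorized, looped):
    seconds = min(timeit.repeat(lambda: fn(big), number=1, repeat=3))
    print(f'{fn.__name__}: {seconds:.4f} s')
```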
