Pandas DataFrame: how to convert one row into separate rows based on a labeled column value

       sentiment                                               text
0   USD:1,CNY:-1  US economy is improving while China is struggling
1  USD:-1, JPY:1    Unemployment is high for US while low for Japan

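3 answers

User

Answer 1 · edited 2024-09-30 08:22:53

You can also expand the sentiment column by splitting it on ',' and then using melt to turn the resulting columns into rows.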

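# assumes df1 is the frame from the question (columns 'sentiment' and 'text')
# strip spaces, split on ',' into one column per pair, and join back on the index
pairs = df1.sentiment.str.replace(' ', '').str.split(',', expand=True)
df1 = df1.merge(pairs, left_index=True, right_index=True, how='outer')
df1.drop('sentiment', axis=1, inplace=True)
# unpivot the pair columns into rows, keeping 'text' as the identifier
df1 = df1.melt('text')
# split each 'CUR:value' pair into separate currency and sentiment columns
df1[['currency', 'sentiment']] = df1.value.str.split(':', expand=True)
df1.drop(['variable', 'value'], axis=1, inplace=True)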

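Output:

                                                text currency sentiment
0  US economy is improving while China is struggling      USD         1
1    Unemployment is high for US while low for Japan      USD        -1
2  US economy is improving while China is struggling      CNY        -1
3    Unemployment is high for US while low for Japan      JPY         1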
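The melt approach above assumes every row holds the same number of pairs. As a minimal, self-contained sketch (not part of the original answer), the version below adds a hypothetical third row with a single pair and drops the padding that `expand=True` introduces. The names `pairs` and `long` and the third row are assumptions made for illustration.

```python
import pandas as pd

# Sample frame from the question, plus a hypothetical third row with one pair
df1 = pd.DataFrame({
    'sentiment': ['USD:1,CNY:-1', 'USD:-1, JPY:1', 'EUR:1'],
    'text': ['US economy is improving while China is struggling',
             'Unemployment is high for US while low for Japan',
             'Hypothetical single-currency row'],
})

# One column per 'CUR:value' pair; shorter rows are padded with missing values
pairs = (df1['sentiment']
         .str.replace(' ', '', regex=False)
         .str.split(',', expand=True))

# Unpivot to one pair per row, then drop the padding rows
long = (pd.concat([df1[['text']], pairs], axis=1)
          .melt(id_vars='text', value_name='pair')
          .dropna(subset=['pair'])
          .drop(columns='variable'))

# Split each pair into its currency and integer sentiment
long[['currency', 'sentiment']] = long.pop('pair').str.split(':', expand=True)
long['sentiment'] = long['sentiment'].astype(int)
long = long.reset_index(drop=True)
print(long)
```

Note that melt emits all first pairs before all second pairs, so the rows are grouped by pair position rather than by original row.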

User

Answer 2 · edited 2024-09-30 08:22:53

You can build a new DataFrame, chaining the split values together and repeating the text as many times as needed.

import numpy as np
import pandas as pd
from itertools import chain

df = pd.DataFrame({'sentiment': ['USD:1,CNY:-1', 'USD:-1, JPY:1'],
                   'text': ['US economy is improving while China is struggling',
                            'Unemployment is high for US while low for Japan']})

# remove whitespace and split by ','
df['sentiment'] = df['sentiment'].str.replace(' ', '').str.split(',')

# construct expanded dataframe
res = pd.DataFrame({'sentiment': list(chain.from_iterable(df['sentiment'])),
                    'text': np.repeat(df['text'], df['sentiment'].map(len))})

# split sentiment series into currency and value components
res[['currency', 'sentiment']] = res.pop('sentiment').str.split(':', expand=True)
res['sentiment'] = res['sentiment'].astype(int)

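Result:

                                                text currency  sentiment
0  US economy is improving while China is struggling      USD          1
0  US economy is improving while China is struggling      CNY         -1
1    Unemployment is high for US while low for Japan      USD         -1
1    Unemployment is high for US while low for Japan      JPY          1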
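On pandas 0.25 or later, the chain and np.repeat construction above can also be written with `DataFrame.explode`. This is a different technique from the one the answer uses, shown here only as a sketch under the assumption that a newer pandas is available.

```python
import pandas as pd

df = pd.DataFrame({'sentiment': ['USD:1,CNY:-1', 'USD:-1, JPY:1'],
                   'text': ['US economy is improving while China is struggling',
                            'Unemployment is high for US while low for Japan']})

# Turn each sentiment string into a list of 'CUR:value' pairs
df['sentiment'] = (df['sentiment']
                   .str.replace(' ', '', regex=False)
                   .str.split(','))

# explode() emits one row per list element and repeats the other columns
res = df.explode('sentiment').reset_index(drop=True)

# Split each pair into its currency and integer sentiment
res[['currency', 'sentiment']] = res.pop('sentiment').str.split(':', expand=True)
res['sentiment'] = res['sentiment'].astype(int)
print(res)
```

The `reset_index(drop=True)` call is optional. Without it, the exploded rows keep the index of the row they came from, which can be handy for joining back to the source.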

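User

Answer 3 · edited 2024-09-30 08:22:53

You can split the sentiment column on ,|: and then expand and stack the result.

Then use reindex and index.repeat to repeat the text column based on the length of the split.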

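# Assumes df is the frame from the question (columns 'sentiment' and 'text').
# Split on both ',' and ':' (ignoring surrounding spaces), then stack into one Series.
s = df['sentiment'].str.split(r'\s*[,:]\s*', expand=True).stack()

# Repeat each row once per 'CUR:value' pair, then reset the index.
df1 = df.reindex(df.index.repeat(df['sentiment'].fillna("").str.split(',').apply(len)))
df1 = df1.reset_index(drop=True)

# The stacked values alternate currency, value, currency, value, ...
df1['currency'] = s[::2].reset_index(drop=True)
df1['sentiment'] = s[1::2].reset_index(drop=True)

print(df1.sort_index(axis=1))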

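Output:

  currency sentiment                                               text
0      USD         1  US economy is improving while China is struggling
1      CNY        -1  US economy is improving while China is struggling
2      USD        -1    Unemployment is high for US while low for Japan
3      JPY         1    Unemployment is high for US while low for Japan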
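The `s[::2]` and `s[1::2]` slices work because the stacked Series alternates currency and value. The sketch below makes that visible on the question's data. The `tokens`, `currencies` and `values` names are illustrative assumptions, and the split pattern is widened to swallow the stray space before `JPY`.

```python
import pandas as pd

df = pd.DataFrame({'sentiment': ['USD:1,CNY:-1', 'USD:-1, JPY:1'],
                   'text': ['US economy is improving while China is struggling',
                            'Unemployment is high for US while low for Japan']})

# Splitting on ',' or ':' (with optional surrounding spaces) and stacking
# gives one flat Series: currency, value, currency, value, ...
s = df['sentiment'].str.split(r'\s*[,:]\s*', expand=True).stack()

tokens = s.tolist()
currencies = tokens[::2]   # even positions hold the currency codes
values = tokens[1::2]      # odd positions hold the sentiment values (as strings)

print(currencies)  # ['USD', 'CNY', 'USD', 'JPY']
print(values)      # ['1', '-1', '-1', '1']
```

The values stay strings after the split, so a final `astype(int)` is needed if numeric sentiment is required.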
