Splitting the values in a row into multiple columns

Posted 2024-10-03 13:21:02


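I have a CSV file whose first column holds the track name, artist name, and other information. I want to split the first-column values into two separate columns. Here is a sample of the CSV file: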

artist_trackname,year,month,day,hour,minute
'Sonic Species & Volcano - What Is Life\n',2020,8,5,0,25

What I want to achieve is:

artist,trackname,year,month,day,hour,minute
'Sonic Species & Volcano, What Is Life\n',2020,8,5,0,25

Can someone help me do this in Python?


2 answers

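If the fields are always separated by a hyphen, you can simply read the file in a Python interactive session and replace " - " with ",". Here is an example that assumes your CSV file is named test.csv: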

>>> with open('test.csv', 'r') as f:
...     lines = f.readlines()
... 
>>> lines
['artist_trackname,year,month,day,hour,minute\n', "'Sonic Species & Volcano - What Is Life\\n',2020,8,5,0,25\n"]
>>> write_lines = [line.replace(" - ", ",") for line in lines]
>>> with open('test.csv', 'w') as f:
...     f.writelines(write_lines)
... 
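The plain string replace above rewrites every " - " on a line and leaves the header as artist_trackname, so the header ends up with one column fewer than the data row. A minimal sketch of an alternative using the standard csv module follows; the helper name split_first_column and the in-memory sample are assumptions made for illustration, not part of the original answer.

```python
import csv
import io


def split_first_column(src, dst, sep=" - "):
    """Copy CSV rows from src to dst, splitting the first field once on sep."""
    reader = csv.reader(src)
    writer = csv.writer(dst, lineterminator="\n")
    header = next(reader)
    # Replace the combined header with two names; keep the remaining columns.
    writer.writerow(["artist", "trackname"] + header[1:])
    for row in reader:
        # partition splits on the first occurrence only; if sep is absent,
        # the track name is left empty.
        artist, _, track = row[0].partition(sep)
        writer.writerow([artist, track] + row[1:])


# In-memory stand-in for test.csv (the \n inside the quotes is literal text).
sample = (
    "artist_trackname,year,month,day,hour,minute\n"
    "'Sonic Species & Volcano - What Is Life\\n',2020,8,5,0,25\n"
)
out = io.StringIO()
split_first_column(io.StringIO(sample), out)
print(out.getvalue())
```

For real files, the same function can be called with two file objects opened with newline="", reading from one path and writing to another.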

It looks like you want an additional column, split out of artist_trackname at the " - ".

This is easy to do with pandas:

import pandas as pd

Load your CSV:

df = pd.read_csv(r"./filename.csv")

df.head()


                             artist_trackname  year  month  day  hour  minute
0  'Sonic Species & Volcano - What Is Life\n'  2020      8    5     0      25

Split the values on " - " into two columns:

df[['artist','trackname']] = df['artist_trackname'].str.split(" - ", n = 1, expand = True)
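A small sketch of why n=1 matters in the split above: it stops after the first separator, so a track title that itself contains " - " stays in one piece. The second row below is made-up data for illustration only.

```python
import pandas as pd

# Hypothetical rows: the second track title contains " - " itself.
s = pd.Series([
    "Sonic Species & Volcano - What Is Life",
    "Some Artist - Some Track - Extended Mix",
])

# n=1 stops after the first separator, so exactly two columns come back.
parts = s.str.split(" - ", n=1, expand=True)
parts.columns = ["artist", "trackname"]
print(parts)
```

Without n=1, the second row would produce a third column, and assigning the result to two column names would fail.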

Drop the old column:

df.drop(columns=["artist_trackname"], inplace=True)

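Reorder the columns into the desired layout (the result must be assigned back, since indexing returns a new DataFrame):

df = df[['artist','trackname','year','month','day','hour','minute']]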

df.head()

                     artist        trackname  year  month  day  hour  minute
0  'Sonic Species & Volcano  What Is Life\n'  2020      8    5     0      25

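Write back to CSV (index=False keeps the row index from being written as an extra first column):

df.to_csv(r"/path/to/filename.csv", index=False)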
