How do I sort this data?

Posted 2024-10-03 02:46:37


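So, I'm working on a project in which I have to sort a 34 MB text file containing a large amount of song data. Each line of the text file has a year, a unique ID, an artist, and a song. What I can't figure out is how to efficiently sort the data into other text files. I want to sort by artist name and song title. Sadly, this is all I have: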

#Opening the file to read here
with open('tracks_per_year.txt', 'r',encoding='utf8') as in_file:
    #Creating 'lists' to put information from array into
    years=[]
    uics=[]
    artists=[]
    songs=[]

    #Filling up the 'lists'
    for line in in_file:
        year,uic,artist,song=line.split("<SEP>")
        years.append(year)
        uics.append(uic)
        artists.append(artist)
        songs.append(song)
        print(year)
        print(uic)
        print(artist)
        print(song)

#Sorting:
with open('artistsort.txt', 'w',encoding='utf8') as artist:

    for x in range(1,515576):

        if artists[x]==artists[x-1]:
            artist.write (years[x])
            artist.write(" ")
            artist.write(uics[x])
            artist.write(" ")
            artist.write(artists[x])
            artist.write(" ")
            artist.write(songs[x])
            artist.write("\n")


with open('Onehitwonders.txt','w',encoding='utf8') as ohw:

    for x in range(1,515576):

        if artists[x]!= artists[x-1]:
            ohw.write (years[x])
            ohw.write(" ")
            ohw.write(uics[x])
            ohw.write(" ")
            ohw.write(artists[x])
            ohw.write(" ")
            ohw.write(songs[x])
            ohw.write("\n")

Please keep in mind that I'm a beginner, so try to explain in simple terms. If you have any other ideas, I'd love to hear them too. Thanks!


3 Answers

It's hard to beat pandas for simplicity. To read the file:

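import pandas as pd

# The file in the question has no header row, so name the columns explicitly.
# A multi-character separator such as '<SEP>' needs the Python parsing engine.
data = pd.read_csv('tracks_per_year.txt', sep='<SEP>', engine='python',
                   header=None, names=['year', 'uic', 'artist', 'song'],
                   encoding='utf8')
data
#    year    uic     artist      song
#0   1981    uic1    artist1     song1
#1   1934    uic2    artist2     song2
#2   2004    uic3    artist3     song3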

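Then, to sort by a particular column and write the result to a new file, just do the following:

# DataFrame.sort(columns=...) was removed from pandas; use sort_values instead.
# To sort by artist and then song, pass a list: data.sort_values(['artist', 'song'])
data.sort_values('year').to_csv('year_sort.txt')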

You can load the data into a dictionary-based structure, keyed first by artist and then by song:

data = {artist_name: {song_name: {'year': year, 'uid': uid}, 
                      ... }, 
        ...}

Then, when writing the output, use sorted to get them in alphabetical order:

for artist in sorted(data):
    for song in sorted(data[artist]):
        # use data[artist][song] to access details
        details = data[artist][song]
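
One possible way to build the nested dictionary described above, as a minimal sketch. It assumes every line has exactly four `<SEP>`-separated fields, as in the question. The function name `build_index` and the sample lines are made up for illustration and are not part of the original answer.

```python
from collections import defaultdict


def build_index(lines):
    """Group records as {artist: {song: {'year': ..., 'uid': ...}}}."""
    index = defaultdict(dict)
    for line in lines:
        # Strip the trailing newline so it does not end up in the song title.
        year, uid, artist, song = line.rstrip("\n").split("<SEP>")
        index[artist][song] = {"year": year, "uid": uid}
    return index


# Made-up sample lines in the question's format (not real data).
sample = [
    "1981<SEP>uic1<SEP>Beta Band<SEP>Zebra\n",
    "1934<SEP>uic2<SEP>Alpha<SEP>Moon\n",
    "2004<SEP>uic3<SEP>Beta Band<SEP>Apple\n",
]

data = build_index(sample)

# Walk the structure alphabetically: artists first, then each artist's songs.
ordered = [(artist, song)
           for artist in sorted(data)
           for song in sorted(data[artist])]
```

With the real file, the open file object could be passed in place of `sample`. Note that if the same artist and song title appear on more than one line, this layout keeps only the last one, so it fits best when that pair is unique.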

Try something like this:

from operator import attrgetter

class Song:
    def __init__(self, year, uic, artist, song):
        self.year = year
        self.uic = uic
        self.artist = artist
        self.song = song

songs = []

with open('tracks_per_year.txt', 'r', encoding='utf8') as in_file:
    for line in in_file:
        # Strip the trailing newline so the song title does not carry it
        year, uic, artist, song = line.rstrip("\n").split("<SEP>")
        songs.append(Song(year, uic, artist, song))
        print(year)
        print(uic)
        print(artist)
        print(song)

with open('artistsort.txt', 'w', encoding='utf8') as artist:
    # Sort by artist first, then by song title within each artist
    for song in sorted(songs, key=attrgetter('artist', 'song')):
        artist.write(song.year)
        artist.write(" ")
        artist.write(song.uic)
        artist.write(" ")
        artist.write(song.artist)
        artist.write(" ")
        artist.write(song.song)
        artist.write("\n")
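
The question's code also writes a second file, Onehitwonders.txt. A possible way to get both outputs from one sorted list is sketched below with plain tuples instead of a class. It assumes that a "one-hit wonder" means an artist who appears on exactly one line of the file, which the question does not state. `split_by_artist_count` is a hypothetical helper name and the records are made-up sample data.

```python
from collections import Counter


def split_by_artist_count(records):
    """Sort (year, uic, artist, song) tuples by artist, then song, and split
    them into artists with several tracks and artists with exactly one."""
    ordered = sorted(records, key=lambda r: (r[2], r[3]))
    # Count how many tracks each artist has in the whole data set.
    counts = Counter(r[2] for r in ordered)
    multi = [r for r in ordered if counts[r[2]] > 1]
    single = [r for r in ordered if counts[r[2]] == 1]
    return multi, single


# Made-up sample records (not real data).
records = [
    ("1981", "uic1", "Beta Band", "Zebra"),
    ("1934", "uic2", "Alpha", "Moon"),
    ("2004", "uic3", "Beta Band", "Apple"),
]

multi, single = split_by_artist_count(records)
```

Each list can then be written out with something like `out_file.write(" ".join(record) + "\n")`. Counting artists up front avoids comparing neighbouring lines, which only works if the input is already grouped by artist.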
