How to sort the data in a CSV file by a specific column using Python

Posted 2024-06-13 12:45:56


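I am reading data from a CSV file and trying to sort it by a specific column. For example, I read the records of 100 students from the CSV file and want to sort them by marks.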

import csv
import operator

with open('Student_Records.csv', 'r') as csvFile:
    reader = csv.reader(csvFile)
    for row in reader:
        print(row)
sortedlist = sorted(reader, key=operator.itemgetter(7), reverse=True)

for eachline in sortedlist:
    print(eachline)

csvFile.close()

The CSV file is in an Excel sheet and has no column names. Here is the CSV data:

1,Lois,Walker,F,lois.walker@hotmail.com,Donald Walker,Helen Walker,40,303-572-8492
2,Brenda,Robinson,F,brenda.robinson@gmail.com,Raymond Robinson,Judy Robinson,80,225-945-4954
3,Joe,Robinson,M,joe.robinson@gmail.com,Scott Robinson,Stephanie Robinson,70,219-904-2161
4,Diane,Evans,F,diane.evans@yahoo.com,Jason Evans,Michelle Evans,90,215-793-6791
5,Benjamin,Russell,M,benjamin.russell@charter.net,Gregory Russell,Elizabeth Russell,56,262-404-2252
6,Patrick,Bailey,M,patrick.bailey@aol.com,Ralph Bailey,Laura Bailey,36,319-812-6957
7,Nancy,Baker,F,nancy.baker@bp.com,Scott Baker,Judy Baker,78,229-336-5117

3 answers

Try pandas

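import pandas as pd

# No header row in the file, so column names are passed explicitly;
# replace 'xx' with the file's actual delimiter.
df = pd.read_csv("your_file", sep='xx',
                 names=["x", "y", "z", "marks"])

# sort_values returns a new DataFrame, so keep the result
df = df.sort_values('marks')

print(df)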

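The following should work for you. After reading the CSV I build a list of rows in which the marks are actual integers rather than the strings that come back from the CSV reader.

I also assumed the CSV has multiple spaces between fields, so I used a space delimiter. That is why the itemgetter index is 9; it may differ depending on what your CSV looks like.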

import csv
import operator

li = []

#Open csv file
with open('file.csv', 'r') as csvFile:
    reader = csv.reader(csvFile, delimiter=' ', skipinitialspace=True )

    #Create a list of all rows such that the marks column is an integer
    for item in reader:
        #Save marks value as an integer, leave other values as is
        l = [int(value) if idx == 9 else value for idx, value in enumerate(item)]
        li.append(l)

#Sort on that item
print(sorted(li, key=operator.itemgetter(9), reverse=True))

My CSV looks like:

1   Lois    Walker  F   lois.walker@hotmail.com Donald Walker   Helen Walker    40  303-572-8492
2   Brenda  Robinson    F   brenda.robinson@gmail.com   Raymond Robinson    Judy Robinson   80  225-945-4954
3   Joe Robinson    M   joe.robinson@gmail.com  Scott Robinson  Stephanie Robinson  70  219-904-2161
4   Diane   Evans   F   diane.evans@yahoo.com   Jason Evans Michelle Evans  90  215-793-6791
5   Benjamin    Russell M   benjamin.russell@charter.net    Gregory Russell Elizabeth Russell   56  262-404-2252

The output looks like:

[['4', 'Diane', 'Evans', 'F', 'diane.evans@yahoo.com', 'Jason', 'Evans', 'Michelle', 'Evans', 90, '215-793-6791'], 
['2', 'Brenda', 'Robinson', 'F', 'brenda.robinson@gmail.com', 'Raymond', 'Robinson', 'Judy', 'Robinson', 80, '225-945-4954'], 
['3', 'Joe', 'Robinson', 'M', 'joe.robinson@gmail.com', 'Scott', 'Robinson', 'Stephanie', 'Robinson', 70, '219-904-2161'], 
['5', 'Benjamin', 'Russell', 'M', 'benjamin.russell@charter.net', 'Gregory', 'Russell', 'Elizabeth', 'Russell', 56, '262-404-2252'], 
['1', 'Lois', 'Walker', 'F', 'lois.walker@hotmail.com', 'Donald', 'Walker', 'Helen', 'Walker', 40, '303-572-8492']]
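For the comma-separated layout shown in the question, the same idea might look like the sketch below. It assumes the marks are in column index 7 (the eighth field) and reads from an in-memory sample instead of the real file; `SAMPLE`, `MARKS_INDEX` and `load_rows` are illustrative names, not part of the original code.

```python
import csv
import io
import operator

# A few rows copied from the question, standing in for the real file.
SAMPLE = """\
1,Lois,Walker,F,lois.walker@hotmail.com,Donald Walker,Helen Walker,40,303-572-8492
2,Brenda,Robinson,F,brenda.robinson@gmail.com,Raymond Robinson,Judy Robinson,80,225-945-4954
3,Joe,Robinson,M,joe.robinson@gmail.com,Scott Robinson,Stephanie Robinson,70,219-904-2161
4,Diane,Evans,F,diane.evans@yahoo.com,Jason Evans,Michelle Evans,90,215-793-6791
"""

MARKS_INDEX = 7  # assumption: marks are the eighth comma-separated field


def load_rows(source):
    """Read every row and convert the marks field to an integer."""
    rows = []
    for row in csv.reader(source):
        row[MARKS_INDEX] = int(row[MARKS_INDEX])
        rows.append(row)
    return rows


rows = load_rows(io.StringIO(SAMPLE))

# Sort on the integer marks column, highest first.
ranked = sorted(rows, key=operator.itemgetter(MARKS_INDEX), reverse=True)

for row in ranked:
    print(row[1], row[MARKS_INDEX])
```

With a real file, `io.StringIO(SAMPLE)` would be replaced by a file object opened with `newline=''`.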

You can try

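import csv
with open('input.csv', newline='') as csvfile:
    rdr = csv.reader(csvfile)
    # Fields are read as strings; convert so the marks sort numerically.
    # Index 6 is the marks column in this file; adjust it for your layout.
    l = sorted(rdr, key=lambda x: int(x[6]), reverse=True)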

csv.reader() creates a reader object, which is passed to sorted() with reverse=True to get a list sorted in descending order.

This list can then be used to write the output CSV, for example:

with open('output.csv', 'w', newline='') as csvout:
    wrtr = csv.writer(csvout)
    wrtr.writerows(l)

The output CSV file looks like:

4,Diane   Evans,F,diane.evans@yahoo.com,Jason Evans,Michelle Evans,90,215-793-6791
2,Brenda  Robinson,F,brenda.robinson@gmail.com,Raymond Robinson,Judy Robinson,80,225-945-4954
3,Joe Robinson,M,joe.robinson@gmail.com,Scott Robinson,Stephanie Robinson,70,219-904-2161
5,Benjamin    Russell,M,benjamin.russell@charter.net,Gregory Russell,Elizabeth Russell,56,262-404-2252
1,Lois  Walker,F,lois.walker@hotmail.com,Donald Walker,Helen Walker,40,303-572-8492
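Putting the read, sort and write steps together, a minimal sketch could look like this. It is written against the question's nine-column layout, so the marks index is assumed to be 7, and it converts the field to `int` in the sort key because `csv.reader` returns every field as a string. The `sort_csv` helper and the temporary file names are hypothetical and only there to keep the example self-contained.

```python
import csv
import os
import tempfile

MARKS_INDEX = 7  # assumption: marks column in the question's layout


def sort_csv(src_path, dst_path, column, reverse=True):
    """Write the rows of src_path to dst_path, ordered by an integer column."""
    with open(src_path, newline="") as src:
        rows = sorted(csv.reader(src),
                      key=lambda r: int(r[column]),
                      reverse=reverse)
    with open(dst_path, "w", newline="") as dst:
        csv.writer(dst).writerows(rows)
    return rows


# Demo on a throwaway file built from rows in the question.
sample_rows = [
    ["1", "Lois", "Walker", "F", "lois.walker@hotmail.com",
     "Donald Walker", "Helen Walker", "40", "303-572-8492"],
    ["2", "Brenda", "Robinson", "F", "brenda.robinson@gmail.com",
     "Raymond Robinson", "Judy Robinson", "80", "225-945-4954"],
    ["3", "Joe", "Robinson", "M", "joe.robinson@gmail.com",
     "Scott Robinson", "Stephanie Robinson", "70", "219-904-2161"],
    ["4", "Diane", "Evans", "F", "diane.evans@yahoo.com",
     "Jason Evans", "Michelle Evans", "90", "215-793-6791"],
]

with tempfile.TemporaryDirectory() as tmp:
    src_path = os.path.join(tmp, "input.csv")
    dst_path = os.path.join(tmp, "output.csv")
    with open(src_path, "w", newline="") as f:
        csv.writer(f).writerows(sample_rows)

    sort_csv(src_path, dst_path, MARKS_INDEX)

    # Read the result back to check what was written.
    with open(dst_path, newline="") as f:
        written = list(csv.reader(f))

for row in written:
    print(row[0], row[1], row[MARKS_INDEX])
```

The key converts only for comparison, so the rows written to the output file still hold the original string values.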

Because you are reading the data from a file object, specify the newline argument as '' to be on the safe side.

As the docs say:

If csvfile is a file object, it should be opened with newline=''.

docs

If newline='' is not specified, newlines embedded inside quoted fields will not be interpreted correctly, and on platforms that use \r\n linendings on write an extra \r will be added. It should always be safe to specify newline='', since the csv module does its own (universal) newline handling.
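The effect described in that note can be checked with a small round trip. The sketch below writes one row whose second field contains an embedded line break and reads it back, opening the file with `newline=''` both times; the file name and the row contents are made up for illustration.

```python
import csv
import os
import tempfile

# One row with a line break inside a field; the writer will quote that field.
row = ["1", "line one\nline two", "40"]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "notes.csv")

    # newline='' lets the csv module control line endings on write.
    with open(path, "w", newline="") as f:
        csv.writer(f).writerow(row)

    # newline='' on read keeps the embedded line break inside the quoted field.
    with open(path, newline="") as f:
        round_trip = next(csv.reader(f))

print(round_trip == row)
```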
