How to filter records that match a particular key in a large dataset while iterating over all of its elements

def fillStudentList():
    # TODO: Add some code here to fill
    # a student list
    pass

students = fillStudentList()
sameLastNames = list()
for student1 in students:
    students2 = fillStudentList()
    for student2 in students2:
        if student1.lastName == student2.lastName:
            sameLastNames.append((student1, student2))

2 Answers

User

#1 · Edited 2024-06-28 11:13:19

Maybe something like this would work for you (it is O(n)):

from collections import defaultdict
students = fillStudentList()
sameLastNames = defaultdict(list)
for student in students:
    sameLastNames[student.lastName].append(student)

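# Keep only last names shared by more than one student.
# .items() works on Python 3 (and Python 2); iteritems() is Python 2 only.
sameLastNames = {k: v for k, v in sameLastNames.items() if len(v) > 1}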
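
Since fillStudentList is left unimplemented in the question, here is a minimal self-contained sketch of the same single-pass grouping idea. The Student record, the helper name group_shared_last_names, and the sample data are hypothetical stand-ins, not part of the original code.

```python
from collections import defaultdict, namedtuple

# Hypothetical stand-in for the question's student records.
Student = namedtuple("Student", ["firstName", "lastName"])

def group_shared_last_names(students):
    """Return {lastName: [students]} for last names shared by 2+ students."""
    groups = defaultdict(list)
    for student in students:  # one pass over the data: O(n)
        groups[student.lastName].append(student)
    # Drop last names that occur only once.
    return {name: members for name, members in groups.items() if len(members) > 1}

# Made-up sample data for illustration only.
sample = [
    Student("Ann", "Lee"),
    Student("Bob", "Kim"),
    Student("Cara", "Lee"),
]
shared = group_shared_last_names(sample)
print(shared)
```

Unlike the nested loop, this returns one list per shared last name instead of every pair of matching students, so the pairs would have to be derived from each list if they are really needed.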

User

#2 · Edited 2024-06-28 11:13:19

The nested loop is O(n**2). You can instead use sort and itertools.groupby to get O(n log n) performance:

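import operator
from itertools import groupby
students = fill_student_list()
by_lastname = operator.attrgetter('lastname')  # same key for sorting and grouping
same_last_names = [list(group) for lastname, group in
                   groupby(sorted(students, key=by_lastname), key=by_lastname)]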
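
A runnable sketch of the sort-then-group approach, for anyone who wants to try it. The Student record, the helper name groups_by_last_name, and the sample data are hypothetical; the only real requirement is that groupby receives the same key the list was sorted by, because it only merges adjacent items.

```python
from collections import namedtuple
from itertools import groupby
from operator import attrgetter

# Hypothetical stand-in for the student records.
Student = namedtuple("Student", ["first_name", "lastname"])

def groups_by_last_name(students):
    """Return a list of groups, one per distinct last name."""
    key = attrgetter("lastname")
    # Sorting is the O(n log n) step; the grouping pass itself is linear.
    ordered = sorted(students, key=key)
    return [list(group) for _, group in groupby(ordered, key=key)]

# Made-up sample data for illustration only.
sample = [
    Student("Ann", "Lee"),
    Student("Bob", "Kim"),
    Student("Cara", "Lee"),
]
groups = groups_by_last_name(sample)
# Keep only the last names that more than one student shares.
duplicates = [group for group in groups if len(group) > 1]
print(duplicates)
```

Note that the grouping alone returns every last name, including those that appear once, so the final filter is needed to match what the question asks for.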

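In general, it looks like you are trying to do what a database-backed ORM does. Rather than rolling your own, use one of the many ORMs that already exist. For a list, see What are some good Python ORM solutions?. They will be more optimized and more robust than code you write yourself.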
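
To illustrate the point about letting a database do the grouping, here is a small sketch using the standard-library sqlite3 module directly rather than any particular ORM. The table name, column names, and rows are assumptions made up for the example; an ORM would generate a similar query for you.

```python
import sqlite3

# In-memory database with a hypothetical students table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (first_name TEXT, last_name TEXT)")
conn.executemany(
    "INSERT INTO students VALUES (?, ?)",
    [("Ann", "Lee"), ("Bob", "Kim"), ("Cara", "Lee")],  # made-up sample rows
)

# Let the database find last names shared by more than one student.
rows = conn.execute(
    "SELECT last_name, COUNT(*) FROM students "
    "GROUP BY last_name HAVING COUNT(*) > 1"
).fetchall()
conn.close()
print(rows)
```

With an index on last_name, a real database can answer this kind of query without loading every record into Python, which is the main reason to prefer it for large datasets.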
