The best answers to the question “How to sort a list of objects based on an attribute of the objects?” in the category Dev.
I’ve got a list of Python objects that I’d like to sort by an attribute of the objects themselves. The list looks like:
>>> ut
[<Tag: 128>, <Tag: 2008>, <Tag: <>, <Tag: actionscript>, <Tag: addresses>, <Tag: aes>, <Tag: ajax> ...]
Each object has a count:
>>> ut[1].count
1L
I need to sort the list by number of counts descending.
I’ve seen several methods for this, but I’m looking for best practice in Python.
The approach that is usually fastest, especially if your list has a lot of records, is to use operator.attrgetter("count"). However, this might run on a version of Python that predates the operator module, so it would be nice to have a fallback mechanism. You might want to do the following, then:
try:
    import operator
except ImportError:
    keyfun = lambda x: x.count  # use a lambda if no operator module
else:
    keyfun = operator.attrgetter("count")  # use operator since it's faster than lambda

ut.sort(key=keyfun, reverse=True)  # sort in-place
# To sort the list in place...
ut.sort(key=lambda x: x.count, reverse=True)

# To return a new list, use the sorted() built-in function...
newlist = sorted(ut, key=lambda x: x.count, reverse=True)
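For anyone who wants to try the two key functions side by side, here is a minimal sketch. The Tag class and its values are hypothetical stand-ins for the objects in the question; only the count attribute is taken from the original post.

```python
from operator import attrgetter


class Tag:
    """Hypothetical stand-in for the Tag objects in the question."""

    def __init__(self, name, count):
        self.name = name
        self.count = count

    def __repr__(self):
        return "<Tag: %s>" % self.name


ut = [Tag("ajax", 3), Tag("aes", 1), Tag("actionscript", 7)]

# Both calls build a new list ordered by count, highest first.
# The original list ut is left untouched.
by_lambda = sorted(ut, key=lambda t: t.count, reverse=True)
by_getter = sorted(ut, key=attrgetter("count"), reverse=True)

print(by_lambda)
print(by_getter)
```

Both key functions should give the same ordering; which one to use is mostly a matter of taste unless the list is large enough for the speed difference to matter.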
More on sorting by keys.
It’s good practice to make object sorting logic, if applicable, a property of the class rather than repeating it at every place where the ordering is required.
This ensures consistency and removes the need for boilerplate code.
At a minimum, you should specify __eq__ and __lt__ operations for this to work. Then just use sorted(list_of_objects).
class Card(object):

    def __init__(self, rank, suit):
        self.rank = rank
        self.suit = suit

    def __eq__(self, other):
        return self.rank == other.rank and self.suit == other.suit

    def __lt__(self, other):
        return self.rank < other.rank

hand = [Card(10, 'H'), Card(2, 'h'), Card(12, 'h'), Card(13, 'h'), Card(14, 'h')]
hand_order = [c.rank for c in hand]  # [10, 2, 12, 13, 14]

hand_sorted = sorted(hand)
hand_sorted_order = [c.rank for c in hand_sorted]  # [2, 10, 12, 13, 14]
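If you also want the other comparison operators (<=, >, >=) without writing each one by hand, one option is the functools.total_ordering class decorator from the standard library. This is a sketch of that variation, not part of the answer above; it reuses the same Card idea with a shorter, made-up hand.

```python
from functools import total_ordering


@total_ordering
class Card:
    def __init__(self, rank, suit):
        self.rank = rank
        self.suit = suit

    def __eq__(self, other):
        return self.rank == other.rank and self.suit == other.suit

    def __lt__(self, other):
        # total_ordering derives <=, > and >= from __lt__ and __eq__.
        return self.rank < other.rank


hand = [Card(10, "h"), Card(2, "h"), Card(12, "h")]

ranks_sorted = [c.rank for c in sorted(hand)]
ranks_descending = [c.rank for c in sorted(hand, reverse=True)]
highest = max(hand)
```

The trade-off is the same as before: the ordering lives in one place, but every comparison goes through a Python-level method call.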
Readers should notice that the key= method:
ut.sort(key=lambda x: x.count, reverse=True)
is many times faster than adding rich comparison operators to the objects. I was surprised to read this (page 485 of “Python in a Nutshell”). You can confirm this by running tests on this little program:
#!/usr/bin/env python
import random

class C:
    def __init__(self, count):
        self.count = count

    def __cmp__(self, other):
        return cmp(self.count, other.count)

longList = [C(random.random()) for i in xrange(1000000)]  # about 6.1 secs
longList2 = longList[:]

longList.sort()  # about 52 - 6.1 = 46 secs
longList2.sort(key=lambda c: c.count)  # about 9 - 6.1 = 3 secs
My very minimal tests show the first sort is more than 10 times slower, but the book says it is only about 5 times slower in general. The reason they give is the highly optimized sort algorithm used in Python (timsort).
Still, it’s very odd that .sort(lambda) is faster than plain old .sort(). I hope they fix that.
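The test program above is Python 2 only (xrange, __cmp__ and cmp() are gone in Python 3). For anyone who wants to repeat the comparison on Python 3, here is a rough sketch using timeit. It assumes __lt__ as the replacement for __cmp__ and uses a smaller list than the original, so the numbers will not match the ones quoted above.

```python
import random
import timeit


class C:
    def __init__(self, count):
        self.count = count

    def __lt__(self, other):
        # list.sort() and sorted() only need "<" to order the objects.
        return self.count < other.count


random.seed(0)
data = [C(random.random()) for _ in range(50000)]

# sorted() returns a new list each time, so every run starts from
# the same unsorted input.
t_cmp = timeit.timeit(lambda: sorted(data), number=3)
t_key = timeit.timeit(lambda: sorted(data, key=lambda c: c.count), number=3)

print("__lt__ on the class: %.3f s" % t_cmp)
print("key= function:       %.3f s" % t_key)
```

Timings depend on the machine and the Python version, so run it yourself rather than trusting any one set of figures.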