
I use namedtuple classes a lot. Today I have been wondering whether there is a nice way to implement custom sorting for such a class, i.e. to make the default sort key something other than the first element (then the second, third, etc.) of the namedtuple.

My first instinct was to implement __lt__ and __eq__ and let total_ordering do the rest (it fills in the remaining ordering methods: __le__, __gt__ and __ge__):

from collections import namedtuple
from functools import total_ordering


@total_ordering
class B(namedtuple('B', 'x y')):
    def __lt__(self, other):
        return self.y < other.y

However:

def test_sortingB():
    b1 = B(1, 2)
    b2 = B(2, 1)
    assert b2 < b1  # passes
    assert b2 <= b1  # fails

Oh, right: total_ordering only fills in the other methods if they are missing. Since tuple/namedtuple already has those methods, total_ordering isn't doing anything for me.
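
To make that concrete, here is a small check, assuming CPython 3, that looks at which ordering methods actually live on B itself after decoration. The name own_methods is only an illustration.

```python
from collections import namedtuple
from functools import total_ordering


@total_ordering
class B(namedtuple('B', 'x y')):
    def __lt__(self, other):
        return self.y < other.y


# Only __lt__ is defined on B itself; total_ordering added nothing,
# because tuple already provides __le__, __gt__ and __ge__.
own_methods = sorted(name for name in ('__lt__', '__le__', '__gt__', '__ge__')
                     if name in vars(B))
print(own_methods)   # ['__lt__']

b1, b2 = B(1, 2), B(2, 1)
print(b2 < b1)       # True: uses the custom __lt__
print(b2 <= b1)      # False: inherited tuple comparison of (2, 1) <= (1, 2)
```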

So I guess my options are

  1. stop using namedtuple and just build my own boring class, keep using total_ordering
  2. keep using namedtuple and implement all 6 comparison methods
  3. keep using namedtuple and insert a sort value as the first field. Luckily I don't have too many instances of the class, but I usually rely on the order of the fields to initialise them, so changing that order could be nasty. Maybe that is a bad habit.

Suggestions on the best way to solve this?

pfctdayelise
  • Why don't you just create the namedtuple with the fields in the order you want to sort by? – BrenBarn Sep 27 '12 at 04:46
  • I didn't realise I would want to sort/max it etc until I had already created it and used it for a while. So I could add a leading field now to be the sort field but it could be a little disruptive. – pfctdayelise Sep 27 '12 at 04:48
  • But how are you using the namedtuple? The nice thing about namedtuple is it lets you access the items by name, so you could change your namedtuple to have the fields in the right order and not affect your code, as long as you access the fields by name (which presumably you're doing, or else why use namedtuple?). – BrenBarn Sep 27 '12 at 04:51
  • Yes, when accessing the fields I do use the name but not usually when initialising them (see my comment at #3 in the question). – pfctdayelise Sep 27 '12 at 04:56

3 Answers


OPTION 1. Use a mixin and apply total_ordering to it

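from collections import namedtuple
from functools import total_ordering

@total_ordering
class B_ordering(object):
    __slots__ = ()                 # see Raymond's comment
    def __lt__(self, other):
        return self.y < other.y

# The mixin comes first in the MRO, so its comparison methods win over tuple's.
class B(B_ordering, namedtuple('B', 'x y')):
    pass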
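
As a quick sanity check of Option 1, the sketch below re-runs the failing test from the question. It also puts __slots__ = () on B itself, which is my own assumption rather than part of the code above: without it, instances of the subclass still get a per-instance __dict__ even though the mixin has empty slots.

```python
from collections import namedtuple
from functools import total_ordering


@total_ordering
class B_ordering(object):
    __slots__ = ()
    def __lt__(self, other):
        return self.y < other.y


class B(B_ordering, namedtuple('B', 'x y')):
    __slots__ = ()   # assumption: added here so instances carry no __dict__


b1, b2 = B(1, 2), B(2, 1)
print(b2 < b1)                 # True
print(b2 <= b1)                # True: __le__ now comes from the mixin
print(sorted([b1, b2]))        # [B(x=2, y=1), B(x=1, y=2)]
print(hasattr(b1, '__dict__')) # False
```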

OPTION 2. Make your own decorator based on total_ordering and just use that instead
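
One possible shape for such a decorator is sketched below. The name force_total_ordering is hypothetical, and the derivation of the other methods from __lt__ and == loosely follows what functools.total_ordering does; the difference is that it overwrites the inherited tuple methods instead of skipping them.

```python
from collections import namedtuple


def force_total_ordering(cls):
    """Hypothetical decorator: derive __le__, __gt__, __ge__ from the
    class's own __lt__, overwriting any inherited versions."""
    if '__lt__' not in vars(cls):
        raise ValueError('class must define __lt__ in its own body')
    lt = cls.__lt__

    def __le__(self, other):
        result = lt(self, other)
        return result if result is NotImplemented else (result or self == other)

    def __gt__(self, other):
        result = lt(self, other)
        return result if result is NotImplemented else (not result and self != other)

    def __ge__(self, other):
        result = lt(self, other)
        return result if result is NotImplemented else (not result or self == other)

    # Install the derived methods unconditionally.
    for method in (__le__, __gt__, __ge__):
        setattr(cls, method.__name__, method)
    return cls


@force_total_ordering
class B(namedtuple('B', 'x y')):
    __slots__ = ()
    def __lt__(self, other):
        return self.y < other.y


b1, b2 = B(1, 2), B(2, 1)
print(b2 <= b1)   # True
print(b1 > b2)    # True
```

Note that == is still plain tuple equality here, so two instances with the same y but different x are neither less than nor equal to each other under __le__.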

John La Rooy
  • Is there any significant difference in the overhead for option 1 vs. option 2? – Will Apr 04 '14 at 20:39
  • +1. Option 1 is a direct solution to the problem of total_ordering only filling in missing methods. @Will The overhead for Option 1 is nearly zero (one extra step in the method resolution order). @JohnLaRooy, I suggest adding __slots__ = () to your Option 1 to restore the memory efficiency. – Raymond Hettinger Apr 10 '16 at 03:56

If, as your question implies, your interest is only in sorting namedtuples by an alternate key, why not use the sort/sorted key argument with the attrgetter function:

>>> from collections import namedtuple
>>> from operator import attrgetter
>>> P = namedtuple("P", "x y") 
>>> p1 = P(1, 2)
>>> p2 = P(2, 1)
>>> sorted([p1, p2], key=attrgetter("y"))
[P(x=2, y=1), P(x=1, y=2)]

You can go even further and define your own sort function:

>>> from functools import partial
>>> sortony = partial(sorted, key=attrgetter("y"))
>>> sortony([p1, p2])
[P(x=2, y=1), P(x=1, y=2)]
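
The same approach extends to keys built from more than one field. The sketch below is only an illustration: attrgetter with several names returns a tuple, and distance_key is a hypothetical key function standing in for whatever logic combines the fields.

```python
from collections import namedtuple
from operator import attrgetter

P = namedtuple("P", "x y")
points = [P(3, 1), P(1, 2), P(2, 1)]

# Sort by y first, then by x to break ties.
by_y_then_x = attrgetter("y", "x")
print(sorted(points, key=by_y_then_x))
# [P(x=2, y=1), P(x=3, y=1), P(x=1, y=2)]


def distance_key(p):
    # Hypothetical derived key: squared distance from the origin.
    return p.x * p.x + p.y * p.y


# key= works with max/min as well as sorted.
print(max(points, key=distance_key))
# P(x=3, y=1)
```
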
Don O'Donnell
  • The sort field doesn't exist yet, so I either need to add it or use a cmp method that looks at 2 fields and does a bit of logic about them. (My example code is overly simplified) – pfctdayelise Sep 27 '12 at 06:43

My advice would be to create your namedtuple with the fields in the order you want them to be sorted by. You might have to change the parts of your code where you create your values (e.g., change someTuple("name", 24) to someTuple(24, "name")), but generally values are created in fewer places than they are used, so this shouldn't be too big a deal. This avoids the hassle of writing all the comparison methods, and as a bonus also avoids the performance overhead of having those custom comparison methods called all the time.
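
A minimal sketch of this, using a made-up Person type with made-up fields: if the values are created with keyword arguments, the field order can later be changed without touching the call sites.

```python
from collections import namedtuple

# Hypothetical example type: the sort field (age) is declared first.
Person = namedtuple("Person", "age name")

# Keyword arguments make construction independent of field order.
people = [Person(name="Ann", age=31), Person(name="Bob", age=24)]

print(sorted(people))
# [Person(age=24, name='Bob'), Person(age=31, name='Ann')]
print(max(people).name)
# Ann
```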

alkanen
BrenBarn