2

I want to sort a list based on a second list as in this stack overflow post. In my case, my code is trying to sort Chromo objects by their corresponding fitness_weights, so I tried the solutions on the linked post:

def foobar():
    ...
    chromolist = [x for _, x in sorted(zip(fitness_weights, chromolist))]
    ...

Giving the error:

TypeError: '<' not supported between instances of 'Chromo' and 'Chromo'

To debug I tried:

def foobar():
    ...
    try:
        chromolist = [x for _, x in sorted(zip(fitness_weights, chromolist))]
    except Exception as e:
        print(fitness_weights)
        print(chromolist)
        print([i for i in zip(fitness_weights, chromolist)])
        raise e
    print('works fine')
    ...

Outputing:

works fine
works fine
works fine
works fine
works fine
works fine
works fine
works fine
works fine
works fine
works fine
works fine
works fine
works fine
works fine
works fine
works fine
works fine
works fine
works fine
works fine
works fine
[2630793242, 2662598634, 1204127226, 1218205610, 1224753838, 1212750850, 
1212293610, 1221507266, 1269226518, 1363578674, 1209661338, 2674408754, 
1179213986, 1209887778, 2281636710, 1906925334, 1156258126, 1287144442, 
1218205610, 1256241498, 2926198286, 1533442630, 1587421406, 2685579290, 
1203563674, 1205066274, 1181576990, 1188462746, 1127834446, 2295554650, 
1216261042, 1193222146, 1191591394, 1206052810, 1206800842, 1213410890, 
1202786310, 1230097202, 1277296358, 1218982810]

[Chromo Object, Chromo Object, Chromo Object, Chromo Object, Chromo Object, 
Chromo Object, Chromo Object, Chromo Object, Chromo Object, Chromo Object, 
Chromo Object, Chromo Object, Chromo Object, Chromo Object, Chromo Object, 
Chromo Object, Chromo Object, Chromo Object, Chromo Object, Chromo Object, 
Chromo Object, Chromo Object, Chromo Object, Chromo Object, Chromo Object, 
Chromo Object, Chromo Object, Chromo Object, Chromo Object, Chromo Object, 
Chromo Object, Chromo Object, Chromo Object, Chromo Object, Chromo Object, 
Chromo Object, Chromo Object, Chromo Object, Chromo Object, Chromo Object]

[(2630793242, Chromo Object), (2662598634, Chromo Object), 
(1204127226, Chromo Object), (1218205610, Chromo Object),
(1224753838, Chromo Object), (1212750850, Chromo Object),
(1212293610, Chrom Object), (1221507266, Chromo Object), 
(1269226518, Chromo Object), (1363578674, Chromo Object), 
(1209661338, Chromo  Object), (2674408754, Chromo Object),
(1179213986, Chromo Object), (1209887778, Chromo Object), 
(2281636710, Chromo Object), (1906925334, Chromo Object), 
(1156258126, Chromo Object), (1287144442, Chromo Object), 
(1218205610, Chromo  Object), (1256241498, Chromo Object),
(2926198286, Chromo Object), (1533442630, Chromo Object), 
(1587421406, Chromo Object), (2685579290, Chromo Object), 
(1203563674, Chromo Object), (1205066274, Chromo Object), 
(1181576990, Chromo  Object), (1188462746, Chromo Object), 
(1127834446, Chromo Object), (2295554650, Chromo Object),
(1216261042, Chromo Object), (1193222146, Chromo Object), 
(1191591394, Chromo Object), (1206052810, Chromo Object), 
(1206800842, Chromo  Object), (1213410890, Chromo Object), 
(1202786310, Chromo Object), (1230097202, Chromo Object), 
(1277296358, Chromo Object), (1218982810, Chromo Object)]

Which is confusing because:

  • All data types are correct
  • The function worked correctly 22 times

How do I fix this?

Krish
  • 1,044
  • 9
  • 20

1 Answers1

8

The error occurs when you have two (weight, chromo) pairs with an equal weight, at which point Python tries to compare the chromo values:

>>> class Chromo:
...     pass
...
>>> chromolist = [Chromo(), Chromo()]
>>> fitness_weights = [42, 42]
>>> sorted(zip(fitness_weights, chromolist))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: '<' not supported between instances of 'Chromo' and 'Chromo'

You can avoid this issue either by using a custom sort key that only extracts the weight, or by adding a tie breaker that is unique for every value in the sorted sequence, like a counter:

from itertools import count

chromolist = [x for *_, x in sorted(zip(fitness_weights, count(), chromolist))]

The counter is only there to make sure Python never looks at the Chromo instances, as now each element is a (weight, unique_integer, Chromo) tuple:

>>> from itertools import count
>>> sorted(zip(fitness_weights, count(), chromolist))
[(42, 0, <__main__.Chromo object at 0x1038cfa00>), (42, 1, <__main__.Chromo object at 0x103a396d0>)]

A sort key is just a function that produces the values to compare, you could use a lambda (lambda t: t[0]), or the operator.itemgetter() object:

from operator import itemgetter

chromolist = [x for _, x in sorted(zip(fitness_weights, chromolist), key=itemgetter(0))]

The key function requires a separate pass over the input list, so is ever so slightly slower, as seen in this simple time trial with 200 inputs:

>>> from timeit import timeit
>>> fw = fitness_weights * 100
>>> cl = chromolist * 100
>>> timeit('sorted(zip(fw, count(), cl))', globals=globals(), number=100000)
1.4618491119981627
>>> timeit('sorted(zip(fw, cl), key=itemgetter(0))', globals=globals(), number=100000)
1.6409574589997646
Martijn Pieters
  • 1,048,767
  • 296
  • 4,058
  • 3,343
  • Solved! Thanks a million for that – Krish May 06 '20 at 22:01
  • Neat! Didn't know about how `count` works with `zip` as a sequence. TIL :) – r.ook May 06 '20 at 22:05
  • Another alternative that perhaps has secondary benefits is just to implement the less than `__lt__` comparator in the Chromo class and that fixes it and allows independent sorting of the Chromo's if needed – AirSquid May 06 '20 at 22:15