IndexError when attempting to reference list during sort

Question

This is a bit of a strange bug in Python:

l1 = ['a', 'ab', 'cc', 'bc']
l1.sort(key=lambda x: zip(*l1)[0].count(x[0]))

The intent of this snippet is to sort elements by the frequency of their first letter. However, this produces an error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 1, in <lambda>
IndexError: list index out of range

Computing the keys separately does not give me an error, however:

sortkeys = [zip(*l1)[0].count(x[0]) for x in l1]

Where is this IndexError coming from? Surely, zip(*l1)[0] is not empty (it is equal to ('a', 'a', 'c', 'b')), and x[0] cannot be empty because all list elements are non-empty...so what's going on?

EDIT: I'm well aware this is not an ideal way to sort a list. I'm curious why this happens, not what I should write instead.

Creating `sortkeys` like this results in `TypeError: 'zip' object is not subscriptable` on Python 3.4. What am I doing wrong? — Pavel, Jun 12 '14 at 19:21
@Pavel - The OP is using Python 2.x, in which `zip` returns a list. — , Jun 12 '14 at 19:22
@DSM: I see the duplicates. OK, great, marked (my own question) as a duplicate :P — nneonneo, Jun 12 '14 at 19:38
I'm not sure I've ever seen anyone mark _their own_ question as a dupe... :-) — mgilson, Jun 12 '14 at 19:40

score 1 · Accepted Answer · answered Jun 12 '14 at 19:35

You can verify for yourself that the list really does appear empty while it is being sorted in place. With l1 empty, zip(*l1) returns an empty list, so indexing it with [0] raises the IndexError. In my opinion this is good design: it raises an error for programmers who forget that the sort happens in place, instead of silently producing bugs that are very hard to track down later.

Try this:

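l1 = ['a', 'ab', 'cc', 'bc']

def y(x):
    # Show what l1 looks like each time the key function runs.
    print l1
    return zip(*l1)[0].count(x[0])

y('cc')           # called directly: l1 is intact
l1.sort(key = y)  # called by sort: l1 appears empty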

The output would be:

['a', 'ab', 'cc', 'bc']
[]

(and then the exception)
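
For anyone following along on Python 3, here is a minimal sketch of the same effect. The names `words`, `snapshots`, `first_letter`, `counts` and `by_frequency` are made up for illustration. The "list appears empty during sort" behaviour is a CPython implementation detail, so other interpreters may behave differently.

```python
from collections import Counter

words = ['a', 'ab', 'cc', 'bc']
snapshots = []

def first_letter(word):
    # Record what `words` looks like from inside the key function.
    snapshots.append(list(words))
    return word[0]

# On CPython the list appears empty for the duration of the in-place sort,
# so every snapshot taken by the key function is [].
words.sort(key=first_letter)

# Transposing an empty list yields nothing, which is why indexing the
# result with [0] fails inside the original lambda.
empty_transpose = list(zip(*[]))

# One way to sidestep the problem (an assumption about what is wanted):
# count the first letters before sorting, so the key never reads the list.
counts = Counter(w[0] for w in words)
by_frequency = sorted(words, key=lambda w: counts[w[0]])
```

Using `sorted()` instead of `list.sort()` also avoids the issue, since it sorts a copy and leaves the original list readable from within the key function.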
