Python sorting list of tuples by frequency

Question

I want to sort a list of tuples like this:

rows = [ ('A', 'a', 1, '?'),
         ('A', 'a', 1, '!'),
         ('A', 'a', 1, '#'),
         ('A', 'b', 1, '#'),
         ('A', 'b', 2, '$'),
         ('A', 'c', 2, '@'),
         ('A', 'd', 3, '@') ]

by this frequency pattern:

- we have 1 value 'A' at index [0]
- we have 4 values 'a', 'b', 'c', 'd' at index [1]
- we have 3 values 1,2,3 at index [2]
- we have 5 values '?', '!', '#', '$', '@' at index [3]
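
As a quick check of these counts, here is a minimal sketch (assuming Python 3 and hashable values; the name unique_counts is only illustrative and not part of the original question):

```python
rows = [('A', 'a', 1, '?'),
        ('A', 'a', 1, '!'),
        ('A', 'a', 1, '#'),
        ('A', 'b', 1, '#'),
        ('A', 'b', 2, '$'),
        ('A', 'c', 2, '@'),
        ('A', 'd', 3, '@')]

# zip(*rows) yields one tuple per index position (a "column");
# the size of its set is the number of distinct values at that index.
unique_counts = [len(set(column)) for column in zip(*rows)]
print(unique_counts)  # [1, 4, 3, 5]
```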

So the sorted list should look like this:

rows = [ ('A', 1, 'a', '?'),
         ('A', 1, 'a', '!'),
         ('A', 1, 'a', '#'),
         ('A', 1, 'b', '#'),
         ('A', 2, 'b', '$'),
         ('A', 2, 'c', '@'),
         ('A', 3, 'd', '@') ]

How can I do that elegantly?

You are also not explaining yourself very well. Are you sorting the **columns** of the list of tuples here? — Martijn Pieters, Feb 12 '15 at 15:02

score 1 · Accepted Answer · edited May 23 '17 at 11:49

Transpose your rows to columns, sort by their set length (unique count), then transpose again:

# list() is needed on Python 3, where zip returns an iterator
list(zip(*sorted(zip(*rows), key=lambda c: len(set(c)))))

zip(*nested_list) returns the columns of all the rows in nested_list, provided those rows are all the same length (if any row is shorter than the others, the remaining columns are ignored).
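
To illustrate the transposition and the truncation behaviour, here is a small sketch, assuming Python 3. The sample data and variable names are made up for illustration, and itertools.zip_longest is shown only as one possible way to keep the extra columns, not as part of the answer's approach:

```python
from itertools import zip_longest

nested_list = [(1, 2, 3), (4, 5, 6)]
columns = list(zip(*nested_list))
print(columns)    # [(1, 4), (2, 5), (3, 6)]

# One row is shorter: zip stops at the shortest row,
# so the third column is silently dropped.
ragged = [(1, 2, 3), (4, 5)]
truncated = list(zip(*ragged))
print(truncated)  # [(1, 4), (2, 5)]

# zip_longest pads the missing cells instead of dropping the column.
padded = list(zip_longest(*ragged, fillvalue=None))
print(padded)     # [(1, 4), (2, 5), (3, None)]
```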

This orders the columns by ascending unique count, so the third column (3 unique values) moves to the left of the second column (4 unique values).
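
The same idea can be spelled out step by step. This is only a sketch under a few assumptions: Python 3, hashable values, and a hypothetical helper name, sort_columns_by_unique_count, that does not appear in the answer. Because sorted is stable, columns with the same unique count keep their original relative order:

```python
def sort_columns_by_unique_count(rows):
    """Reorder tuple positions so columns with fewer distinct values come first."""
    columns = zip(*rows)                                  # rows -> columns
    ordered = sorted(columns, key=lambda column: len(set(column)))
    return list(zip(*ordered))                            # columns -> rows


# Illustrative data: the columns have 3, 2 and 2 distinct values.
sample = [('x', 1, 'p'),
          ('y', 1, 'p'),
          ('z', 2, 'q')]
result = sort_columns_by_unique_count(sample)
print(result)  # [(1, 'p', 'x'), (1, 'p', 'y'), (2, 'q', 'z')]
```

The two columns that tie at 2 distinct values stay in their original order, and the column with 3 distinct values moves to the end.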

Demo:

>>> rows = [ ('A', 'a', 1, '?'),
...      ('A', 'a', 1, '!'),
...      ('A', 'a', 1, '#'),
...      ('A', 'b', 1, '#'),
...      ('A', 'b', 2, '$'),
...      ('A', 'c', 2, '@'),
...      ('A', 'd', 3, '@') ]
>>> list(zip(*sorted(zip(*rows), key=lambda c: len(set(c)))))
[('A', 1, 'a', '?'), ('A', 1, 'a', '!'), ('A', 1, 'a', '#'), ('A', 1, 'b', '#'), ('A', 2, 'b', '$'), ('A', 2, 'c', '@'), ('A', 3, 'd', '@')]
>>> from pprint import pprint
>>> pprint(_)
[('A', 1, 'a', '?'),
 ('A', 1, 'a', '!'),
 ('A', 1, 'a', '#'),
 ('A', 1, 'b', '#'),
 ('A', 2, 'b', '$'),
 ('A', 2, 'c', '@'),
 ('A', 3, 'd', '@')]

@MartijnPieters For some of us, could you please explain the answer a bit more? Unlike OP, I am a bit curious about your comprehension — ha9u63a7, Feb 12 '15 at 15:25
@ha9u63ar: there is no comprehension here. :-) Or were you not talking about a list comprehension? — Martijn Pieters, Feb 12 '15 at 15:34

ely · Answer 2 · 2015-02-12T15:24:32.480

If you are interested in doing it with the pandas library, see below. I personally consider this strictly inferior to a solution using zip and sorted (with key), or possibly something with collections.Counter, but it's here nonetheless.

import pandas

# DataFrame.sort was removed in newer pandas releases; sort_values replaces it
df = pandas.DataFrame(rows).sort_values([0, 1, 2], ascending=True)
# positions of the columns, ordered by number of unique values
col_order = df.apply(lambda x: x.nunique()).argsort().values.tolist()
list(map(tuple, df[col_order].values.tolist()))

E.g.:

In [30]: %cpaste
Pasting code; enter '--' alone on the line to stop or use Ctrl-D.
:df = pandas.DataFrame(rows).sort_values([0, 1, 2], ascending=True)
:col_order = df.apply(lambda x: x.nunique()).argsort().values.tolist()
:list(map(tuple, df[col_order].values.tolist()))
:--
Out[30]: 
[('A', 1, 'a', '?'),
 ('A', 1, 'a', '!'),
 ('A', 1, 'a', '#'),
 ('A', 1, 'b', '#'),
 ('A', 2, 'b', '$'),
 ('A', 2, 'c', '@'),
 ('A', 3, 'd', '@')]
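
The answer does not say what the collections.Counter variant would look like. One possible reading, offered only as a sketch (Python 3 assumed; the names frequencies, order and reordered are illustrative), is to build a Counter per column and order the column indices by how many distinct keys each Counter holds:

```python
from collections import Counter

rows = [('A', 'a', 1, '?'),
        ('A', 'a', 1, '!'),
        ('A', 'a', 1, '#'),
        ('A', 'b', 1, '#'),
        ('A', 'b', 2, '$'),
        ('A', 'c', 2, '@'),
        ('A', 'd', 3, '@')]

columns = list(zip(*rows))
# One Counter per column: value -> number of occurrences.
frequencies = [Counter(column) for column in columns]

# len(counter) is the number of distinct values in that column.
order = sorted(range(len(columns)), key=lambda i: len(frequencies[i]))
print(order)  # [0, 2, 1, 3]

# Rebuild each row with its fields in the new column order.
reordered = [tuple(row[i] for i in order) for row in rows]
print(reordered[0])  # ('A', 1, 'a', '?')
```

Compared with len(set(column)), this keeps the per-value occurrence counts around, which is only useful if they are needed for something beyond the ordering.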
