34

I have a data-structure which is something like this:

The populations of three cities in different years are as follows.

Name  1990 2000 2010
A     10   20   30
B     20   30   10
C     30   10   20

I am using a defaultdict to store the data.

from collections import defaultdict
cityPopulation=defaultdict(list)
cityPopulation['A']=[10,20,30]
cityPopulation['B']=[20,30,10]
cityPopulation['C']=[30,10,20]

I want to sort the defaultdict based on a particular column of the list (the year). For example, sorting by 1990 should give C, B, A, while sorting by 2010 should give A, C, B.

Also, is this the best way to store the data? As I am changing the population values, I want it to be mutable.

imsc
  • Do you want to store the sorted dictionary for future use, or just output it? You might want to look into OrderedDict and namedtuple. – Silas Ray Apr 17 '12 at 16:01
  • I just want to print the order. – imsc Apr 17 '12 at 16:02
  • Why does sorting for `2010` give `A,B,C`? – jamylak Apr 17 '12 at 16:04
  • Well, you are still most likely going to want a data structure that is ordered. – Silas Ray Apr 17 '12 at 16:04
  • I do not want the data-structure to be ordered, as the order will depend on the year. – imsc Apr 17 '12 at 16:09
  • But you can't sort something that is inherently unordered. It's like trying to use a sieve to hold water; it just can't do it. Even if the ordered object is only transient, you can't have the data sorted without, well, sorting it first, and you can't sort it without an ordered data structure. – Silas Ray Apr 17 '12 at 16:11
  • @sr2222: If you only want to do one operation with the sorted data, why bother storing the sorted order? Just use it sorted then forget. – Gareth Latty Apr 17 '12 at 17:18
  • @Lattyware the nature of sorting requires that you store the data ordered at least for the length of time you need it sorted (unless you want an incredibly clunky and slow implementation). Like I said, it could be a transient object, but it does have to exist at least for a while. – Silas Ray Apr 17 '12 at 17:38
  • @sr2222 Well, no. You can use ``sorted()`` to produce a generator. You don't store the values as they are lazily generated from the ``defaultdict``. – Gareth Latty Apr 17 '12 at 17:48
  • @Lattyware http://stackoverflow.com/questions/4154571/sorted-using-generator-expressions-rather-than-lists Please detail how you could produce any kind of sort algorithm that ran with anything that could even be loosely affiliated with efficiency that would not have to create an object that contained all the data in various states of sortedness? Start with this list http://en.wikipedia.org/wiki/Sorting_algorithm and tell me which work without storing some kind of interim sorted list or object. – Silas Ray Apr 17 '12 at 18:00
  • @sr2222 I'm not saying Python doesn't do it, I'm saying *you* don't need to. ``sorted()`` returns a sorted list of items, so you just iterated over it and forget about it. – Gareth Latty Apr 17 '12 at 18:05
  • @Lattyware It's not the best idea to program thinking, "if the language/library does it for me, I don't need to know what's going on." That mindset leads to inefficient code and more often than not bugs that you can't fix. – Silas Ray Apr 17 '12 at 18:11

4 Answers

42
>>> sorted(cityPopulation.iteritems(),key=lambda (k,v): v[0],reverse=True) #1990
[('C', [30, 10, 20]), ('B', [20, 30, 10]), ('A', [10, 20, 30])]
>>> sorted(cityPopulation.iteritems(),key=lambda (k,v): v[2],reverse=True) #2010
[('A', [10, 20, 30]), ('C', [30, 10, 20]), ('B', [20, 30, 10])]

Note that Python 3 removed tuple unpacking in lambda parameters (and iteritems() no longer exists), so there you have to use items() and index into the (key, value) tuple instead:

sorted(cityPopulation.items(), key=lambda k_v: k_v[1][2], reverse=True) #2010
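
A follow-up comment asks whether the columns can be referred to by the year they represent. Here is a minimal Python 3 sketch, assuming the list positions follow a fixed year order. The `YEARS` list and the `cities_by_year` helper are hypothetical names introduced for illustration, not part of the original answer:

```python
from collections import defaultdict

# Assumption: every population list uses this column order,
# matching the table in the question.
YEARS = [1990, 2000, 2010]

cityPopulation = defaultdict(list)
cityPopulation['A'] = [10, 20, 30]
cityPopulation['B'] = [20, 30, 10]
cityPopulation['C'] = [30, 10, 20]

def cities_by_year(population, year, years=YEARS):
    """Return city names ordered by population in `year`, largest first."""
    col = years.index(year)  # translate the year label into a list index
    return sorted(population, key=lambda city: population[city][col], reverse=True)

print(cities_by_year(cityPopulation, 1990))  # ['C', 'B', 'A']
print(cities_by_year(cityPopulation, 2010))  # ['A', 'C', 'B']
```

The lists stay mutable, so populations can still be updated in place. With around 100 years, the `years.index` lookup could be replaced by a dict mapping year to index if it ever became a bottleneck.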
grrrrrr
jamylak
  • Thanks a lot. This is very close to what I want. Is there a way to call or name the columns by the year they represent? – imsc Apr 17 '12 at 22:01
  • If I have the above example data, how should I store it to achieve this? In the actual data-set the number of columns (years) are of the order of 100. Thanks. – imsc Apr 18 '12 at 06:24
  • I'm not sure of the best way to achieve it. – jamylak Apr 18 '12 at 06:27
  • I've been staring at Lambdas for months, and your obvious and simple example finally helped me understand them. Thanks! – JayCrossler Jul 17 '14 at 17:43
24

If you want to sort based on the values, not the keys, use data.items() and pass key=lambda kv: kv[1] so that the sort picks the value.


See an example with this defaultdict:

>>> from collections import defaultdict
>>> data = defaultdict(int)
>>> data['ciao'] = 17
>>> data['bye'] = 14
>>> data['hello'] = 23

>>> data
defaultdict(<type 'int'>, {'ciao': 17, 'bye': 14, 'hello': 23})

Now, let's sort by value:

>>> sorted(data.items(), key=lambda kv: kv[1])
[('bye', 14), ('ciao', 17), ('hello', 23)]

Finally use reverse=True if you want the bigger numbers to come first:

>>> sorted(data.items(), key=lambda kv: kv[1], reverse=True)
[('hello', 23), ('ciao', 17), ('bye', 14)]

Note that key=lambda (k, v): v is a clearer (to me) way to say key=lambda kv: kv[1], but the latter is the only form Python 3 allows, since automatic tuple unpacking in lambda parameters is no longer available.

In Python 2 you could say:

>>> sorted(data.items(), key=lambda (k, v): v)
[('bye', 14), ('ciao', 17), ('hello', 23)]
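
If the index-based lambda reads awkwardly in Python 3, `operator.itemgetter` from the standard library is one alternative. The sketch below reuses the example data from this answer; the `by_value` and `by_value_desc` names are only illustrative:

```python
from collections import defaultdict
from operator import itemgetter

data = defaultdict(int)
data['ciao'] = 17
data['bye'] = 14
data['hello'] = 23

# itemgetter(1) picks element 1 of each (key, value) tuple, i.e. the value.
by_value = sorted(data.items(), key=itemgetter(1))
by_value_desc = sorted(data.items(), key=itemgetter(1), reverse=True)

print(by_value)       # [('bye', 14), ('ciao', 17), ('hello', 23)]
print(by_value_desc)  # [('hello', 23), ('ciao', 17), ('bye', 14)]
```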
fedorqui
12

A defaultdict doesn't keep its items in sorted order. You might need to use an OrderedDict, or sort the items each time as a list.

E.g:

from collections import OrderedDict
sorted_city_pop = OrderedDict(sorted(cityPopulation.items()))

Edit: If you just want to print the order, simply use the sorted builtin:

for key, value in sorted(cityPopulation.items()):
    print(key, value)
Jungle Hunter
Gareth Latty
  • I do not want to store the order, just print it. – imsc Apr 17 '12 at 16:03
  • @sberry If you mean the extra ``key`` argument, I just removed it - there is indeed no need for it given tuples are sorted first item first and keys are guaranteed to be unique. – Gareth Latty Apr 17 '12 at 16:05
  • As the values in the dictionary are list, How can I sort by different columns? – imsc Apr 17 '12 at 16:11
  • What sberry meant is that you are shadowing the built-in named `sorted()`. The line `sorted = OrderedDict(sorted(cityPopulation.items())` will only work once. – Sven Marnach Apr 17 '12 at 16:12
  • @imsc pass a lambda as the key parameter to the sorted method that will return the item from the list you want from each city. – Silas Ray Apr 17 '12 at 16:13
  • @SvenMarnach Missed that one, fixed. – Gareth Latty Apr 17 '12 at 17:16
4

Late answer, and not a direct answer to the question, but if you end up here from a "Sorting a defaultdict by value in python" Google search, this is how I sort a defaultdict by its values. A dictionary cannot be sorted in place, but its items can be sorted and printed in order. The example uses a plain dict; a defaultdict works the same way:

orders = {
    'cappuccino': 54,
    'latte': 56,
    'espresso': 72,
    'americano': 48,
    'cortado': 41
}

sort_orders = sorted(orders.items(), key=lambda x: x[1], reverse=True)

for i in sort_orders:
    print(i[0], i[1])
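
As a side note, if a dictionary in sorted order is wanted rather than a list of tuples, the sorted pairs can be passed back to `dict()`. This is a sketch that assumes Python 3.7 or later, where a regular dict preserves insertion order; the `sorted_orders` name is only illustrative:

```python
orders = {
    'cappuccino': 54,
    'latte': 56,
    'espresso': 72,
    'americano': 48,
    'cortado': 41
}

# Rebuild a dict from the value-sorted pairs; the new dict keeps that order
# (assumption: Python 3.7+, where insertion order is preserved).
sorted_orders = dict(sorted(orders.items(), key=lambda kv: kv[1], reverse=True))

for drink, count in sorted_orders.items():
    print(drink, count)
```

Later insertions into `sorted_orders` are appended at the end, so the dict would need to be rebuilt after any change to stay sorted.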


Pedro Lobito