3

Suppose I have a dictionary in python and I sort the keys by values as follows

my_dict = {'a':5, 'b':4, 'c':6, 'd':3, 'e':2}
sorted_list = sorted(my_dict, key=my_dict.get)
sorted_list
['e', 'd', 'b', 'a', 'c']

This works fine, but if I have collisions in my dictionary (with respect to values)

my_dict = {'a':5, 'b':4, 'c':5, 'd':3, 'e':2, 'f':4, 'g':3}
sorted_list = sorted(my_dict, key=my_dict.get)
sorted_list
['e', 'd', 'g', 'b', 'f', 'a', 'c']

It seems to me that in case of duplicates, the list is sorted by its order(numerical order or insertion order, not the sorted order) in the dictionary. Can I assume it to be true for all cases?

EDIT1: My keys will be integers and in case the values collide, I want the resulting keys to be sorted by their values(highest first)

EDIT2: Adding the following parameter helps my question. But, in my case, I want the values to be sorted in ascending, but the keys to be sorted in descending. THe following will result in both to descending key=lambda x: (x[1],x[0])

Himanshu Jindal
  • 637
  • 1
  • 7
  • 13
  • 1
    For ascending then descending make the second item negative: `sorted(my_dict, key=lambda k: (my_dict[k],-k)`. Note that this will just give you sorted keys. If you need sorted key-value pairs, do: `sorted(my_dict.iteritems(), key=lambda item: (item[0], -item[1]))` – Steven Rumbalski Sep 16 '12 at 15:41

3 Answers3

6

Python's sort algorithm (TimSort) is a stable sort, so any items that have the same sort 'value', are kept in the order they were in before sorting.

When sorting dictionary keys, that means they are kept in the same order as returned by dict.keys() or when iterating over the dictionary. Do note that this ordering is arbitrary, and not stable across changes to the dictionary.

Otherwise yes, you can assume that to be true for all cases.

As pointed out by Steven Rumbalski, you can take the key along with the sorting to stabilize same-value sorting ordering:

sorted_list = sorted(my_dict, key=lambda k: (my_dict[k], k))
Community
  • 1
  • 1
Martijn Pieters
  • 1,048,767
  • 296
  • 4,058
  • 3,343
  • It depends upon what he means by "its order in the dictionary." If he means the order that is returned by `dict.keys()` then, yes, it is true. But if he means the order the appear in the dict literal, then the assumption is incorrect. His second example is open to both interpretations. – Steven Rumbalski Sep 15 '12 at 22:11
  • @StevenRumbalski: I qualified my 'yes' a little, I realized the OP may have mistakenly thought his dict literal definition might have specified an ordering. – Martijn Pieters Sep 15 '12 at 22:13
  • I made the question a bit more clear. I meant insertion order or numerical order (not sorted order) – Himanshu Jindal Sep 15 '12 at 22:18
  • @HimanshuJindal: dictionaries *have no defined order*. This is the nature of a hash table (the underlying data structure of a dictionary). Thus, the ordering of `.keys()` is dependent on the hash values of each key, and inserting and removing keys changes that order as the dictionary table can be resized or shrunk in it's lifetime. – Martijn Pieters Sep 15 '12 at 22:20
  • 1
    @HimanshuJindal: If you care about insertion order, use an OrderedDict, which will return keys in the same order in which they were inserted. – Steven Rumbalski Sep 15 '12 at 22:20
  • When sorting dictionary keys, that means they are kept in the same order as returned by dict.keys() or when iterating over the dictionary. Do note that this ordering is arbitrary, and not stable across changes to the dictionary. So, does this mean that the return order can be totally arbitrary or are they sorted by the value of the key (The keys in my case will be int) – Himanshu Jindal Sep 15 '12 at 22:21
  • 2
    To be sorted by the value of the key, your key function will need to return the value: `key=lambda k:(my_dict[k], k)` – Steven Rumbalski Sep 15 '12 at 22:23
  • @HimanshuJindal: the return order of the sort can be arbitrary. It'll be stable as long as the dictionary is not manipulated. – Martijn Pieters Sep 15 '12 at 22:25
1

It is not true. When the values are the same, the dictionary is sorted on keys. Here is the modified example.

my_dict = {'a':5, 'b':4, 'c':5, 'z':3, 'e':2, 'f':4, 'g':3}
sorted_list = sorted(my_dict, key=my_dict.get)
print sorted_list
Geordee Naliyath
  • 1,799
  • 17
  • 28
  • This counter example may not produce the same result in different pythons. Also, by "sorted on keys" do you mean the *order* returned by `my_dict.keys()` or do you mean by the keys value? – Steven Rumbalski Sep 15 '12 at 22:18
  • Is there any reference for this? Because I will be running this on a huge data and I want to be sure if this will be correct – Himanshu Jindal Sep 15 '12 at 22:19
  • 1
    `sorted_list` is `['e', 'g', 'z', 'b', 'f', 'a', 'c']` in my Python 2.7.3 and `['e', 'g', 'z', 'b', 'f', 'c', 'a']` in my Jython 2.5.1+. As @StevenRumbalski says, this varies from Python implementation to Python implementation, and since it's not guaranteed by anything, could vary with the phase of the moon. – DSM Sep 15 '12 at 22:23
  • It is better to be explicit then - as in the solution given for http://stackoverflow.com/questions/7742752/sorting-a-dictionary-by-value-then-by-key – Geordee Naliyath Sep 15 '12 at 22:29
0

This is probably close to what you're looking for, using pypy. The treap stuff is at http://stromberg.dnsalias.org/~strombrg/treap/ Note that I've flipped the keys and values; in some problems that's beneficial, while in yours it may or may not be.

#!/usr/local/pypy-1.8/bin/pypy

import py_treap

class Thing:
    def __init__(self, number):
        self.number = number

    def __cmp__(self, other):
        return -cmp(self.number, other.number)

    def __str__(self):
        return str(self.number)

def main():
    list_ = [ ('a', 5), ('b', 4), ('c', 6), ('d', 3), ('e', 2), ('f', 4), ('g', 4), ('h', 3) ]
    t = py_treap.treap()
    for string, integer in list_:
        thing = Thing(integer)
        if not thing in t:
            print '%s not in t' % thing
            t[thing] = [ string ]
        else:
            print '%s in t' % thing
            t[thing].append(string)

    for sublist in t.values():
        sublist.sort()

    for key, value in t.items():
        print key, value

main()
dstromberg
  • 6,954
  • 1
  • 26
  • 27