227

There's an existing function that ends in the following, where d is a dictionary:

return d.iteritems()

that returns an unsorted iterator for a given dictionary. I would like to return an iterator that goes through the items sorted by key. How do I do that?

jpp
  • 159,742
  • 34
  • 281
  • 339
mike
  • 46,876
  • 44
  • 102
  • 112

10 Answers10

184

Haven't tested this very extensively, but works in Python 2.5.2.

>>> d = {"x":2, "h":15, "a":2222}
>>> it = iter(sorted(d.iteritems()))
>>> it.next()
('a', 2222)
>>> it.next()
('h', 15)
>>> it.next()
('x', 2)
>>>

If you are used to doing for key, value in d.iteritems(): ... instead of iterators, this will still work with the solution above

>>> d = {"x":2, "h":15, "a":2222}
>>> for key, value in sorted(d.iteritems()):
>>>     print(key, value)
('a', 2222)
('h', 15)
('x', 2)
>>>

With Python 3.x, use d.items() instead of d.iteritems() to return an iterator.

jpp
  • 159,742
  • 34
  • 281
  • 339
  • 29
    use `.items()` instead of `iteritems()`: as @Claudiu said, iteritems does not work for Python 3.x, but `items()` is available from Python 2.6. – Remi Oct 01 '11 at 17:30
  • 40
    This is not obvious. In fact, `items()` creates a list and therefore uses memory, whereas `iteritems()` essentially does not use memory. What to use mostly depend on the size of the dictionary. Furthermore, the automatic Python 2 to Python 3 conversion tool (`2to3`) automatically takes care of the conversion from `iteritems()` to `items()`, so there is no need to worry about this. – Eric O. Lebigot Dec 09 '12 at 11:48
  • Is it possible to optimize it in case both getitem and sorted iteration are very frequently needed? – HoverHell Dec 16 '12 at 19:15
  • 6
    @HowerHell use a `collections.OrderedDict` then you sort once & get items in sorted order always. – Mark Harviston Jun 03 '13 at 15:23
  • 9
    But @EOL, even if `iteritems()` does not use memory, everything must be pulled into memory for `sorted()`, so there is no difference between the use of `items()` and `iteritems()` here memory-wise. – Richard Oct 04 '13 at 14:49
  • 8
    @Richard: While it is true that all the elements must be pulled into memory, they are *stored* twice with `items()` (in the list returned by `items()`, and in the sorted list) and only once with `iteritems()` (in the sorted list only). – Eric O. Lebigot Oct 07 '13 at 08:38
86

Use the sorted() function:

return sorted(dict.iteritems())

If you want an actual iterator over the sorted results, since sorted() returns a list, use:

return iter(sorted(dict.iteritems()))
Greg Hewgill
  • 951,095
  • 183
  • 1,149
  • 1,285
  • That fails for me: : iter() returned non-iterator of type 'list' – mike Dec 13 '08 at 00:04
  • That's probably because you use "dict" as the variable name. "dict" is actually the type name of dictionaries. Just use another name like "mydict" here and voila. – utku_karatas Dec 13 '08 at 00:17
  • 1
    Still not working. Are you positive sorted() returns another iterator, as opposed to a regular list? – mike Dec 13 '08 at 00:26
  • when and where does this exception occur? you can iterate over a list without a problem –  Dec 13 '08 at 00:47
  • it could fail if you were calling the .next() method on the list, which would work on the iterator –  Dec 13 '08 at 00:51
  • well, i somehow took that problem as trivial ;) –  Dec 13 '08 at 13:40
  • 1
    Agreed, hop. I don't think I ever call .next() directly except when skipping lines in files. Our iter(sorted(dict.iteritems())) solution ends up making a copy of the whole dict in memory at the "sorted(" stage anyway, so the primary iterator benefit seems lost :) –  Dec 13 '08 at 14:51
  • It needs to plug into an existing API that's expecting an iterator. – mike Dec 16 '08 at 00:38
42

A dict's keys are stored in a hashtable so that is their 'natural order', i.e. psuedo-random. Any other ordering is a concept of the consumer of the dict.

sorted() always returns a list, not a dict. If you pass it a dict.items() (which produces a list of tuples), it will return a list of tuples [(k1,v1), (k2,v2), ...] which can be used in a loop in a way very much like a dict, but it is not in anyway a dict!

foo = {
    'a':    1,
    'b':    2,
    'c':    3,
    }

print foo
>>> {'a': 1, 'c': 3, 'b': 2}

print foo.items()
>>> [('a', 1), ('c', 3), ('b', 2)]

print sorted(foo.items())
>>> [('a', 1), ('b', 2), ('c', 3)]

The following feels like a dict in a loop, but it's not, it's a list of tuples being unpacked into k,v:

for k,v in sorted(foo.items()):
    print k, v

Roughly equivalent to:

for k in sorted(foo.keys()):
    print k, foo[k]
Peter Rowell
  • 17,605
  • 2
  • 49
  • 65
  • Okay, but I don't want a Dict or a List, I want an Iterator. How do i coerce it into being an Iterator? – mike Dec 13 '08 at 00:47
  • 2
    `sorted(foo.keys())` is better as the equivalent `sorted(foo)`, since dictionaries return their keys when iterated over (with the advantage of not being forced to create the `foo.keys()` intermediate list, maybe—depending on how `sorted()` is implemented for iterables). – Eric O. Lebigot Jul 28 '14 at 09:24
  • Wonder which is better for speed and/or memory `k in sorted(foo.keys()):` which pulls the keys or `for k,v in sorted(foo.items()):` which returns a copy of the dictionary’s list pairs I would guess `sorted(foo.keys())` – CrandellWS Oct 18 '15 at 02:31
  • 1
    @CrandellWS: The best way to answer the time question is with the Python [timeit](https://docs.python.org/2/library/timeit.html) module. – Peter Rowell Oct 18 '15 at 17:12
  • @peter Rowell thank you did not even know about `timeit` I will give it a go. – CrandellWS Oct 18 '15 at 18:54
  • @PeterRowell I did run several tests had to back it down to 250k and found no clear winner, though was able to run it at a million and from what I can see I will use `k in sorted(foo.keys()):` see ->https://github.com/CrandellWS/Udacity-Nanodegree/commit/06aa1ebd95664c1f6f1a75e2ef87f3d9346c207b – CrandellWS Oct 18 '15 at 20:22
  • @CrandellWS: Glad that was useful for you. Other things to consider before getting too far gone into Optimization Land include: How often is this routine called? If the input changes infrequently can you cache the result? Etc. Actual testing and timing are the only way to answer that eternal question: "It's 11pm. Do you know where *your* program is spending its time?" – Peter Rowell Oct 19 '15 at 14:54
  • @PeterRowell When I've been using dicts in python 3, however, and I call `for k,v in d.items()`, it's always returned the keys in the order I inserted them. Is this something I can count on? – Nathan majicvr.com May 30 '18 at 17:23
  • 1
    @frank -- Short Answer: No. A dict is an array with the actual key being a *hash* of the value of the supplied key. Although some implementations may be fairly predictable, and some may even make this contract, I count on *nothing* when it comes to hash ordering. See [this post](https://stackoverflow.com/questions/39980323/are-dictionaries-ordered-in-python-3-6) for more on 3.6+ behavior. In particular note the first answer. – Peter Rowell May 30 '18 at 20:37
35

Greg's answer is right. Note that in Python 3.0 you'll have to do

sorted(dict.items())

as iteritems will be gone.

Claudiu
  • 224,032
  • 165
  • 485
  • 680
7

You can now use OrderedDict in Python 2.7 as well:

>>> from collections import OrderedDict
>>> d = OrderedDict([('first', 1),
...                  ('second', 2),
...                  ('third', 3)])
>>> d.items()
[('first', 1), ('second', 2), ('third', 3)]

Here you have the what's new page for 2.7 version and the OrderedDict API.

Caumons
  • 9,341
  • 14
  • 68
  • 82
6

In general, one may sort a dict like so:

for k in sorted(d):
    print k, d[k]

For the specific case in the question, having a "drop in replacement" for d.iteritems(), add a function like:

def sortdict(d, **opts):
    # **opts so any currently supported sorted() options can be passed
    for k in sorted(d, **opts):
        yield k, d[k]

and so the ending line changes from

return dict.iteritems()

to

return sortdict(dict)

or

return sortdict(dict, reverse = True)
pythonlarry
  • 2,316
  • 2
  • 20
  • 17
5
>>> import heapq
>>> d = {"c": 2, "b": 9, "a": 4, "d": 8}
>>> def iter_sorted(d):
        keys = list(d)
        heapq.heapify(keys) # Transforms to heap in O(N) time
        while keys:
            k = heapq.heappop(keys) # takes O(log n) time
            yield (k, d[k])


>>> i = iter_sorted(d)
>>> for x in i:
        print x


('a', 4)
('b', 9)
('c', 2)
('d', 8)

This method still has an O(N log N) sort, however, after a short linear heapify, it yields the items in sorted order as it goes, making it theoretically more efficient when you do not always need the whole list.

jamylak
  • 128,818
  • 30
  • 231
  • 230
4

If you want to sort by the order that items were inserted instead of of the order of the keys, you should have a look to Python's collections.OrderedDict. (Python 3 only)

gecco
  • 17,969
  • 11
  • 51
  • 68
3

sorted returns a list, hence your error when you try to iterate over it, but because you can't order a dict you will have to deal with a list.

I have no idea what the larger context of your code is, but you could try adding an iterator to the resulting list. like this maybe?:

return iter(sorted(dict.iteritems()))

of course you will be getting back tuples now because sorted turned your dict into a list of tuples

ex: say your dict was: {'a':1,'c':3,'b':2} sorted turns it into a list:

[('a',1),('b',2),('c',3)]

so when you actually iterate over the list you get back (in this example) a tuple composed of a string and an integer, but at least you will be able to iterate over it.

pcn
  • 1,007
  • 7
  • 2
2

Assuming you are using CPython 2.x and have a large dictionary mydict, then using sorted(mydict) is going to be slow because sorted builds a sorted list of the keys of mydict.

In that case you might want to look at my ordereddict package which includes a C implementation of sorteddict in C. Especially if you have to go over the sorted list of keys multiple times at different stages (ie. number of elements) of the dictionaries lifetime.

http://anthon.home.xs4all.nl/Python/ordereddict/

Anthon
  • 69,918
  • 32
  • 186
  • 246