1

So I have this dictionary, which I'm converting to OrderedDict. On conversion, I'd like to sort the OrderedDict by the order in which its keys appear in a separate tuple. Any other values I'd like to append to the end of the OrderedDict. Nonexistent keys in the ordering tuple should be ignored.

I think I have most of it, but I'm having trouble wrapping my brain around the lambda sorted functions. Can you help me iron it out?

from collections import OrderedDict

d = {
    'spam': 'tasty',
    'subtitle': 'A Subtitle',
    'title': 'Test Title',
    'foo': 'bar',
}

key_order = ('title', 'subtitle', 'non_in_dictionary')
ordered_dict = OrderedDict(sorted(d.items(), key=lambda ???? ))

should produce ordered_dict:

{
    'title': 'Test Title',
    'subtitle': 'A Subtitle',
    'spam': 'tasty',
    'foo': 'bar',
}
allanberry
  • What order do you intend for the keys that aren't in your `key_order` tuple? – tzaman May 28 '15 at 19:43
  • @jonrsharpe, my issue is I don't know *what* to try; I'm at a loss. – allanberry May 28 '15 at 19:47
  • @tzaman, I'm not picky. Alphabetical, I suppose. I'm bringing in the data actually from a JSON file (with inherent order, although JSON doesn't care), so I'm putting `spam` before `foo` for that reason. – allanberry May 28 '15 at 19:49
  • @niteshade only just seen you mention *I'm bringing in the data actually from a JSON file* - look at using `object_pairs_hook` when loading the JSON - see [this post](http://stackoverflow.com/questions/6921699/can-i-get-json-to-load-into-an-ordereddict-in-python) for examples. That'd be a far more efficient solution if you know your attributes are in the correct order in the JSON data. – Jon Clements May 28 '15 at 21:02
  • @JonClements, thanks very much, looks good. I'll keep this in mind for future reference. Most of the JSON elements are in the correct order, but I still need to pull a few to the top, so I'll have some sorting to do anyways. I really appreciate it regardless. – allanberry May 28 '15 at 21:47
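The `object_pairs_hook` suggestion in the comments above, combined with the need to pull a few keys to the top, could be sketched roughly as follows. The JSON string `raw` is a made-up stand-in for the real file, and `move_to_end` requires Python 3.2 or later; treat this as one possible approach rather than the commenters' exact method.

```python
import json
from collections import OrderedDict

# Hypothetical JSON payload standing in for the real file contents
raw = '{"spam": "tasty", "subtitle": "A Subtitle", "title": "Test Title", "foo": "bar"}'
key_order = ('title', 'subtitle', 'non_in_dictionary')

# Preserve the key order found in the JSON text
od = json.loads(raw, object_pairs_hook=OrderedDict)

# Move the preferred keys to the front; iterating in reverse
# leaves them in key_order's order once all moves are done
for k in reversed(key_order):
    if k in od:  # keys missing from the data are skipped
        od.move_to_end(k, last=False)

print(list(od))  # ['title', 'subtitle', 'spam', 'foo']
```

Each `move_to_end` call touches a single key, so the remaining keys keep the order they had in the JSON.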

3 Answers

4

Another method which is a bit simpler than sorting is to create an OrderedDict from the titles, then update the OrderedDict from your original dict:

from collections import OrderedDict

d = {
    'spam': 'tasty',
    'subtitle': 'A Subtitle',
    'title': 'Test Title',
    'foo': 'bar',
}

key_order = ('title', 'subtitle')
od = OrderedDict((k, d[k]) for k in key_order)
od.update(d)

Or as tzaman suggests in comments, as the values will always be set on the following update, you can construct the original OrderedDict with:

od = OrderedDict.fromkeys(key_order)
od.update(d)
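As a minimal sketch, the same idea can be wrapped in a helper that also skips names in `key_order` that are missing from the dictionary, which the question asks for. The function name `order_by_keys` is hypothetical and used only for illustration.

```python
from collections import OrderedDict

def order_by_keys(d, key_order):
    """Return an OrderedDict with key_order's keys first, then the rest of d."""
    # Seed the order with the preferred keys, skipping any that d lacks
    od = OrderedDict((k, d[k]) for k in key_order if k in d)
    # update() keeps existing keys in place and appends the remaining ones
    od.update(d)
    return od

d = {
    'spam': 'tasty',
    'subtitle': 'A Subtitle',
    'title': 'Test Title',
    'foo': 'bar',
}
key_order = ('title', 'subtitle', 'non_in_dictionary')

ordered_dict = order_by_keys(d, key_order)
print(list(ordered_dict)[:2])  # ['title', 'subtitle']
```

The remaining keys follow in whatever order `d` iterates them: insertion order on Python 3.7+, arbitrary on older versions.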
Jon Clements
  • A little simpler: `od = OrderedDict.fromkeys(key_order)` -- the values don't matter. – tzaman May 28 '15 at 19:47
  • @Tzaman ahhh... clever - the values will get updated from the following `.update` :) – Jon Clements May 28 '15 at 19:48
  • @Sir_FZ not for the initial creation. They'll get set during the `update` call. – tzaman May 28 '15 at 19:48
  • Good catch. Though I thought you meant they didn't matter at all – sirfz May 28 '15 at 19:49
  • would this be appreciably slower than ordering the OD at creation? This function will be chewing through HUGE dictionaries, so I'd like it as efficient as possible. (that's why I assumed the lambda sort would be best.) – allanberry May 28 '15 at 19:58
  • @niteshade: It shouldn't be appreciably slower; if anything, I'd expect it to be faster. – user2357112 May 28 '15 at 20:00
  • @niteshade dictionary updates are fast - I'd personally expect it to outperform creating an index lookup, then sorting the `dict`'s items against that lookup using a lambda... thus creating a `list` that gets put into an OD. Where the `key_order` is a large subset of the original `dict` there'll be unnecessary replacements - but those are key updates, so they should be fast and avoid the memory cost of creating a list. – Jon Clements May 28 '15 at 20:05
  • @niteshade this will almost certainly be faster, particularly with the `fromkeys` formulation. The first pass will just set the key order, and the second pass will iterate over your source dictionary once; no sorting at all so it'll theoretically be `O(n)` instead of `O(n lg n)` which is a big win for huge dicts. `sorted` will also be creating an intermediate copy which is bad memory-wise. – tzaman May 28 '15 at 20:07
  • OK, excellent. Thanks. The one problem I'm having is handling keys not in the dict (not your fault, I know... I should edit the question). Like, if `title` doesn't exist in the `dict`. Can you think of any easy way to fix? – allanberry May 28 '15 at 20:09
  • @niteshade `od = OrderedDict.fromkeys(filter(d.__contains__, key_order))`. Or in the original version: `od = OrderedDict((k, d[k]) for k in key_order if k in d)` – tzaman May 28 '15 at 20:14
0
ordered_dict = OrderedDict(sorted(d.items(),
               key=lambda x: key_order.index(x[0]) if x[0] in key_order else 10e10))

This uses a key of the index of the given item in the key order. If it isn't found, it uses a large number instead, putting it at the end. It produces the following OrderedDict:

OrderedDict([('title', 'Test Title'), ('subtitle', 'A Subtitle'), ('foo', 'bar'), ('spam', 'tasty')])

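foo comes before spam because the original dictionary had an arbitrary order (plain dicts only guarantee insertion order from Python 3.7 onward). sorted is stable, so keys that share the fallback value keep whatever order the dict yields them in.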
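If a deterministic order for the leftover keys is wanted (the asker mentions alphabetical in the comments), one option is a tuple sort key. This is a sketch under that assumption; the helper name `sort_key` is hypothetical.

```python
from collections import OrderedDict

d = {
    'spam': 'tasty',
    'subtitle': 'A Subtitle',
    'title': 'Test Title',
    'foo': 'bar',
}
key_order = ('title', 'subtitle', 'non_in_dictionary')

def sort_key(item):
    key = item[0]
    if key in key_order:
        # Group 0: listed keys, ranked by their position in key_order
        return (0, key_order.index(key), key)
    # Group 1: everything else, ordered alphabetically by key
    return (1, 0, key)

ordered_dict = OrderedDict(sorted(d.items(), key=sort_key))
print(list(ordered_dict))  # ['title', 'subtitle', 'foo', 'spam']
```

Because the tuples compare element by element, listed keys always sort ahead of unlisted ones, and no "large number" sentinel is needed.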

TigerhawkT3
-1
from collections import OrderedDict

d = {
    'spam': 'tasty',
    'subtitle': 'A Subtitle',
    'title': 'Test Title',
    'foo': 'bar',
}

key_order = ('title', 'subtitle')
key_order = OrderedDict((k, i) for i, k in enumerate(key_order))
ordered_dict = OrderedDict(sorted(d.iteritems(), key=lambda k: key_order.get(k[0], float("inf"))))

In Python 3, use items() instead of iteritems().
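A Python 3 rendering of the same idea might look like the sketch below. It keeps the position lookup in a separate plain dict, here called `rank` (a name chosen for illustration), instead of rebinding `key_order`.

```python
from collections import OrderedDict

d = {
    'spam': 'tasty',
    'subtitle': 'A Subtitle',
    'title': 'Test Title',
    'foo': 'bar',
}
key_order = ('title', 'subtitle', 'non_in_dictionary')

# Map each preferred key to its position for constant-time lookups
rank = {k: i for i, k in enumerate(key_order)}

# Keys without a rank get infinity, so they sort after every ranked key
ordered_dict = OrderedDict(
    sorted(d.items(), key=lambda item: rank.get(item[0], float("inf")))
)
print(list(ordered_dict)[:2])  # ['title', 'subtitle']
```

The dict lookup avoids the repeated linear scan of `key_order.index`, which may matter when `key_order` is long.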

sirfz