28

I have a nested OrderedDict I would like to convert to a dict. Applying dict() on it apparently only converts the outermost layer of the last entry.

from collections import OrderedDict

od = OrderedDict(
    [
        (u'name', u'Alice'),
        (u'ID', OrderedDict(
            [
                (u'type', u'card'),
                (u'nr', u'123')
            ]
        )),
        (u'name', u'Bob'),
        (u'ID', OrderedDict(
            [
                (u'type', u'passport'),
                (u'nr', u'567')
            ]
        ))
    ]
)

print(dict(od))

Output:

{u'name': u'Bob', u'ID': OrderedDict([(u'type', u'passport'), (u'nr', u'567')])}

Is there a direct method to convert all the occurrences?
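
A note on the output above: the first `name` and `ID` entries are missing because an OrderedDict keeps only one value per key, so the later pairs overwrite the earlier ones before dict() is ever called. dict() itself makes a shallow copy, which is why the nested value keeps its type. A minimal sketch of both effects (the variable names are only for illustration):

```python
from collections import OrderedDict

pairs = [
    ("name", "Alice"),
    ("ID", OrderedDict([("type", "card"), ("nr", "123")])),
    ("name", "Bob"),
    ("ID", OrderedDict([("type", "passport"), ("nr", "567")])),
]
od = OrderedDict(pairs)

# Duplicate keys collapse while the OrderedDict is built: two keys remain.
print(len(od))      # 2
print(od["name"])   # Bob

# dict() copies only the top level, so the nested value is unchanged.
shallow = dict(od)
print(type(shallow["ID"]).__name__)  # OrderedDict
```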

WoJ
  • Do you want to convert only `OrderedDict` instances? – Patrick Collins Jul 31 '14 at 08:19
  • Why do you want to convert it? You can use an `OrderedDict` pretty much anywhere a `dict` works. – jonrsharpe Jul 31 '14 at 08:21
  • @PatrickCollins: Sorry, I do not understand your question. I would like all the OrderedDicts to be converted to dicts, for all the elements (I clarified this in my question a few seconds ago) – WoJ Jul 31 '14 at 08:21
  • @jonrsharpe: I will get a huge OrderedDict (hundreds of megs to a few gigs) and I read that the memory overhead is large (about twice). Since I do not need the order I would at least chomp on that to keep it manageable. – WoJ Jul 31 '14 at 08:24

7 Answers

45

The simplest solution is to use json.dumps and json.loads:

from json import loads, dumps
from collections import OrderedDict

def to_dict(input_ordered_dict):
    return loads(dumps(input_ordered_dict))

NOTE: The above code works only for dictionaries whose contents json can serialize by default. The supported types are listed in the json module documentation.

So, this should be enough if the ordered dictionary does not contain special values.
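
A quick check of the round trip on made-up data (the sample values are assumptions, not from the question). It also shows two side effects worth knowing about: non-string keys come back as strings, and tuples come back as lists.

```python
from collections import OrderedDict
from json import dumps, loads

def to_dict(input_ordered_dict):
    # Serialize to JSON text and parse it back; the parser builds plain dicts.
    return loads(dumps(input_ordered_dict))

od = OrderedDict([
    ("name", "Alice"),
    ("ids", [OrderedDict([("type", "card"), ("nr", 123)])]),
    (1, ("a", "b")),
])
result = to_dict(od)

print(type(result).__name__)            # dict
print(type(result["ids"][0]).__name__)  # dict
print(result["1"])                      # ['a', 'b']
```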

EDIT: Based on the comments, let us improve the above code. Let us say, the input_ordered_dict might contain custom class objects that cannot be serialized by json by default. In that scenario, we should use the default parameter of json.dumps with a custom serializer of ours.

For example:

from collections import OrderedDict as odict
from json import loads, dumps

class Name(object):
    def __init__(self, name):
        name = name.split(" ", 1)
        self.first_name = name[0]
        self.last_name = name[-1]

a = odict()
a["thiru"] = Name("Mr Thiru")
a["wife"] = Name("Mrs Thiru")
a["type"] = "test" # This is by default serializable

def custom_serializer(obj):
    if isinstance(obj, Name):
        return obj.__dict__

try:
    b = dumps(a)
except TypeError:
    # Raised because the Name objects are not serializable by default
    pass
b = dumps(a, default=custom_serializer)
# Produces the desired output

This example can be extended much further. We can even add filters or modify values as needed. Just add an else branch to the custom_serializer function:

def custom_serializer(obj):
    if isinstance(obj, Name):
        return obj.__dict__
    else:
        # Will get into this if the value is not serializable by default 
        # and is not a Name class object
        return None

With a custom serializer, the to_dict function given at the top becomes:

from json import loads, dumps
from collections import OrderedDict

def custom_serializer(obj):
    if isinstance(obj, Name):
        return obj.__dict__
    else:
        # Will get into this if the value is not serializable by default 
        # and is also not a Name class object
        return None

def to_dict(input_ordered_dict):
    return loads(dumps(input_ordered_dict, default=custom_serializer))
thiruvenkadam
  • 5
    This will break if `repr` isn't properly defined for any object in your dictionary. – Patrick Collins Apr 09 '15 at 21:28
  • 1
    It will also break if the dictionary contains any objects whose constructors are not currently in your scope, or if it contains objects whose constructors are in your scope, but under a different name. – Patrick Collins Apr 09 '15 at 21:35
  • this also breaks if your dict contains items that are not json serializable. – Paul T. Sep 04 '17 at 19:43
7

This should work:

import collections

def deep_convert_dict(layer):
    to_ret = layer
    if isinstance(layer, collections.OrderedDict):
        to_ret = dict(layer)

    try:
        for key, value in to_ret.items():
            to_ret[key] = deep_convert_dict(value)
    except AttributeError:
        pass

    return to_ret

Although, as jonrsharpe mentioned, there's probably no reason to do this -- an OrderedDict (by design) works wherever a dict does.
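
To see what the function above does and does not touch, here is a small check. The sample data is made up for illustration, and the function body is copied from this answer. Nested OrderedDicts are converted, but an OrderedDict sitting inside a list is left alone, because a list has no .items() and the AttributeError branch returns it unchanged.

```python
import collections

def deep_convert_dict(layer):
    to_ret = layer
    if isinstance(layer, collections.OrderedDict):
        to_ret = dict(layer)

    try:
        for key, value in to_ret.items():
            to_ret[key] = deep_convert_dict(value)
    except AttributeError:
        pass

    return to_ret

data = collections.OrderedDict([
    ("ID", collections.OrderedDict([("type", "card")])),
    ("history", [collections.OrderedDict([("nr", "123")])]),
])
out = deep_convert_dict(data)

print(type(out).__name__)                # dict
print(type(out["ID"]).__name__)          # dict
print(type(out["history"][0]).__name__)  # OrderedDict
```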

Patrick Collins
  • 3
    Thank you - it works for my example. I will have to think about how to accommodate lists of dicts (which I discovered I also have in the data I get) – WoJ Jul 31 '14 at 08:42
  • @WoJ this solution works for any kind of nested iterables, regardless of whether they're dictionaries or not. – Patrick Collins Apr 09 '15 at 21:36
  • @PatrickCollins they would like to _recurse_ through lists of dicts – jberryman Sep 01 '19 at 23:06
3

You should leverage Python's builtin copy mechanism.

You can override copying behavior for OrderedDict via Python's copyreg module (also used by pickle). Then you can use Python's builtin copy.deepcopy() function to perform the conversion.

import copy
import copyreg
from collections import OrderedDict

def convert_nested_ordered_dict(x):
    """
    Perform a deep copy of the given object, but convert
    all internal OrderedDicts to plain dicts along the way.

    Args:
        x: Any pickleable object

    Returns:
        A copy of the input, in which all OrderedDicts contained
        anywhere in the input (as iterable items or attributes, etc.)
        have been converted to plain dicts.
    """
    # Temporarily install a custom pickling function
    # (used by deepcopy) to convert OrderedDict to dict.
    orig_pickler = copyreg.dispatch_table.get(OrderedDict, None)
    copyreg.pickle(
        OrderedDict,
        lambda d: (dict, ([*d.items()],))
    )
    try:
        return copy.deepcopy(x)
    finally:
        # Restore the original OrderedDict pickling function (if any)
        del copyreg.dispatch_table[OrderedDict]
        if orig_pickler:
            copyreg.dispatch_table[OrderedDict] = orig_pickler
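
As a rough usage sketch, the function above can be exercised on made-up data that nests an OrderedDict inside a list and a tuple. The function is repeated in condensed form so the snippet runs on its own; the sample data and variable names are assumptions for illustration.

```python
import copy
import copyreg
from collections import OrderedDict

def convert_nested_ordered_dict(x):
    # Condensed copy of the function above: temporarily tell deepcopy
    # to rebuild every OrderedDict as a plain dict.
    orig_pickler = copyreg.dispatch_table.get(OrderedDict, None)
    copyreg.pickle(OrderedDict, lambda d: (dict, (list(d.items()),)))
    try:
        return copy.deepcopy(x)
    finally:
        del copyreg.dispatch_table[OrderedDict]
        if orig_pickler:
            copyreg.dispatch_table[OrderedDict] = orig_pickler

shared = OrderedDict([("nr", "123")])
data = OrderedDict([("first", [shared]), ("second", (shared,))])
out = convert_nested_ordered_dict(data)

print(type(out).__name__)                   # dict
print(type(out["first"][0]).__name__)       # dict
print(out["first"][0] is out["second"][0])  # True
print(type(data["first"][0]).__name__)      # OrderedDict
```

The input is left untouched, since deepcopy builds a new structure.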

Merely by using Python's builtin copying infrastructure, this solution is superior to all other answers presented here, in the following ways:

  • Works for more than just JSON data.

  • Does not require you to implement special logic for each possible element type (e.g. list, tuple, etc.)

  • deepcopy() will properly handle duplicate references within the collection:

    x = [1,2,3]
    d = {'a': x, 'b': x}
    assert d['a'] is d['b']
    
    d2 = copy.deepcopy(d)
    assert d2['a'] is d2['b']
    

    Since our solution is based on deepcopy() we'll have the same advantage.

  • This solution also converts attributes that happen to be OrderedDict, not only collection elements:

    class C:
        def __init__(self, a):
            self.a = a
    
        def __repr__(self):
            return f"C(a={self.a})"
    
    c = C(OrderedDict([(1, 'one'), (2, 'two')]))
    print("original: ", c)
    print("converted:", convert_nested_ordered_dict(c))
    
    original:  C(a=OrderedDict([(1, 'one'), (2, 'two')]))
    converted: C(a={1: 'one', 2: 'two'})
    
Stuart Berg
2

NOTE: This answer is only partially correct; see https://stackoverflow.com/a/25057250/1860929 for why the two dicts report the same size.

Original Answer

This doesn't answer the question of the conversion; it's more about what needs to be done.

The basic assumption that an OrderedDict is twice the size of Dict is flawed. Check this:

import sys
import random
from collections import OrderedDict

test_dict = {}
test_ordered_dict = OrderedDict()

for key in range(10000):
    test_dict[key] = random.random()
    test_ordered_dict[key] = random.random()

print(sys.getsizeof(test_dict))
print(sys.getsizeof(test_ordered_dict))

Output:

786712
786712

Basically, both are the same size.

However, the time taken for the operations is not the same. In fact, creating a large dictionary (with 100-10000 keys) is around 7-8x faster than creating an OrderedDict with the same keys. (Verified using %timeit in ipython)

import sys
import random
from collections import OrderedDict


def operate_on_dict(r):
    test_dict = {}
    for key in range(r):
        test_dict[key] = random.random()

def operate_on_ordered_dict(r):
    test_ordered_dict = OrderedDict()
    for key in range(r):
        test_ordered_dict[key] = random.random()

%timeit for x in range(100): operate_on_ordered_dict(100)
100 loops, best of 3: 9.24 ms per loop

%timeit for x in range(100): operate_on_dict(100)
1000 loops, best of 3: 1.23 ms per loop
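
The %timeit lines above need IPython. A rough equivalent with the standard-library timeit module is sketched below; the repeat count of 1000 is an arbitrary choice, and the ratio you see will likely differ from the figures above depending on the Python version and machine.

```python
import random
import timeit
from collections import OrderedDict

def operate_on_dict(r):
    test_dict = {}
    for key in range(r):
        test_dict[key] = random.random()

def operate_on_ordered_dict(r):
    test_ordered_dict = OrderedDict()
    for key in range(r):
        test_ordered_dict[key] = random.random()

# Total seconds for 1000 calls of each function with 100 keys.
dict_time = timeit.timeit(lambda: operate_on_dict(100), number=1000)
od_time = timeit.timeit(lambda: operate_on_ordered_dict(100), number=1000)

print(f"dict:        {dict_time:.4f} s")
print(f"OrderedDict: {od_time:.4f} s")
```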

So, IMO, you should focus on reading the data directly into a dict and operating on it, rather than first creating an OrderedDict and then repeatedly converting it to a dict.

Anshul Goyal
  • Interesting. I based my opinion on [another SO answer](http://stackoverflow.com/a/18951209/903011) which stated that _"(an `OrderedDict`) is not a lot slower, but at least doubles the memory over using a plain `dict`"_. Nevertheless I do not have a choice as the `OrderedDict` is returned from a function I do not have control over. – WoJ Jul 31 '14 at 09:43
  • @WoJ My turn to say interesting :) That answer is from Tim Peters who wrote TimSort, and I am really stumped right now. Have asked a [question](http://stackoverflow.com/q/25056387/1860929) on the same. – Anshul Goyal Jul 31 '14 at 10:21
0

I wrote a recursive method to convert an OrderedDict to a simple dict.

from collections import OrderedDict

def recursive_ordered_dict_to_dict(ordered_dict):
    simple_dict = {}

    for key, value in ordered_dict.items():
        if isinstance(value, OrderedDict):
            simple_dict[key] = recursive_ordered_dict_to_dict(value)
        else:
            simple_dict[key] = value

    return simple_dict

Note: OrderedDicts and dicts are usually interchangeable, but I ran into an issue when running an assert between the two types using pytest.

vpontis
0

Here's a version that also handles lists and tuples. In a comment, the OP mentions that lists of dicts are also a case to handle.

Note, this also converts the tuples to lists. Preserving tuples is left as an exercise for the reader :)

from collections import OrderedDict

def od2d(val):
    if isinstance(val, (OrderedDict, dict)):
        return {k: od2d(v) for k, v in val.items()}
    elif isinstance(val, (tuple, list)):
        return [od2d(v) for v in val]
    else:
        return val
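
One possible take on the exercise mentioned above is to give tuples their own branch. The name od2d_keep_tuples is hypothetical, and this sketch assumes plain tuples: a namedtuple would come back as an ordinary tuple.

```python
from collections import OrderedDict

def od2d_keep_tuples(val):
    # OrderedDict is a dict subclass, so one isinstance check covers both.
    if isinstance(val, dict):
        return {k: od2d_keep_tuples(v) for k, v in val.items()}
    if isinstance(val, tuple):
        # Rebuild as a tuple instead of a list.
        return tuple(od2d_keep_tuples(v) for v in val)
    if isinstance(val, list):
        return [od2d_keep_tuples(v) for v in val]
    return val

data = OrderedDict([
    ("pair", (OrderedDict([("nr", "123")]), "x")),
    ("rows", [OrderedDict([("nr", "567")])]),
])
out = od2d_keep_tuples(data)

print(type(out["pair"]).__name__)     # tuple
print(type(out["pair"][0]).__name__)  # dict
print(type(out["rows"][0]).__name__)  # dict
```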
Michael Dorner
rrauenza
0

This code should work with nested lists.

import collections
from typing import Union

def nested_convert_to_dict(data: Union[dict, collections.OrderedDict]):
    # Copy an OrderedDict into a plain dict; anything else is used as-is.
    if isinstance(data, collections.OrderedDict):
        res = dict(data)
    else:
        res = data
    try:
        for key, value in res.items():
            if isinstance(value, list):
                # Convert dicts found directly inside a list value.
                res[key] = [nested_convert_to_dict(item) for item in value]
            else:
                res[key] = nested_convert_to_dict(value)
    except AttributeError:
        # Not a mapping (no .items()), so there is nothing to recurse into.
        pass
    return res