36

I'm trying to create a generic function that replaces dots in the keys of a nested dictionary. I have a non-generic function that goes 3 levels deep, but there must be a way to do this generically. Any help is appreciated! My code so far:

output = {'key1': {'key2': 'value2', 'key3': {'key4 with a .': 'value4', 'key5 with a .': 'value5'}}} 

def print_dict(d):
    new = {}
    for key,value in d.items():
        new[key.replace(".", "-")] = {}
        if isinstance(value, dict):
            for key2, value2 in value.items():
                new[key][key2] = {}
                if isinstance(value2, dict):
                    for key3, value3 in value2.items():
                        new[key][key2][key3.replace(".", "-")] = value3
                else:
                    new[key][key2.replace(".", "-")] = value2
        else:
            new[key] = value
    return new

print(print_dict(output))

UPDATE: to answer my own question, I made a solution using json object_hooks:

import json

def remove_dots(obj):
    for key in list(obj.keys()):  # iterate over a snapshot, since obj is modified in the loop
        new_key = key.replace(".","-")
        if new_key != key:
            obj[new_key] = obj[key]
            del obj[key]
    return obj

output = {'key1': {'key2': 'value2', 'key3': {'key4 with a .': 'value4', 'key5 with a .': 'value5'}}}
new_json = json.loads(json.dumps(output), object_hook=remove_dots) 

print(new_json)
funnydman
Bas Tichelaar
  • 9
    To answer your own question, you post an answer, not edit the question. – Oleh Prypin Jul 28 '12 at 12:01
  • Use my solution because my solution is ten times faster. – horejsek Jul 28 '12 at 12:05
  • Great way of doing it. The object_hook really simplifies the whole thing, specially in my situation where I use a "key" named 'include' where it needs to recursively load extra JSON files to form a single multidimensional dictionary. – Talk2 Nov 27 '16 at 23:28
  • 1
    For some inexplicable reason, this object_hook method using the above remove_dots() only replaced *some* of the key names. I have some that kept the dot. Is it possible this is related to some strange ordering problem in the obj.keys() function? Do I need to make an ordered dict? I thought Python3 didn't have the dict ordering problem? – Craig Jackson Apr 25 '18 at 23:41
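
On the question in the last comment: renaming keys in a dict while iterating over that same dict can skip or revisit keys on Python 3. A minimal sketch of a hook that avoids the problem by building a new dict instead of mutating the one being iterated (the name dots_to_dashes is made up for this example):

```python
import json

def dots_to_dashes(obj):
    # json.loads calls the hook once for every decoded JSON object (a dict),
    # innermost objects first, and uses the returned value in its place.
    return {key.replace('.', '-'): value for key, value in obj.items()}

output = {'key1': {'key2': 'value2', 'key3': {'key4 with a .': 'value4', 'key5 with a .': 'value5'}}}
new_json = json.loads(json.dumps(output), object_hook=dots_to_dashes)
print(new_json)
```

Keep in mind that the JSON round trip only works for JSON-serializable data, and that it turns non-string keys into strings and tuples into lists.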

9 Answers

45

Yes, there is a better way:

def print_dict(d):
    new = {}
    for k, v in d.iteritems():
        if isinstance(v, dict):
            v = print_dict(v)
        new[k.replace('.', '-')] = v
    return new

(Edit: It's recursion, more on Wikipedia.)
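
A quick way to check the recursive version against the sample from the question. This sketch uses items() so that it runs on Python 3 (iteritems() exists only on Python 2), and the function name replace_dots is made up for the example:

```python
def replace_dots(d):
    """Return a new dict with dots in the keys replaced, recursing into nested dicts."""
    new = {}
    for k, v in d.items():
        if isinstance(v, dict):
            v = replace_dots(v)  # recurse: nested dicts get the same treatment
        new[k.replace('.', '-')] = v
    return new

output = {'key1': {'key2': 'value2', 'key3': {'key4 with a .': 'value4', 'key5 with a .': 'value5'}}}
cleaned = replace_dots(output)
print(cleaned)
# {'key1': {'key2': 'value2', 'key3': {'key4 with a -': 'value4', 'key5 with a -': 'value5'}}}
```

The function builds a new dictionary at every level, so the original output is left untouched.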

horejsek
  • -1 because it doesn't replace the initial key, it adds a new one with the replaced character – bk0 Jan 31 '14 at 23:45
  • 2
    @bk0 It creates new dictionary. Initial key is not in returned new dictionary. – horejsek Feb 03 '14 at 10:54
  • 10
    This solution only works if all the values are dicts. It fails if a value is a list of dicts - the dicts in the list will not be reached. – aryeh Jul 01 '15 at 03:20
  • 2
    @aryeh Yes, that was also question. Can't write some universal solution for everything. :-) – horejsek Sep 28 '15 at 10:40
22

Actually, all of the other answers share a flaw: the result can end up with different container types than the input.

I'd take the answer of @ngenain and improve it a bit below.

My solution preserves types derived from dict (OrderedDict, defaultdict, etc.) and handles set and tuple as well as list.

I also do a simple type check at the beginning of the function for the most common types, to reduce the number of comparisons (this may give a small speedup on large amounts of data).

Works for Python 3. Replace obj.items() with obj.iteritems() for Py2.

def change_keys(obj, convert):
    """
    Recursively goes through the dictionary obj and replaces keys with the convert function.
    """
    if isinstance(obj, (str, int, float)):
        return obj
    if isinstance(obj, dict):
        new = obj.__class__()
        for k, v in obj.items():
            new[convert(k)] = change_keys(v, convert)
    elif isinstance(obj, (list, set, tuple)):
        new = obj.__class__(change_keys(v, convert) for v in obj)
    else:
        return obj
    return new

If I understand the need correctly, most users want to convert the keys for use with MongoDB, which does not allow dots in key names.
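
A usage sketch for the function above, with a dot-to-dash converter written out (dots_to_dashes and the sample data are made up for the example). It assumes every key is a string:

```python
from collections import OrderedDict

def change_keys(obj, convert):
    """Recursively rebuild obj, passing every dict key through convert."""
    if isinstance(obj, (str, int, float)):
        return obj
    if isinstance(obj, dict):
        new = obj.__class__()  # keeps dict subclasses such as OrderedDict
        for k, v in obj.items():
            new[convert(k)] = change_keys(v, convert)
    elif isinstance(obj, (list, set, tuple)):
        new = obj.__class__(change_keys(v, convert) for v in obj)
    else:
        return obj
    return new

def dots_to_dashes(key):
    # Hypothetical converter; assumes the key is a string
    return key.replace('.', '-')

data = OrderedDict([
    ('first.key', [{'a.b': 1}, ('x.y', {'c.d': 2})]),
    ('second.key', 'plain.value'),
])
result = change_keys(data, dots_to_dashes)
print(type(result).__name__)  # OrderedDict
print(result['first-key'])    # [{'a-b': 1}, ('x.y', {'c-d': 2})]
```

Only keys are converted; string values such as 'x.y' and 'plain.value' pass through unchanged. Rebuilding containers through obj.__class__ assumes the class can be called with no arguments (for dicts) or with a single iterable (for sequences), so a defaultdict comes back without its default_factory and a namedtuple would not survive the rebuild.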

baldr
  • 1
    This one is the best. It supports both python2 and python3 but the section " if isinstance(obj, (str, int, float))" is not needed. It works without this line too. – F.Tamy Mar 11 '19 at 09:40
  • 6
    Nice answer. Just for completeness, I'll add the convert function for your answer: `def convert(k): return k.replace('.', '-')` – John Jan 06 '20 at 07:05
  • @F.Tamy ... it saves time when processing large dictionaries. – ZF007 Feb 21 '22 at 21:47
  • You can write type(obj)(something) instead of obj.__class__( – funnydman Jul 12 '22 at 13:03
8

I used the code by @horejsek, but I adapted it to accept nested dictionaries with lists and a function that replaces the string.

I had a similar problem to solve: I wanted to replace keys in underscore lowercase convention for camel case convention and vice versa.

def change_dict_naming_convention(d, convert_function):
    """
    Convert a nested dictionary from one convention to another.
    Args:
        d (dict): dictionary (nested or not) to be converted.
        convert_function (func): function that takes the string in one convention and returns it in the other one.
    Returns:
        Dictionary with the new keys.
    """
    new = {}
    for k, v in d.iteritems():
        new_v = v
        if isinstance(v, dict):
            new_v = change_dict_naming_convention(v, convert_function)
        elif isinstance(v, list):
            new_v = list()
            for x in v:
                new_v.append(change_dict_naming_convention(x, convert_function))
        new[convert_function(k)] = new_v
    return new
jllopezpino
  • It works unless d is not a dict so you can't call d.items(). My dict contains an array of strings, which fails when it recurses. Check in the root of the function for isinstance(d, dict) and if false, just return d. Then it should work for anything. – Mnebuerquo Nov 28 '18 at 20:03
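
The guard suggested in the comment above could look roughly like this. It is a sketch for Python 3 (items() instead of iteritems()), and snake_to_camel is a made-up converter used only to exercise it:

```python
def change_dict_naming_convention(d, convert_function):
    """Convert the keys of a nested dictionary with convert_function."""
    if not isinstance(d, dict):
        return d  # strings, numbers and other non-dict values are returned as-is
    new = {}
    for k, v in d.items():
        new_v = v
        if isinstance(v, dict):
            new_v = change_dict_naming_convention(v, convert_function)
        elif isinstance(v, list):
            new_v = [change_dict_naming_convention(x, convert_function) for x in v]
        new[convert_function(k)] = new_v
    return new

def snake_to_camel(key):
    # Hypothetical converter: 'zip_code' becomes 'zipCode'
    first, *rest = key.split('_')
    return first + ''.join(part.title() for part in rest)

data = {'user_name': 'x', 'favorite_tags': ['a_b', 'c'], 'home_address': [{'zip_code': '1'}]}
converted = change_dict_naming_convention(data, snake_to_camel)
print(converted)
# {'userName': 'x', 'favoriteTags': ['a_b', 'c'], 'homeAddress': [{'zipCode': '1'}]}
```

The list of plain strings no longer breaks the recursion. A list nested directly inside another list is still returned unchanged, because the guard only recurses into dicts.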
7

Here's a simple recursive solution that deals with nested lists and dictionaries.

def change_keys(obj, convert):
    """
    Recursively goes through the dictionary obj and replaces keys with the convert function.
    """
    if isinstance(obj, dict):
        new = {}
        for k, v in obj.iteritems():
            new[convert(k)] = change_keys(v, convert)
    elif isinstance(obj, list):
        new = []
        for v in obj:
            new.append(change_keys(v, convert))
    else:
        return obj
    return new
ngenain
  • Good point, but it forces type cast from `dict`-derived classes back to `dict`. For example you may lose keys order for `OrderedDict`. I have published an improved answer based on your one. – baldr Jul 08 '16 at 15:08
  • For python3 use obj.items(): instead. – DropItLikeItsHot Oct 02 '20 at 22:34
2

You have to remove the original key, but you can't do it in the body of the loop because it will throw RuntimeError: dictionary changed size during iteration.

To solve this, iterate through a copy of the original object, but modify the original object:

def change_keys(obj):
    new_obj = list(obj)  # a copy of the keys, so obj can be modified inside the loop
    for k in new_obj:
        if hasattr(obj[k], '__getitem__'):
            change_keys(obj[k])
        if '.' in k:
            obj[k.replace('.', '$')] = obj[k]
            del obj[k]

>>> foo = {'foo': {'bar': {'baz.121': 1}}}
>>> change_keys(foo)
>>> foo
{'foo': {'bar': {'baz$121': 1}}}
bk0
  • It gives the following error `TypeError: string indices must be integers` in `if hasattr(obj[k], '__getitem__'):` line. – Gürol Canbek Jun 05 '16 at 14:14
  • Instead of `hasattr(...)` try: `from collections.abc import Mapping` and then `if isinstance(obj[k], Mapping)...`. This change has the same goal (trying to determine if the value is a [nested] dictionary), but should be more stable. – lnNoam May 21 '18 at 06:50
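
Taking the two comments above together, an in-place version that recurses only into mappings might look like this. It is a sketch assuming Python 3.3+, where the abstract base classes live in collections.abc; the function name and the old/new parameters are made up for the example, and MutableMapping is used rather than Mapping because the nested values are modified:

```python
from collections.abc import MutableMapping

def change_keys_in_place(obj, old='.', new='$'):
    """Rename keys containing old, mutating obj and any nested mappings."""
    for k in list(obj):  # snapshot of the keys: obj is modified inside the loop
        value = obj[k]
        if isinstance(value, MutableMapping):
            change_keys_in_place(value, old, new)  # strings are never recursed into
        if isinstance(k, str) and old in k:
            obj[k.replace(old, new)] = value
            del obj[k]

foo = {'foo': {'bar': {'baz.121': 1, 'name': 'a.b'}}}
change_keys_in_place(foo)
print(foo)
```

String values such as 'a.b' are left alone, which avoids the TypeError reported in the first comment.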
1

You can dump everything to a JSON string, run the replacement over the whole string, and load the JSON back:

import json

def nested_replace(data, old, new):
    json_string = json.dumps(data)
    replaced = json_string.replace(old, new)
    fixed_json = json.loads(replaced)
    return fixed_json

Or use a one-liner:

def short_replace(data, old, new):
    return json.loads(json.dumps(data).replace(old, new))
Ariel Voskov
  • This will replace string occurrences in both values as well as keys. The original answer asked for a solution for keys. If the replace was switched to a RegEx approach, it may be possible to apply only to keys. It's a brute force approach, though, not very memory efficient. – ingyhere Dec 15 '20 at 17:29
  • @ingyhere In my case I was converting XML to json and trying to strip out the @ symbol, and had no worries about affecting values. For me, this solution was terse and adequate. – fish Jan 05 '22 at 23:07
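
To make the caveat from the comments concrete, here is a small sketch (the sample data is made up) showing that the string replacement touches values as well as keys, and that it can even produce invalid JSON when the replaced text also occurs inside a number:

```python
import json

def nested_replace(data, old, new):
    return json.loads(json.dumps(data).replace(old, new))

# Keys and string values are both rewritten
result = nested_replace({'key.a': 'value.b'}, '.', '-')
print(result)  # {'key-a': 'value-b'}

# A float serializes as 1.5, which becomes the invalid token 1-5
try:
    nested_replace({'price.usd': 1.5}, '.', '-')
    failed = False
except json.JSONDecodeError as exc:
    failed = True
    print('invalid JSON after replacement:', exc)
```

So this approach is only safe when the text being replaced cannot appear in values or in the JSON syntax itself.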
0

jllopezpino's answer works, but only when the top-level object is a dictionary. Here is mine, which works whether the original variable is a list or a dict.

import re

def fix_camel_cases(data):
    def convert(name):
        # https://stackoverflow.com/questions/1175208/elegant-python-function-to-convert-camelcase-to-snake-case
        s1 = re.sub('(.)([A-Z][a-z]+)', r'\1_\2', name)
        return re.sub('([a-z0-9])([A-Z])', r'\1_\2', s1).lower()

    if isinstance(data, dict):
        new_dict = {}
        for key, value in data.items():
            value = fix_camel_cases(value)
            snake_key = convert(key)
            new_dict[snake_key] = value
        return new_dict

    if isinstance(data, list):
        new_list = []
        for value in data:
            new_list.append(fix_camel_cases(value))
        return new_list

    return data
James Lin
0

Here's a one-liner variant of @horejsek's answer using a dict comprehension, for those who prefer it:

def print_dict(d):
    return {k.replace('.', '-'): print_dict(v) for k, v in d.items()} if isinstance(d, dict) else d

I've only tested this in Python 2.7

ecoe
0

I am guessing you have the same issue as I have, inserting dictionaries into a MongoDB collection, encountering exceptions when trying to insert dictionaries that have keys with dots (.) in them.

This solution is essentially the same as most other answers here, but it is slightly more compact, and perhaps less readable in that it uses a single statement and calls itself recursively. For Python 3.

def replace_keys(my_dict):
    return { k.replace('.', '(dot)'): replace_keys(v) if isinstance(v, dict) else v for k, v in my_dict.items() }