0

There are lots of great solutions to flattening a Python dictionary, but most recursive methods seem to automatically add a 'parent key' to key when the value is a dictionary (regardless of whether the nested key is-so far- unique) - like in this solution. This is the flattening function that automatically adds the parent key to the key when nested.

def flatten_dict(item, parent_key='', sep='_'):
    final = []
    for key, val in item.items():
        new_key = parent_key + sep + key if parent_key else key        
        if isinstance(val, dict):
            final.extend(flatten_dict(val, parent_key=new_key).items())
        else:        
            final.append((new_key, val))
    return dict(final)

I have tried to pass a set of 'used keys' through to flatten_dict and still get parent key added to key (I imagine that they are being passed through multiple times in the recursion and being marked as used). Is there a way using recursion to only add parent_key to key if key is not unique?

For example:

flatten_dict({'a':1, 'b': {'a': 1, 'c': 1}}) 

returns:

{'a': 1, 'b_a':1, 'b_c': 1}

But ideally I'd like:

{'a': 1, 'b_a': 1, 'c': 1} # because 'c' is unique 

Thanks for any help!

slevin886
  • 261
  • 2
  • 10

1 Answers1

2

I recommend using an "accumulator" dict as input parameter instead of a list. This enables efficient lookup whether the key already exists.

def flat_dict(d, acc=None, parent_key=None, sep="_"):
    out = dict() if acc is None else acc
    for k, v in d.items():
        if type(v) is dict:
            flat_dict(v, out, parent_key=str(k))
        else:
            if k in out:
                k = parent_key + sep + str(k)
            out[k] = v

    return out

If all your keys are already strings, you can of course drop the str casts.

clemisch
  • 967
  • 8
  • 18
  • That did it! Thank you- it was my first attempt at writing a recursive function – slevin886 Jul 30 '19 at 18:42
  • Mutable objects (list, dicts, sets, ...) are in general very useful for recursive functions. You can pass them around and mutate them in the "inner" recursive calls. **Beware** however of using `acc=dict()` or the like in the function definition, as this object is instantiated when defining the function, *not* every time when calling it. – clemisch Jul 30 '19 at 18:46