3

Let's say I have a Python dict that may contain other dicts nested to an arbitrary level. Also, some of the keys refer to boolean choices while others don't. Something like this:

{'Key1': 'none',
 'Key2': {'Key2a': True, 'Key2b': False},
 'Key3': {'Key3a': {'Key3a1': 'some', 'Key3a2': 'many'}, 'Key3b': True}}

What I'd like to do is transform it into this:

{'Key1_none': 1,
 'Key2_Key2a': 1,
 'Key2_Key2b': 0,
 'Key3_Key3a_Key3a1_some': 1,
 'Key3_Key3a_Key3a2_many': 1,
 'Key3_Key3b': 1}

Now not only is the dict flattened, but all of the keys have boolean values (1 or 0). The solution I linked to is a great start, but I'm not that familiar with Python. It handles most of the cases, but it doesn't drill down to the value level in all of them. With the example above, it would leave the first part as:

{'Key1': 'none',
 'Key2_Key2a': 1,
 'Key2_Key2b': 0,
 ...}

Obviously replacing the True/False with 1/0 is trivial. My question is more about how to flatten down to the additional level when the value of the key is not True or False.

TheOriginalBMan

2 Answers

2

I think this recursive function should do it:

def flatten(d, prefix=''):
    prefix = prefix + '_' if prefix else ''
    new = {}
    for k, v in d.items():
        if isinstance(v, bool):
            new[prefix + k] = int(v)
        elif isinstance(v, dict):
            new.update(flatten(v, prefix=prefix + k))
        elif isinstance(v, basestring):  # python3 -- str
            new[prefix + k + '_' + v] = 1
        else:
            raise TypeError('Unknown item type.')
    return new

The recursion happens when the value is a dict: the prefix passed down for its keys is the current key appended to the prefix from the previous level of nesting.

Of course, you can do better by using proper ABCs in the isinstance checks, e.g.:

bool -> numbers.Integral
dict -> collections.Mapping (collections.abc.Mapping on Python 3)
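As a rough sketch of what that ABC-based variant might look like on Python 3: the name `flatten_abc` is made up for illustration, and the string check uses `str` (which would be `basestring` on Python 2). Note that `numbers.Integral` also accepts plain ints, not just bools.

```python
import numbers
from collections.abc import Mapping  # Python 2.7: from collections import Mapping


def flatten_abc(d, prefix=''):
    """Flatten nested mappings into {joined_key: 0 or 1} (illustrative name)."""
    prefix = prefix + '_' if prefix else ''
    new = {}
    for k, v in d.items():
        if isinstance(v, Mapping):
            # Recurse, carrying the accumulated key path down as the prefix.
            new.update(flatten_abc(v, prefix=prefix + k))
        elif isinstance(v, numbers.Integral):
            # bool is a subclass of int, so True/False are handled here;
            # plain ints are accepted too and pass through int() unchanged.
            new[prefix + k] = int(v)
        elif isinstance(v, str):  # basestring on Python 2
            # Fold the string value into the key and flag it with 1.
            new[prefix + k + '_' + v] = 1
        else:
            raise TypeError('Unknown item type.')
    return new


sample = {'Key1': 'none',
          'Key2': {'Key2a': True, 'Key2b': False},
          'Key3': {'Key3a': {'Key3a1': 'some', 'Key3a2': 'many'}, 'Key3b': True}}
result = flatten_abc(sample)
```

With the sample dict from the question, `result` should contain `'Key1_none': 1` and `'Key3_Key3b': 1`, since every key carries the full path of its parents.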
mgilson
  • Thanks! I'm curious about the line that you've commented with `# python3 -- str`. I'm using Python 2.7. What were you referring to here? – TheOriginalBMan Jan 16 '15 at 21:04
  • On python2.x there are 2 distinct string types (`str` and `unicode`) that both inherit from `basestring` so that the `isinstance` check there will pass with either a `str` or a `unicode` object. On python3.x, the type `unicode` disappears (sort of ...) and _all_ strings act like python2.x's `unicode`. – mgilson Jan 16 '15 at 21:07
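If the same code has to run on both Python 2 and Python 3, one common pattern is to pick the string type once and reuse it in the `isinstance` check. This is only a sketch; `string_types` and `is_text` are names chosen here for illustration.

```python
try:
    string_types = basestring  # Python 2: base class of both str and unicode
except NameError:
    string_types = str         # Python 3: basestring does not exist


def is_text(value):
    """Return True if value is a text string on the running Python version."""
    return isinstance(value, string_types)
```

The check `isinstance(v, basestring)` in the answer could then be written as `is_text(v)` on either version.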
2

One more solution, based on the solution you linked to:

import collections

def flatten(d, parent_key='', sep='_'):
    items = []
    for k, v in d.items():
        new_key = parent_key + sep + k if parent_key else k
        if isinstance(v, collections.MutableMapping):
            items.extend(flatten(v, new_key, sep=sep).items())
        else:
            if isinstance(v, bool):
                items.append((new_key, int(v)))
            else:
                items.append((new_key, 1))
    return dict(items)
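Note that this version maps every non-boolean leaf to 1 under its key alone, so `'Key1': 'none'` comes out as `'Key1': 1` rather than `'Key1_none': 1`. A possible variant that folds the value into the key, as the question asks, is sketched below for Python 3. The name `flatten_with_values` is hypothetical, and converting the leaf with `str()` is an assumption about how non-string values should be handled.

```python
from collections.abc import MutableMapping  # Python 2.7: from collections import MutableMapping


def flatten_with_values(d, parent_key='', sep='_'):
    """Flatten nested dicts; non-bool leaves are appended to the key (sketch)."""
    items = []
    for k, v in d.items():
        new_key = parent_key + sep + k if parent_key else k
        if isinstance(v, MutableMapping):
            # Nested dict: recurse with the extended key path.
            items.extend(flatten_with_values(v, new_key, sep=sep).items())
        elif isinstance(v, bool):
            # Boolean leaf: keep the key and convert True/False to 1/0.
            items.append((new_key, int(v)))
        else:
            # Assumption: any other leaf is folded into the key via str().
            items.append((new_key + sep + str(v), 1))
    return dict(items)


example = flatten_with_values({'Key1': 'none',
                               'Key2': {'Key2a': True, 'Key2b': False}})
```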
Nikolai Golub