How do I filter a nested dictionary in Python based on the values of its keys?

d = {'data': {'country': 'US', 'city': 'New York', 'state': None},
     'tags': ['US', 'New York'],
     'type': 'country_info',
     'growth_rate': None
     }

I want to filter this dictionary to remove every key whose value is None, so that the resulting dict is:

d = {'data': {'country': 'US', 'city': 'New York'},
     'tags': ['US', 'New York'],
     'type': 'country_info',
     }

The dict can also have multiple levels of nesting, and I want to remove the None values at every level.

Gareth Latty
avenger

2 Answers

You can define this recursively pretty easily with a dict comprehension.

def remove_keys_with_none_values(item):
    if not hasattr(item, 'items'):
        return item
    else:
        return {key: remove_keys_with_none_values(value) for key, value in item.items() if value is not None}

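Recursion isn't particularly well optimised in Python (CPython has no tail-call elimination and a default recursion limit of 1000 frames), but given the relatively small number of nesting levels that is likely here, I wouldn't worry.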
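As a quick check, the function can be applied to the dictionary from the question. This is only a minimal sketch for Python 3; it repeats the definition so that it runs on its own, and the name `filtered` is just an illustrative variable.

```python
def remove_keys_with_none_values(item):
    # Anything without an .items() method is treated as a leaf and returned as-is.
    if not hasattr(item, 'items'):
        return item
    return {key: remove_keys_with_none_values(value)
            for key, value in item.items() if value is not None}

d = {'data': {'country': 'US', 'city': 'New York', 'state': None},
     'tags': ['US', 'New York'],
     'type': 'country_info',
     'growth_rate': None}

filtered = remove_keys_with_none_values(d)
print(filtered)
# {'data': {'country': 'US', 'city': 'New York'}, 'tags': ['US', 'New York'], 'type': 'country_info'}
```

Note that a new dict is built at every level, so the original `d` is left unchanged.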

Looking before we leap isn't very Pythonic, but I think it is a better option here than catching the exception, because the value will not be a dict most of the time (we are likely to have more leaves than branches).
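For comparison, the exception-catching alternative mentioned above might look roughly like the sketch below. The name `remove_none_eafp` is hypothetical, and this is not the version recommended in this answer; it only illustrates the trade-off, since every leaf value now pays for a raised and caught `AttributeError`.

```python
def remove_none_eafp(item):
    # "Easier to ask forgiveness than permission": try to use the value as a
    # mapping and fall back to returning it unchanged when that fails.
    try:
        pairs = item.items()
    except AttributeError:
        # Leaves (str, int, list, ...) have no .items() and end up here.
        return item
    # The try block is kept narrow so that errors raised deeper in the
    # recursion are not silently swallowed.
    return {key: remove_none_eafp(value)
            for key, value in pairs if value is not None}

print(remove_none_eafp({'a': None, 'b': {'c': None, 'd': 1}}))
# {'b': {'d': 1}}
```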

Also note that in Python 2.x, you probably want to swap in iteritems() for items().

Bruno Bronosky
Gareth Latty
  • I like your answer, and this may be a little nitpicky, but it reads better as `if value is not None`. – Brian Hicks May 04 '12 at 17:42
  • @BrianHicks Noted and fixed, very true. – Gareth Latty May 04 '12 at 17:43
  • I guess if we're trying to be more pythonic, you should either use `if hasattr(item, 'items')` or just assume it's being passed a dictionary. You can pass `[1, 2, None]` to the function as is and get back the same item with nones. – Brian Hicks May 04 '12 at 17:45
  • @BrianHicks I don't think the intention was to remove `None` from lists - I presumed this was purely to remove keys with values of `None`. `hasattr(item, 'items')` is probably a little better too, this is true. – Gareth Latty May 04 '12 at 17:51
  • @Lattyware: Why does the link 'dict comprehension' link to a YouTube video about list comprehension? Because it's close enough? – benregn Dec 11 '12 at 17:16
  • @benregn The video goes on to talk about other types of comprehensions and generator expressions. – Gareth Latty Dec 11 '12 at 17:30
  • How about a version that works in Python 2.6 (long story)? The last line has a syntax error in 2.6, which I assume was some change in how comprehensions worked in 2.7. – machomeautoguy Apr 20 '16 at 22:50
  • @machomeautoguy Dictionary comprehensions didn't exist before 2.7. You can replace it with a call to `dict()` on a generator expression that produces tuples, i.e. `dict((key, value) for x in y)` – Gareth Latty Apr 21 '16 at 06:57
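
Applied to the function from this answer, the `dict()` replacement described in the last comment could look roughly like this. It is a sketch only; the `_26` suffix on the name is made up here to distinguish it from the original, and the code runs on Python 3 as well.

```python
def remove_keys_with_none_values_26(item):
    if not hasattr(item, 'items'):
        return item
    # dict() consumes a generator of (key, value) tuples, which avoids the
    # dict-comprehension syntax that was only added in Python 2.7.
    return dict((key, remove_keys_with_none_values_26(value))
                for key, value in item.items() if value is not None)

print(remove_keys_with_none_values_26({'a': None, 'b': {'c': None, 'd': 1}}))
# {'b': {'d': 1}}
```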

I really appreciate the answer by @Lattyware. It helped me filter a nested object and remove empty values, whether the value is a dict, list, or str.

Here is what I came up with:

remove-keys-with-empty-values.py

# remove-keys-with-empty-values.py
from pprint import pprint

def remove_keys_with_empty_values(item):
  if hasattr(item, 'items'):
    return {key: remove_keys_with_empty_values(value) for key, value in item.items() if value==0 or value}
  elif isinstance(item, list):
    return [remove_keys_with_empty_values(value) for value in item if value==0 or value]
  else:
    return item

d = {
     'string': 'value',
     'integer': 10,
     'float': 0.5,
     'zero': 0,
     'empty_list': [],
     'empty_dict': {},
     'empty_string': '',
     'none': None,
    }

d['nested_dict'] = d.copy()
# list() keeps this a plain list on Python 3, where values() returns a live view
l = list(d.values())
d['nested_list'] = l

pprint({
  "DICT FILTERED": remove_keys_with_empty_values(d),
  "DICT ORIGINAL": d,
  "LIST FILTERED": remove_keys_with_empty_values(l),
  "LIST ORIGINAL": l,
})

execution

python remove-keys-with-empty-values.py
    {'DICT FILTERED': {'float': 0.5,
                       'integer': 10,
                       'nested_dict': {'float': 0.5,
                                       'integer': 10,
                                       'string': 'value',
                                       'zero': 0},
                       'nested_list': [0,
                                       'value',
                                       10,
                                       0.5,
                                       {'float': 0.5,
                                        'integer': 10,
                                        'string': 'value',
                                        'zero': 0}],
                       'string': 'value',
                       'zero': 0},
     'DICT ORIGINAL': {'empty_dict': {},
                       'empty_list': [],
                       'empty_string': '',
                       'float': 0.5,
                       'integer': 10,
                       'nested_dict': {'empty_dict': {},
                                       'empty_list': [],
                                       'empty_string': '',
                                       'float': 0.5,
                                       'integer': 10,
                                       'none': None,
                                       'string': 'value',
                                       'zero': 0},
                       'nested_list': [{},
                                       0,
                                       'value',
                                       None,
                                       [],
                                       10,
                                       0.5,
                                       '',
                                       {'empty_dict': {},
                                        'empty_list': [],
                                        'empty_string': '',
                                        'float': 0.5,
                                        'integer': 10,
                                        'none': None,
                                        'string': 'value',
                                        'zero': 0}],
                       'none': None,
                       'string': 'value',
                       'zero': 0},
     'LIST FILTERED': [0,
                       'value',
                       10,
                       0.5,
                       {'float': 0.5,
                        'integer': 10,
                        'string': 'value',
                        'zero': 0}],
     'LIST ORIGINAL': [{},
                       0,
                       'value',
                       None,
                       [],
                       10,
                       0.5,
                       '',
                       {'empty_dict': {},
                        'empty_list': [],
                        'empty_string': '',
                        'float': 0.5,
                        'integer': 10,
                        'none': None,
                        'string': 'value',
                        'zero': 0}]}
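
One caveat with the function above: the emptiness test is applied to each value before it is recursed into, so a container that only becomes empty after filtering is kept. For example, `{'a': {'b': None}}` comes back as `{'a': {}}`. If that matters, a possible variant is to recurse first and test afterwards. The sketch below is only one way to do it; `prune_empty` and its helper `keep` are hypothetical names, not part of the answer's code.

```python
def prune_empty(item):
    """Recurse first, then drop whatever ended up empty."""
    def keep(value):
        # Same rule as above: keep zeros (and False, since False == 0),
        # drop None, '', [] and {}.
        return value == 0 or bool(value)

    if hasattr(item, 'items'):
        cleaned = ((key, prune_empty(value)) for key, value in item.items())
        return {key: value for key, value in cleaned if keep(value)}
    elif isinstance(item, list):
        cleaned = (prune_empty(value) for value in item)
        return [value for value in cleaned if keep(value)]
    return item

print(prune_empty({'a': {'b': None}, 'c': [[], {'d': ''}], 'e': 0}))
# {'e': 0}
```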
Bruno Bronosky