5

I'm using the Kendo UI Grid in one of my projects. I retrieved a piece of data using its API and found that it added some "unwanted" data to my JSON/dictionary. After passing the JSON back to my Pyramid backend, I need to remove these keys. The problem is that the dictionary can be of arbitrary depth, and I don't know the depth in advance.

Example:

product = {
    id: "PR_12",
    name: "Blue shirt",
    description: "Flowery shirt for boys above 2 years old",
    _event: {<some unwanted data here>},
    length: <some unwanted data>,
    items: [{_event: {<some rubbish data>}, length: <more rubbish>, price: 23.30, quantity: 34, color: "Red", size: "Large"}, {_event: {<some more rubbish data>}, length: <even more rubbish>, price: 34.50, quantity: 20, color: "Blue", size: "Large"} ....]
}

I want to remove two keys in particular: "_event" and "length". I tried writing a recursive function to remove the data, but I can't seem to get it right. Can someone please help?

Here's what I have:

def remove_specific_key(the_dict, rubbish):
  for key in the_dict:
    if key == rubbish:
      the_dict.pop(key)
    else:
      # check for rubbish in sub dict
      if isinstance(the_dict[key], dict):
        remove_specific_key(the_dict[key], rubbish)

      # check for existence of rubbish in lists
      elif isinstance(the_dict[key], list):
        for item in the_dict[key]:
          if item == rubbish:
            the_dict[key].remove(item)
  return the_dict
Mark
  • You have to deepcopy the list at each iteration, otherwise you'll be modifying the object on which you iterate, which will eventually result in very unexpected results. – luke14free Apr 16 '12 at 17:54
  • You might find the answers to [deleting items from a dictionary while iterating over it](http://stackoverflow.com/questions/5384914/deleting-items-from-a-dictionary-while-iterating-over-it) helpful. – James Apr 16 '12 at 17:56
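The failure mode both comments describe can be reproduced with a minimal sketch. It assumes Python 3, where removing a key while looping over the same dict raises `RuntimeError`. The helper names `pop_while_iterating` and `pop_over_snapshot` are made up for illustration. For this particular loop a shallow snapshot of the keys appears to be enough, so a deep copy should not be needed.

```python
def pop_while_iterating(d, rubbish):
    # Mirrors the loop in the question: mutates the dict being iterated.
    for key in d:
        if key == rubbish:
            d.pop(key)
    return d


def pop_over_snapshot(d, rubbish):
    # list(d) copies the keys first, so the loop no longer depends on d's size.
    for key in list(d):
        if key == rubbish:
            d.pop(key)
    return d


try:
    pop_while_iterating({"_event": 1, "name": "Blue shirt"}, "_event")
except RuntimeError as exc:
    print("RuntimeError:", exc)

print(pop_over_snapshot({"_event": 1, "name": "Blue shirt"}, "_event"))
```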

3 Answers

8

If you allow remove_specific_key (renamed remove_keys) to accept any object as its first argument, then you can simplify the code:

def remove_keys(obj, rubbish):
    if isinstance(obj, dict):
        obj = {
            key: remove_keys(value, rubbish) 
            for key, value in obj.iteritems()
            if key not in rubbish}
    elif isinstance(obj, list):
        obj = [remove_keys(item, rubbish)
                  for item in obj
                  if item not in rubbish]
    return obj

Since you wish to remove more than one key, you might as well let rubbish be a set instead of one particular key. With the above code, you'd remove '_event' and 'length' keys with

product = remove_keys(product, set(['_event', 'length']))

Edit: remove_keys uses a dict comprehension, introduced in Python 2.7. For older versions of Python, the equivalent would be

    obj = dict((key, remove_keys(value, rubbish))
               for key, value in obj.iteritems()
               if key not in rubbish)
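For readers on Python 3, where `iteritems()` no longer exists, the same idea might look like the sketch below. It is an adaptation, not the code as originally posted. One thing to watch when `rubbish` is a set: testing a dict or list for membership in a set raises `TypeError` because those types are unhashable, so the hypothetical helper `is_rubbish` treats them as "not rubbish". The sample `product` values are made up.

```python
def is_rubbish(item, rubbish):
    # Dicts and lists are unhashable, so `item in some_set` would raise
    # TypeError for them; such items can never equal a key name anyway.
    try:
        return item in rubbish
    except TypeError:
        return False


def remove_keys(obj, rubbish):
    # Returns a cleaned copy; the input object is left untouched.
    if isinstance(obj, dict):
        return {key: remove_keys(value, rubbish)
                for key, value in obj.items()
                if key not in rubbish}
    if isinstance(obj, list):
        return [remove_keys(item, rubbish)
                for item in obj
                if not is_rubbish(item, rubbish)]
    return obj


# Made-up data shaped like the question's example.
product = {
    "id": "PR_12",
    "_event": {"x": 1},
    "length": 2,
    "items": [{"_event": {}, "length": 1, "price": 23.30}, "length"],
}
cleaned = remove_keys(product, {"_event", "length"})
print(cleaned)
```

Because the function builds new dicts and lists rather than deleting in place, nothing is ever mutated while it is being iterated.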
unutbu
  • I am getting an error pointing at the first if statement at the "for key, value" part – Mark Apr 17 '12 at 02:59
  • 1
    @Mark, perhaps you are using an older version of Python? I've edited the post to show how to form `obj` without the "dict comprehension". If that's not the problem, please post the full error message. – unutbu Apr 17 '12 at 10:09
5

Modifying a dict as you iterate over it is bad, and unnecessary, since you know exactly which key you are looking for. Also, your lists of dicts aren't being handled correctly:

def remove_specific_key(the_dict, rubbish):
    if rubbish in the_dict:
        del the_dict[rubbish]
    for key, value in the_dict.items():
        # check for rubbish in sub dict
        if isinstance(value, dict):
            remove_specific_key(value, rubbish)

        # check for existence of rubbish in lists
        elif isinstance(value, list):
            for item in value:
                if isinstance(item, dict):
                    remove_specific_key(item, rubbish)
Ned Batchelder
  • Should probably use `collections.MutableMapping` and `collections.Container`. – Katriel Apr 16 '12 at 17:58
  • I'm taking him at his word that his data is truly lists and dicts. – Ned Batchelder Apr 16 '12 at 17:59
  • In python 3.x, I think you would have to use `list(the_dict.items())` or similar. – James Apr 16 '12 at 18:00
  • @James: in python 3.x, d.items() is iterable, no need to modify the code here at all. – Ned Batchelder Apr 16 '12 at 18:09
  • If there is a subdict buried inside a list of lists, then this code will not remove rubbish. – unutbu Apr 16 '12 at 18:34
  • @NedBatchelder: sorry, I should have explained what I meant. In Python 3, `items` has roughly the same behaviour as `iteritems` does in Python 2. The Python 3 equivalent of Python 2's `the_dict.items()` is `list(the_dict.items())`. This creates a copy of the dictionary's contents, so you don't end up with a `RuntimeError: dictionary changed size during iteration`. – James Apr 17 '12 at 11:17
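The two concerns raised in the comments (sub-dicts hidden inside lists of lists, and Python 3's `items()` view) can both be addressed in a short in-place sketch. It is not the answer's code: `remove_key_in_place` is a hypothetical name, the data is made up, and it assumes the input contains only dicts, lists, and scalar values with no cycles. Because the key is removed before the loop starts, no `list(...)` snapshot should be needed, even on Python 3.

```python
def remove_key_in_place(obj, rubbish):
    # Mutates obj and returns None.
    if isinstance(obj, dict):
        # Delete before looping, so the dict's size is stable while iterating.
        obj.pop(rubbish, None)
        for value in obj.values():
            remove_key_in_place(value, rubbish)
    elif isinstance(obj, list):
        # Lists are only traversed, never resized, so plain iteration is safe.
        for item in obj:
            remove_key_in_place(item, rubbish)


# Made-up data with a dict buried inside a list of lists.
data = {"length": 3, "rows": [[{"_event": {}, "price": 34.50}], {"length": 1}]}
for key in ("_event", "length"):
    remove_key_in_place(data, key)
print(data)
```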
1

You cannot delete items from a dict or list while iterating over it, so replace the iteration with a membership test.

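def remove_specific_key(the_dict, rubbish):
    # Remove the key at this level first, so the loop below never
    # changes the size of the dict it is iterating over.
    if rubbish in the_dict:
        the_dict.pop(rubbish)
    # Always descend, even when the key was found at this level.
    for key in the_dict:
        if isinstance(the_dict[key], dict):
            remove_specific_key(the_dict[key], rubbish)
        elif isinstance(the_dict[key], list):
            # Removes only the first occurrence of rubbish in the list.
            if the_dict[key].count(rubbish):
                the_dict[key].remove(rubbish)
    return the_dict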


d = {"a": {"aa": "foobar"}}
remove_specific_key(d, "aa")
print d

d = {"a": ["aa", "foobar"]}
remove_specific_key(d, "aa")
print d
tuoxie007