I am using Python 2.7 with Windows 7.

I have a dictionary and would like to remove values that correspond to (key, value) pairs from another dictionary.

For example, I have a dictionary t_dict. For each key that also appears in the dictionary values_to_remove, I would like to remove the values listed there, so that I end up with the dictionary final_dict:

t_dict = {
    'a': ['zoo', 'foo', 'bar'],
    'c': ['zoo', 'foo', 'yum'],
    'b': ['tee', 'dol', 'bar']
}

values_to_remove = {
    'a': ['zoo'],
    'b': ['dol', 'bar']
}

# remove values here

print final_dict
{
    'a': ['foo', 'bar'],
    'c': ['zoo', 'foo', 'yum'],
    'b': ['tee']
}

I have looked at similar pages on SO and the Python dictionary docs but cannot find anything that solves this specific problem:

https://docs.python.org/2/library/stdtypes.html#dict

How to remove dictionaries with duplicate values from a nested dictionary

How to remove a key from a python dictionary?

EDIT

There cannot be duplicate values within any single key's list in t_dict. For example, there will never be:

t_dict['a'] = ['zoo','zoo','foo','bar']

BeeGee

4 Answers

5

Try this,

for k, v in t_dict.items():
    for item in values_to_remove.get(k, ()):
        v.remove(item) 

# Output
{'a': ['foo', 'bar'], 'c': ['zoo', 'foo', 'yum'], 'b': ['tee']}
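The loop above edits t_dict in place, and list.remove raises ValueError if a listed value is not actually present. As a hedged alternative, here is a minimal sketch that builds a separate final_dict and silently ignores values that are missing. The helper name remove_values is hypothetical, not something from the question, and the code is written to run on both Python 2.7 and Python 3.

```python
def remove_values(source, to_remove):
    """Return a new dict of lists; neither input dict is modified.

    remove_values is a hypothetical helper name used for illustration.
    """
    result = {}
    for key, values in source.items():
        # A set gives constant-time membership tests on average.
        unwanted = set(to_remove.get(key, ()))
        # The comprehension keeps surviving values in their original order.
        result[key] = [value for value in values if value not in unwanted]
    return result


t_dict = {
    'a': ['zoo', 'foo', 'bar'],
    'c': ['zoo', 'foo', 'yum'],
    'b': ['tee', 'dol', 'bar'],
}

values_to_remove = {
    'a': ['zoo'],
    'b': ['dol', 'bar'],
}

final_dict = remove_values(t_dict, values_to_remove)
```

Because the result is a new dict, t_dict still holds the original lists afterwards, which may or may not be what you want.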
SparkAndShine
  • Shouldn't you loop inside the `if` to remove all of the values associated with `k` in `values_to_remove`? Or to simplify the code by combining check and iteration, `for k, v in t_dict.items():`, `for toremove in values_to_remove.get(k, ()):`, `v.remove(toremove)` (which also avoids needlessly looking up `t_dict[k]` a second time; you already got it from the `items` iteration). – ShadowRanger May 11 '16 at 17:32
3

Since duplicates aren't possible, it might make sense to store the values as a set, not a list. If you can use a set for t_dict, the removal process is both faster and simpler to write (even faster if values_to_remove uses set or frozenset too):

for k, toremove in values_to_remove.viewitems():
    t_dict.get(k, set()).difference_update(toremove)

Use the above if values_to_remove is expected to be small. If t_dict is the smaller of the two, you could switch to the following, which also avoids the temporary set()s (the empty tuple is a singleton, so it costs nothing to use it with dict.get):

for k, v in t_dict.viewitems():
    v.difference_update(values_to_remove.get(k, ()))

A final option is the overly clever approach that removes the need for .get entirely by processing only the keys that appear in both dicts. Using -= requires both dicts to use set for their values, which keeps the code shorter and faster; you could go back to difference_update if you want to allow non-set values in values_to_remove:

for k in (t_dict.viewkeys() & values_to_remove.viewkeys()):
    t_dict[k] -= values_to_remove[k]
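The snippets above use viewitems and viewkeys, which exist only on Python 2.7; on Python 3 the plain items() and keys() methods return the equivalent views. As a rough sketch of the key-intersection idea that should run unchanged on both versions, the key sets can be built explicitly with set(). The extra key 'x' is an assumption added here only to show that keys missing from t_dict are skipped.

```python
t_dict = {
    'a': {'zoo', 'foo', 'bar'},
    'c': {'zoo', 'foo', 'yum'},
    'b': {'tee', 'dol', 'bar'},
}

values_to_remove = {
    'a': {'zoo'},
    'b': {'dol', 'bar'},
    'x': {'unused'},  # hypothetical key that is absent from t_dict
}

# set(d) yields the keys of d on both Python 2.7 and Python 3,
# so the intersection contains only keys present in both dicts.
for key in set(t_dict) & set(values_to_remove):
    # In-place set difference; both values must be sets for -= to work.
    t_dict[key] -= values_to_remove[key]
```

Building two throwaway key sets costs a little more than intersecting the key views directly, so this trades a small amount of speed for portability.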
ShadowRanger
  • Use `OrderedSet` if the order is important? – SparkAndShine May 11 '16 at 17:55
  • @sparkandshine: `OrderedSet` isn't part of the Python standard library (it only has `collections.OrderedDict` which can be used to emulate most `set` behaviors _except_ the set-specific operation I'm using here), but yeah, `OrderedSet` is an option if you're okay with the additional dependency. – ShadowRanger May 11 '16 at 21:42
2
for key, values in values_to_remove.items():
    for value in values:
        if key in t_dict and value in t_dict[key]:
            # list.remove drops the first match, same as pop(index(value))
            t_dict[key].remove(value)
J.J
  • Does not meet the requirements. You don't throw out the whole list, you pop the matching elements. – Two-Bit Alchemist May 11 '16 at 17:22
  • There weren't lists in the original question. Thanks for the downvote though – J.J May 11 '16 at 17:25
  • That would be a great excuse if StackOverflow didn't keep an edit history where I could easily see that the only thing that's been edited is the question title. – Two-Bit Alchemist May 11 '16 at 17:26
  • I meant first *question* asked in the OP, not the first *version* of the OP. – J.J May 11 '16 at 17:35
  • @Two-BitAlchemist: Not necessarily. Edits made before answers/comments are posted or before votes occur usually don't show up in the history. So you could load a question, compose answer, and while composing, question is edited "silently". No idea if that's the case here, but it could be. – ShadowRanger May 11 '16 at 17:36
  • @ShadowRanger Not that the most recent edit of the comment before yours doesn't make it pretty clear this answerer just didn't read the whole OP, but I've been here since 55 seconds after this was asked. There have always been lists. – Two-Bit Alchemist May 11 '16 at 17:44
1

If you don't want duplicate elements in your dict and the order isn't important either, why not use a set for each dict value?

t_dict = {
    'a': set(['zoo', 'foo', 'bar']),
    'c': set(['zoo', 'foo', 'yum']),
    'b': set(['tee', 'dol', 'bar'])
}

values_to_remove = {
    'a': set(['zoo']),
    'b': set(['dol', 'bar'])
}

for k,v in values_to_remove.iteritems():
    t_dict[k] = t_dict[k]-v

print t_dict

>>>{'a': set(['foo', 'bar']), 'c': set(['foo', 'yum', 'zoo']), 'b': set(['tee'])}
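If the data already exists as lists, as in the question, it can be converted to sets first. The loop above also raises KeyError when values_to_remove contains a key that t_dict lacks. Here is a minimal sketch of both points, written to run on Python 2.7 and Python 3; the name as_sets and the extra key 'd' are assumptions made for illustration.

```python
t_dict = {
    'a': ['zoo', 'foo', 'bar'],
    'c': ['zoo', 'foo', 'yum'],
    'b': ['tee', 'dol', 'bar'],
}

values_to_remove = {
    'a': ['zoo'],
    'b': ['dol', 'bar'],
    'd': ['nope'],  # hypothetical key that is absent from t_dict
}

# Build a set-valued copy; the original lists in t_dict are left untouched.
as_sets = {key: set(values) for key, values in t_dict.items()}

for key, values in values_to_remove.items():
    if key in as_sets:
        # difference_update accepts any iterable, so lists work here too.
        as_sets[key].difference_update(values)
```

Note that the conversion discards the original list order, which is the trade-off this answer already accepts.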

If order is important to you, you can also use OrderedSet, as @sparkandshine suggested in a comment: http://orderedset.readthedocs.io/en/latest/

from ordered_set import OrderedSet
t_dict = {
    'a': OrderedSet(['zoo', 'foo', 'bar']),
    'c': OrderedSet(['zoo', 'foo', 'yum']),
    'b': OrderedSet(['tee', 'dol', 'bar'])
}

values_to_remove = {
    'a': OrderedSet(['zoo']),
    'b': OrderedSet(['dol', 'bar'])
}

for k,v in values_to_remove.iteritems():
    t_dict[k] = t_dict[k]-v

print t_dict

>>>{'a': OrderedSet(['foo', 'bar']), 'c': OrderedSet(['zoo', 'foo', 'yum']), 'b': OrderedSet(['tee'])}
xirururu