Removing multiple keys from a dictionary safely

Question

I know how to safely remove an entry, 'key', from my dictionary d:

if d.has_key('key'):
    del d['key']

However, I need to remove multiple entries from a dictionary safely. I was thinking of defining the entries in a tuple as I will need to do this more than once.

entities_to_remove = ('a', 'b', 'c')
for x in entities_to_remove:
    if x in d:
        del d[x]

However, I was wondering if there is a smarter way to do this?

Retrieval time from a dictionary is nearly O(1) because of hashing. Unless you are removing a significant proportion of the entries, I don't think you will do much better. — ncmathsadist, Jan 24 '12 at 23:22
The answer of @mattbornski seems more canonical, and also more succinct. — 0 _, Jul 10 '15 at 09:11
StackOverflow hath spoken: `key in d` is more Pythonic than `d.has_key(key)` https://stackoverflow.com/questions/1323410/has-key-or-in — Michael Scheper, Jul 26 '17 at 22:11
If you can spare a bit of memory, you can do `for x in set(d) & entities_to_remove: del d[x]`. This will probably only be more efficient if `entities_to_remove` is "large". — DylanYoung, Apr 15 '20 at 15:32

score 353 · Answer 1 · edited Aug 27 '20 at 14:33


Using dict.pop:

d = {'some': 'data'}
entries_to_remove = ('any', 'iterable')
for k in entries_to_remove:
    d.pop(k, None)
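Since the question mentions doing this more than once, the loop can be wrapped in a small helper. This is only a minimal sketch; the name `remove_keys` is hypothetical and not part of the standard library.

```python
def remove_keys(d, keys):
    """Remove each key in `keys` from `d` in place, ignoring absent keys."""
    for k in keys:
        # The second argument is the default returned when the key is absent,
        # which is what prevents a KeyError.
        d.pop(k, None)


d = {'a': 1, 'b': 2, 'z': 26}
remove_keys(d, ('a', 'b', 'c'))  # 'c' is not in d, and that is fine
print(d)  # {'z': 26}
```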

edited Aug 27 '20 at 14:33

Boris Verkhovskiy


answered Jan 24 '12 at 23:22

mattbornski


59

**This.** This is the clever Pythonista's choice. `dict.pop()` eliminates the need for key existence testing. _Excellent._ – Cecil Curry Mar 11 '16 at 01:51
8

For what it's worth, I think `.pop()` is bad and unpythonic, and would prefer the accepted answer over this one. – Arne Mar 14 '18 at 16:03
8

A staggering number of people appear unbothered by this :) I don't mind the extra line for existence checking personally, and it's significantly more readable unless you already know about pop(). On the other hand if you were trying to do this in a comprehension or inline lambda this trick could be a big help. I'd also say that it's important, in my opinion, to meet people where they are. I'm not sure that "bad and unpythonic" is going to give the people who are reading these answers the practical guidance they are looking for. – mattbornski Mar 14 '18 at 17:14
11

There is a particularly *good* reason to use this. While adding an extra line may improve "readability" or "clarity", it also adds an extra lookup to the dictionary. This method is the removal equivalent of doing `setdefault`. If implemented correctly (and I'm sure it is), it only does one lookup into the hash-map that is the `dict`, instead of two. – Mad Physicist Jun 08 '18 at 21:22
I'm not very sure, but `k in d` + `del d[k]` may still be faster than `d.pop(k)` because `pop` may involve a function call, while `in` and `del` somehow may not. BTW, in my experience, `k in d` + `d[k]` in CPython is usually faster than `d.get(k)`. There are some articles/questions talking about this. – Ian Lin Aug 26 '19 at 02:54
4

Personally I would be concerned with correctness and maintainability first, and speed only if it is proven to be insufficiently fast. The speed difference between these operations is going to be trivial when zoomed out to the application level. It may be the case that one is faster, but I expect that in real world usage you will neither notice nor care, and if you do notice and care, you will be better served rewriting in something more performant than Python. – mattbornski Aug 26 '19 at 21:01
3

This is not "bad and unpythonic". `pop` is very common and very pythonic. Alternately, you could just try/except, but that's what is already built into pop. Carry on. – Jordan Oct 04 '19 at 19:47
3

Reminder - if you're testing this with just one key, you need to write it as a one-element tuple: `entries_to_remove = ('any',)` (with that extra ',' at the end). If you forget the comma, it is just the string 'any', and the loop will instead test each letter. – JohnFlux Jun 17 '20 at 01:14
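The single-lookup point raised in the comments above can be checked by counting hash calls. The sketch below assumes CPython and uses a hypothetical `CountingKey` class written only for this demonstration. Built-in strings cache their hash, so the practical difference for string keys is likely smaller than this suggests.

```python
class CountingKey:
    """Hypothetical key type that counts how often it is hashed."""
    hash_calls = 0

    def __init__(self, name):
        self.name = name

    def __hash__(self):
        CountingKey.hash_calls += 1
        return hash(self.name)

    def __eq__(self, other):
        return isinstance(other, CountingKey) and self.name == other.name


def count_hashes(remove):
    """Build a one-entry dict, run remove(d, key), return the hash calls it made."""
    key = CountingKey('a')
    d = {key: 1}
    CountingKey.hash_calls = 0  # ignore the hash performed by the insertion
    remove(d, key)
    return CountingKey.hash_calls


def check_then_del(d, key):
    if key in d:      # first lookup
        del d[key]    # second lookup


def pop_with_default(d, key):
    d.pop(key, None)  # one lookup


hashes_check_del = count_hashes(check_then_del)
hashes_pop = count_hashes(pop_with_default)
print(hashes_check_del, hashes_pop)  # 2 1 on CPython
```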

score 123 · Answer 2 · edited Jun 15 '23 at 20:02


Using Dict Comprehensions

final_dict = {key: value for key, value in d.items() if key not in [key1, key2]}

where key1 and key2 are to be removed.

In the example below, the keys to remove, "b" and "c", are kept in the list keys.

>>> a
{'a': 1, 'c': 3, 'b': 2, 'd': 4}
>>> keys = ["b", "c"]
>>> print({key: a[key] for key in a if key not in keys})
{'a': 1, 'd': 4}
>>>

edited Jun 15 '23 at 20:02

Frank


answered Jan 24 '12 at 23:25

shadyabhi


4

new dictionary? list comprehension? You should adjust the answer to the person asking the question ;) – Glaslos Jan 24 '12 at 23:34
it creates a new dictionary but it was a one liner, so I mentioned it. – shadyabhi Jan 24 '12 at 23:37
7

This solution has a serious performance hit when the variable holding the dict has further use in the program. In other words, a dict from which keys have been deleted is much more efficient than a newly created dict with the retained items. – Apalala Mar 29 '13 at 18:08
2

@shadyabhi. Beautiful, very Pythonic! One often forgets that optimization and side effects are evil.... – Frederic Bazin Jul 26 '15 at 16:22
22

for the sake of readability, I suggest {k:v for k,v in t.items() if k not in [key1, key2]} – Frederic Bazin Jul 26 '15 at 16:23
9

This also has performance issues when the list of keys is too big, as searches take `O(n)`. The whole operation is `O(mn)`, where `m` is the number of keys in the dict and `n` the number of keys in the list. I suggest using a set `{key1, key2}` instead, if possible. – ldavid Oct 04 '15 at 01:17
4

To Apalala: can you help me understand why there's a performance hit? – Sean Oct 26 '16 at 05:31
2

@Sean The program needs to allocate memory for a second dictionary, iterate through every key of the first dictionary and do a full-value compare (as opposed to a hash compare), check whether it's in the 'keys' list by iterating through that list on every pass of the comprehension, then copy to the second dictionary if not. By deleting, all that happens is that the first dictionary looks up the key's hash value and removes it. Then the garbage collector takes care of the rest. Therefore this is much faster. – navigator_ Apr 06 '17 at 18:06
Removal is better than reconstructing the entire dict. The complexity is also of order N. Not really a Pythonic way of doing it. – Aditya Apr 12 '19 at 20:34
1

There's an error in the first example. It needs to read: ``for key, value in d.items()``. – jcoffland Jan 14 '23 at 14:14
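Following the suggestion in the comments to use a set instead of a list, a hedged sketch of the same comprehension might look like this. The variable names are illustrative only.

```python
d = {'a': 1, 'b': 2, 'c': 3, 'd': 4}
keys_to_remove = {'b', 'c', 'x'}  # a set, so each membership test is O(1) on average

# Builds a new dict; `d` itself is left untouched.
filtered = {k: v for k, v in d.items() if k not in keys_to_remove}

print(filtered)  # {'a': 1, 'd': 4}
print(d)         # {'a': 1, 'b': 2, 'c': 3, 'd': 4}
```

With a set, the whole operation is roughly proportional to the size of the dict rather than to the product of the dict size and the number of keys to remove.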

Glaslos · Accepted Answer · 2018-01-16T08:22:04.987

77

Why not like this:

entries = ('a', 'b', 'c')
the_dict = {'b': 'foo'}

def entries_to_remove(entries, the_dict):
    for key in entries:
        if key in the_dict:
            del the_dict[key]

A more compact version was provided by mattbornski using dict.pop()

edited Jan 16 '18 at 08:22

answered Jan 24 '12 at 23:20

Glaslos


23

Adding this for people coming from a search engine. If keys are known (when safety is not an issue), multiple keys can be deleted in one line like this `del dict['key1'], dict['key2'], dict['key3']` – Tirtha R May 03 '18 at 23:24
2

Depending on the number of keys you're deleting, it might be more efficient to use `for key in set(the_dict) & entries:` and bypass the `key in dict` test. – DylanYoung Apr 15 '20 at 15:30
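A sketch of the intersection idea from the comment above. One caveat: `set(the_dict) & entries` raises a TypeError when `entries` is a tuple, because the `&` operator on a set requires another set. The keys view does not have that restriction, so this sketch uses `the_dict.keys()` instead.

```python
the_dict = {'a': 1, 'b': 2, 'z': 26}
entries = ('a', 'b', 'c')  # a tuple, as in the question

# the_dict.keys() is a set-like view; `&` with any iterable returns a new set
# holding only the keys that are actually present.
present = the_dict.keys() & entries

# `present` is an independent set, so deleting while looping over it is safe.
for key in present:
    del the_dict[key]

print(the_dict)  # {'z': 26}
```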

Jose Ricardo Bustos M. · Answer 4 · 2016-11-28T20:26:11.507

23

A solution using the map and filter functions:

python 2

d={"a":1,"b":2,"c":3}
l=("a","b","d")
map(d.__delitem__, filter(d.__contains__,l))
print(d)

python 3

d={"a":1,"b":2,"c":3}
l=("a","b","d")
list(map(d.__delitem__, filter(d.__contains__,l)))
print(d)

you get:

{'c': 3}

edited Nov 28 '16 at 20:26

answered May 20 '15 at 13:35

Jose Ricardo Bustos M.


This doesn't work for me with python 3.4: `>>> d={"a":1,"b":2,"c":3} >>> l=("a","b","d") >>> map(d.__delitem__, filter(d.__contains__,l)) >>> print(d) {'a': 1, 'b': 2, 'c': 3}` – Risadinha Jun 14 '15 at 12:27
@Risadinha `list(map(d.__delitem__,filter(d.__contains__,l)))` .... in Python 3.4 the map function returns an iterator – Jose Ricardo Bustos M. Jun 15 '15 at 00:28
4

or `deque(map(...), maxlen=0)` to avoid building a list of None values; first import with `from collections import deque` – Jason Jun 06 '16 at 02:14
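The `deque` suggestion from the last comment, written out in full as a sketch that reuses the names from the answer:

```python
from collections import deque

d = {"a": 1, "b": 2, "c": 3}
l = ("a", "b", "d")

# map() is lazy in Python 3; a deque with maxlen=0 consumes it fully
# while discarding every result, so no list of None values is built.
deque(map(d.__delitem__, filter(d.__contains__, l)), maxlen=0)

print(d)  # {'c': 3}
```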

score 21 · Answer 5 · edited Aug 27 '20 at 14:29


If you also need to retrieve the values for the keys you are removing, this would be a pretty good way to do it:

values_removed = [d.pop(k, None) for k in entities_to_remove]

You could of course still do this just for the removal of the keys from d, but you would be unnecessarily creating the list of values with the list comprehension. It is also a little unclear to use a list comprehension just for the function's side effect.

edited Aug 27 '20 at 14:29

Boris Verkhovskiy


answered Jan 24 '12 at 23:27

Andrew Clark


5

Or if you wanted to keep the deleted entries *as a dictionary:* `valuesRemoved = dict((k, d.pop(k, None)) for k in entitiesToRemove)` and so on. – kindall Jan 25 '12 at 00:07
You can leave out the assignment to a variable. Either way, it's the shortest and most Pythonic solution and should be marked as the correct answer IMHO. – Johann Hagerer Nov 16 '15 at 09:31
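The dictionary variant from the comments stores `None` for keys that were never present. If only the entries that actually existed should be kept, one possible variation (a sketch, not taken from the answer) is:

```python
d = {'a': 1, 'b': 2, 'z': 26}
entities_to_remove = ('a', 'b', 'c')

# Pop only the keys that exist, keeping each key together with its value.
removed = {k: d.pop(k) for k in entities_to_remove if k in d}

print(removed)  # {'a': 1, 'b': 2}
print(d)        # {'z': 26}
```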

score 18 · Answer 6 · edited Feb 07 '23 at 17:43


Found a solution with pop and map

d = {'a': 'valueA', 'b': 'valueB', 'c': 'valueC', 'd': 'valueD'}
keys = ['a', 'b', 'c']
list(map(d.pop, keys))
print(d)

The output of this:

{'d': 'valueD'}

I am answering this late because I think it may help anyone who searches for the same thing in the future.

Update

The above code will throw an error if a key does not exist in the dict.

DICTIONARY = {'a': 'valueA', 'b': 'valueB', 'c': 'valueC', 'd': 'valueD'}
keys = ['a', 'l', 'c']

def remove_key(key):
    DICTIONARY.pop(key, None)
    

list(map(remove_key, keys))
print(DICTIONARY)

output:

{'b': 'valueB', 'd': 'valueD'}

edited Feb 07 '23 at 17:43

Cornelius Roemer


answered Jan 24 '19 at 06:07

Shubham Srivastava


1

This answer will throw an exception if any key in `keys` does not exist in `d` - you would have to filter that first. – ingofreyer Feb 28 '19 at 07:53
@ingofreyer updated the code for exception handling. Thanks for finding this issue. I think now it will work. :) – Shubham Srivastava Mar 07 '19 at 06:40
Thanks, this should help everyone finding this answer :-) – ingofreyer Mar 07 '19 at 06:47
Creating a list as a by-product of using map, makes this quite slow, it's actually better to loop over it. – Charlie Clark May 28 '20 at 16:52

Erik Aronesty · Answer 7 · 2020-06-08T11:52:07.153

Some timing tests for CPython 3 show that a simple for loop is the fastest way, and it's quite readable. Adding in a function doesn't cause much overhead either:

timeit results (10k iterations):

all(x.pop(v) for v in r) # 0.85
all(map(x.pop, r)) # 0.60
list(map(x.pop, r)) # 0.70
all(map(x.__delitem__, r)) # 0.44
del_all(x, r) # 0.40
<inline for loop>(x, r) # 0.35

def del_all(mapping, to_remove):
    """Remove list of elements from mapping."""
    for key in to_remove:
        del mapping[key]

For small iterations, doing that 'inline' was a bit faster, because of the overhead of the function call. But del_all is lint-safe, reusable, and faster than all the python comprehension and mapping constructs.

Good function. Just make sure to handle the exception if a key was not found. — mika, Dec 12 '22 at 14:26
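For anyone who wants to repeat such a comparison with missing keys handled, here is a hedged `timeit` sketch. The test data and function names are made up for illustration, and the numbers will vary by machine. It uses plain loops because `all(...)` stops at the first falsy result, and `__delitem__` returns `None`, so a line such as `all(map(x.__delitem__, r))` stops after the first deletion.

```python
import timeit

KEYS = [str(i) for i in range(100)]   # hypothetical test data
REMOVE = KEYS[:50] + ['missing']      # includes one absent key


def with_del():
    d = dict.fromkeys(KEYS)
    for k in REMOVE:
        if k in d:
            del d[k]
    return d


def with_pop():
    d = dict.fromkeys(KEYS)
    for k in REMOVE:
        d.pop(k, None)
    return d


# Both timings include the cost of rebuilding the dict on every run.
for fn in (with_del, with_pop):
    seconds = timeit.timeit(fn, number=10_000)
    print(f"{fn.__name__}: {seconds:.3f}s")
```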

score 7 · Answer 8 · edited May 23 '17 at 10:31


I have no problem with any of the existing answers, but I was surprised to not find this solution:

keys_to_remove = ['a', 'b', 'c']
my_dict = {k: v for k, v in zip("a b c d e f g".split(' '), [0, 1, 2, 3, 4, 5, 6])}

for k in keys_to_remove:
    try:
        del my_dict[k]
    except KeyError:
        pass

assert my_dict == {'d': 3, 'e': 4, 'f': 5, 'g': 6}

Note: I stumbled across this question coming from here. And my answer is related to this answer.

edited May 23 '17 at 10:31

Community


answered May 20 '15 at 13:37

Deacon


1

Checking for membership is easier (and faster) than running try: except: – Charlie Clark Nov 25 '22 at 12:50
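If the try/except form feels too verbose, `contextlib.suppress` expresses the same intent in one line. This is a sketch of an alternative spelling, not a claim that it is faster.

```python
from contextlib import suppress

keys_to_remove = ['a', 'b', 'c', 'x']
my_dict = dict(zip("abcdefg", range(7)))

for k in keys_to_remove:
    # suppress(KeyError) behaves like try/except KeyError: pass
    with suppress(KeyError):
        del my_dict[k]

print(my_dict)  # {'d': 3, 'e': 4, 'f': 5, 'g': 6}
```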

kolypto · Answer 9 · 2021-01-09T21:57:57.147

7

I have tested the performance of three methods:

# Method 1: `del`
for key in remove_keys:
    if key in d:
        del d[key]

# Method 2: `pop()`
for key in remove_keys:
    d.pop(key, None)

# Method 3: comprehension
{key: v for key, v in d.items() if key not in remove_keys}

Here are the results of 1M iterations:

del: 2.03s 2.0 µs/iter (100%)
pop(): 2.38s 2.4 µs/iter (117%)
comprehension: 4.11s 4.1 µs/iter (202%)

So del and pop() are the fastest, and the comprehension is about 2x slower. But anyway, we are talking about microseconds here :) Dicts in Python are ridiculously fast.

edited Jan 09 '21 at 21:57

answered Sep 17 '20 at 14:30

kolypto


1

You forgot an `t` in the `no` word – igorkf Oct 07 '20 at 12:08

L3viathan · Answer 10 · 2013-09-30T14:05:42.807

3

Why not:

entriestoremove = (2,5,1)
for e in entriestoremove:
    if e in d:
        del d[e]

I don't know what you mean by "smarter way". Surely there are other ways, maybe with dictionary comprehensions:

entriestoremove = (2,5,1)
newdict = {k: v for k, v in d.items() if k not in entriestoremove}

edited Sep 30 '13 at 14:05

answered Jan 24 '12 at 23:20

L3viathan


score 1 · Answer 11 · answered Oct 27 '16 at 05:14

inline

import functools

#: key 'c' is not in d
d = {"a": "avalue", "b": "bvalue", "d": "dvalue"}

entitiesToREmove = ('a', 'b', 'c')

#: python2
map(lambda x: functools.partial(d.pop, x, None)(), entitiesToREmove)

#: python3

list(map(lambda x: functools.partial(d.pop, x, None)(), entitiesToREmove))

print(d)
# output: {'d': 'dvalue'}

score 1 · Answer 12 · answered May 28 '20 at 17:17

It would be nice to have full support for set methods for dictionaries (and not the unholy mess we're getting with Python 3.9) so that you could simply "remove" a set of keys. However, as long as that's not the case, and you have a large dictionary with potentially a large number of keys to remove, you might want to know about the performance. So, I've created some code that builds something large enough for meaningful comparisons: a 100,000 x 1000 matrix, so roughly 100,000,000 items in total.

from itertools import product
from time import perf_counter

# make a complete worksheet 100000 * 1000
start = perf_counter()
prod = product(range(1, 100000), range(1, 1000))
cells = {(x,y):x for x,y in prod}
print(len(cells))

print(f"Create time {perf_counter()-start:.2f}s")
clock = perf_counter()
# remove everything above row 50,000

keys = product(range(50000, 100000), range(1, 100))

# for x,y in keys:
#     del cells[x, y]

for n in map(cells.pop, keys):
    pass

print(len(cells))
stop = perf_counter()
print(f"Removal time {stop-clock:.2f}s")

10 million items or more is not unusual in some settings. Comparing the two methods on my local machine, I see a slight improvement when using map and pop, presumably because of fewer function calls, but both take around 2.5s. This pales in comparison to the time required to create the dictionary in the first place (55s), or to include checks within the loop. If checks are likely to be needed, it's best to create a set that is the intersection of the dictionary keys and your filter:

keys = cells.keys() & keys

In summary: del is already heavily optimised, so don't worry about using it.

score 0 · Answer 13 · answered Mar 11 '19 at 15:38

I think using the fact that the keys can be treated as a set is the nicest way if you're on python 3:

def remove_keys(d, keys):
    to_remove = set(keys)
    filtered_keys = d.keys() - to_remove
    filtered_values = map(d.get, filtered_keys)
    return dict(zip(filtered_keys, filtered_values))

Example:

>>> remove_keys({'k1': 1, 'k3': 3}, ['k1', 'k2'])
{'k3': 3}

Sergey Novozhilov · Answer 14 · 2021-02-04T15:37:58.683

Another map() way to remove a list of keys from a dictionary while avoiding a KeyError exception:

dic = {
    'key1': 1,
    'key2': 2,
    'key3': 3,
    'key4': 4,
    'key5': 5,
}

keys_to_remove = ['key_not_exist', 'key1', 'key2', 'key3']
k = list(map(dic.pop, keys_to_remove, keys_to_remove))

print('k=', k)
print('dic after = ', dic)

**this will produce output**

k= ['key_not_exist', 1, 2, 3]
dic after =  {'key4': 4, 'key5': 5}

Passing keys_to_remove a second time is artificial: it only serves to supply a default value for each dict.pop() call. Any sequence with the same length as keys_to_remove can be used instead.

For example

import numpy as np

dic = {
    'key1': 1,
    'key2': 2,
    'key3': 3,
    'key4': 4,
    'key5': 5,
}

keys_to_remove = ['key_not_exist', 'key1', 'key2', 'key3']    
k = list(map(dic.pop, keys_to_remove, np.zeros(len(keys_to_remove))))

print('k=', k)
print('dic after = ', dic)

** will produce output **

k= [0.0, 1, 2, 3]
dic after =  {'key4': 4, 'key5': 5}
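Because `map()` stops at the shortest of its iterables, the defaults do not need a matching length at all. A sketch using `itertools.repeat` from the standard library, which avoids both the duplicated list and the NumPy dependency:

```python
from itertools import repeat

dic = {'key1': 1, 'key2': 2, 'key3': 3, 'key4': 4, 'key5': 5}
keys_to_remove = ['key_not_exist', 'key1', 'key2', 'key3']

# map() stops at the shortest iterable, so the endless repeat(None)
# simply supplies None as the default for every dic.pop(key, None) call.
k = list(map(dic.pop, keys_to_remove, repeat(None)))

print(k)    # [None, 1, 2, 3]
print(dic)  # {'key4': 4, 'key5': 5}
```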

score 0 · Answer 15 · answered Jun 30 '21 at 09:18

def delete_keys_from_dict(dictionary, keys):
    """
    Deletes the unwanted keys in the dictionary
    :param dictionary: dict
    :param keys: list of keys
    :return: dict (modified)
    """
    from collections.abc import MutableMapping

    keys_set = set(keys)
    modified_dict = {}
    for key, value in dictionary.items():
        if key not in keys_set:
            if isinstance(value, list):
                modified_dict[key] = list()
                for x in value:
                    if isinstance(x, MutableMapping):
                        modified_dict[key].append(delete_keys_from_dict(x, keys_set))
                    else:
                        modified_dict[key].append(x)
            elif isinstance(value, MutableMapping):
                modified_dict[key] = delete_keys_from_dict(value, keys_set)
            else:
                modified_dict[key] = value
    return modified_dict


_d = {'a': 1245, 'b': 1234325, 'c': {'a': 1245, 'b': 1234325}, 'd': 98765,
      'e': [{'a': 1245, 'b': 1234325},
            {'a': 1245, 'b': 1234325},
            {'t': 767}]}

_output = delete_keys_from_dict(_d, ['a', 'b'])
_expected = {'c': {}, 'd': 98765, 'e': [{}, {}, {'t': 767}]}
print(_expected)
print(_output)

score -4 · Answer 16 · answered May 16 '18 at 22:55


I'm late to this discussion, but for anyone else who finds it: one solution is to create a list of keys, like so.

k = ['a','b','c','d']

Then use pop() in a list comprehension, or for loop, to iterate over the keys and pop one at a time as such.

new_dictionary = [dictionary.pop(x, 'n/a') for x in k]

The 'n/a' is the default value returned when a key does not exist. Note that new_dictionary is a list of the popped values; the keys are removed from dictionary itself.

answered May 16 '18 at 22:55

Terrance DeJesus


9

`new_dictionary` looks an awful lot like a list ;) – DylanYoung Dec 05 '18 at 15:18
