1

I have a piece of python code, which need to iterator over all keys in a dictionary where the body of the loop might change the dictionary.

Just trying to iterator over the dictionary give me RuntimeError: dictionary changed size during iteration. Searching for the error message lead me to another question, where the suggested solution was to copy the list of keys before starting the iteration.

In my scenario that is however only a partial answer, because I need to iterate over not only the original keys but also over keys that are added while I am iterating. I need code that only terminates once no more keys are being added to the dictionary, and it has processed any keys already in the dictionary.

So far I have come up with this:

processed = set()
while True:
    keys = set(dictionary) - processed
    if not keys:
        break
    for k in keys:
        processed.add(k)
        # Do something with k

This approach does seem overly complicated. Is there any way I could simplify it, and still have it process all of the keys being added to the dictionary without processing each of them more than once?

In some other languages the first four lines of the while loop could have been written simply as:

while keys = set(dictionary) - processed:

However that doesn't work in python. I found a question about using an assignment as condition in a while loop in python. But none of the proposed answers seem applicable to my scenario.

Community
  • 1
  • 1
kasperd
  • 1,952
  • 1
  • 20
  • 31

3 Answers3

2

Consider the opposite approach. Track the keys that still need to be processed, rather than the ones that already have been.

remaining = set(dictionary.items())

while remaining:
    key, value = remaining.pop()

    # process the item

    if item_to_add:
        dictionary[new_key] = new_value
        remaining.add((new_key, new_value))

This avoids an expensive set difference operation each iteration.

If you really don't know what keys have been added during processing, then the code in your question is good as is. A slightly different way to write it would be:

keys = set(dictionary)
processed = set()

while keys:
    for k in keys:
        # Do something with k

    processed.update(keys)
    keys = set(dictionary) - processed

Is that better? Worse? For you to decide, I suppose.

John Kugelman
  • 349,597
  • 67
  • 533
  • 578
  • The body of my loop doesn't explicitly add keys to the dictionary. Rather it calls another function, which may or may not add keys to the dictionary. So the only way the body of my loop could know which keys have been added would be by looking at the current set of keys. – kasperd Nov 21 '14 at 03:58
  • Re-inspecting the dictionary repeatedly to see what's changed could be costly. Is there any way to refactor the code so you know what items are added? If not, then your original code looks good to me. – John Kugelman Nov 21 '14 at 04:00
  • It is not going to be practical for me to refactor the code in order to let the body of the loop know what is being added. In my case performance of the set difference is not going to be an issue, since the cost of processing a single key is much higher than that of computing the set difference. – kasperd Nov 21 '14 at 04:04
  • Thanks for providing a different approach to solving the problem. If there is no completely general solution, it is good to have multiple ideas that each may be applicable in different scenarios. – kasperd Nov 21 '14 at 04:09
2

Perhaps you can use an OrderedDict instead. No problem adding keys while iterating

>>> from collections import OrderedDict
>>> d = OrderedDict([(1, 2), (3, 4)])
>>> for k in d:
...     if k == 1:d[5] = 6
...     print(k)
... 
1
3
5

Are you updating keys or just adding them? If updating, you'll need to specify whether the key should be processed again or not

John La Rooy
  • 295,403
  • 53
  • 369
  • 502
  • This sounds like it may possibly be a cleaner solution. The documentation states that iteration will process keys in the order in which they were added. It does however not state if keys added while iterating are guaranteed to be processed before the iteration finishes. I don't change existing keys, I only add and remove keys. – kasperd Nov 21 '14 at 04:25
0

I rewrote my code as:

processed = set()
while set(dictionary) - processed:
    for k in set(dictionary) - processed:
        processed.add(k)
        # Do something with k

This does duplicate the set(dictionary) - processed expression, but still makes the code easier to read.

kasperd
  • 1,952
  • 1
  • 20
  • 31