
I have a dict, for example:

data={k:k for k in range(20)}

I perform some operation on the values of data, and some of them end up as 0, for example:

for k,v in data.items():
    data[k] %= 2

While doing this I want to remove every key whose value becomes 0, but deleting on the fly raises an error, so I have to do it at the end. For that I use:

def clean(data):
    while True:
        try:
            for k,v in data.items():
                if not v:
                    del data[k]
            return
        except RuntimeError:
            pass

So my question is: is there a better way of doing this, so that the removal happens in place, without using extra memory, and ideally in a single pass?

EDIT

This is similar to my intended use:

class MapDict(dict):

    def __repr__(self):
        return '{}({})'.format(self.__class__.__qualname__, super().__repr__())

    def map(self,func,*argv):
        '''apply func to every value in this MapDict'''
        for k,v in self.items():
            self[k] = func(v,*argv)
        self.clean()

    def clean(self):
        while True:
            try:
                for k,v in self.items():
                    if not v:
                        del self[k]
                return
            except RuntimeError:
                pass


>>> data=MapDict( (k,k) for k in range(20) )
>>> data
MapDict({0: 0, 1: 1, 2: 2, 3: 3, 4: 4, 5: 5, 6: 6, 7: 7, 8: 8, 9: 9, 10: 10, 11: 11, 12: 12, 13: 13, 14: 14, 15: 15, 16: 16, 17: 17, 18: 18, 19: 19})
>>> from operator import add, mod
>>> data.map(mod,2)
>>> data
MapDict({1: 1, 3: 1, 5: 1, 7: 1, 9: 1, 11: 1, 13: 1, 15: 1, 17: 1, 19: 1})
>>> data.map(add,10)
>>> data
MapDict({1: 11, 3: 11, 5: 11, 7: 11, 9: 11, 11: 11, 13: 11, 15: 11, 17: 11, 19: 11})
>>> 

That is why I cannot create a new dict: I want my instance to keep only the relevant values, which I later need for something else.

So is there a better way to do this clean, while keeping it memory efficient and using the fewest passes?

Copperfield
  • See http://stackoverflow.com/questions/9023078/custom-dict-that-allows-delete-during-iteration – Stuart Apr 01 '16 at 23:35
  • You probably do not want to write subclasses like that; I would look into one of the functional programming libraries (toolz, funcy, etc.), which provide nice streaming functions for most of this functionality. – tacaswell Apr 02 '16 at 00:44
  • @tcaswell maybe map is a bad name for that method, but that is exactly what I want: to do the operation in place. `toolz` looks great, though, for the operations I do that are not in place. – Copperfield Apr 02 '16 at 12:06

3 Answers


It's not allowed to delete items from a dictionary while iterating over it, but you can iterate over a copy of keys (or items) instead:

for k in list(data):
    v = data[k]
    if not v:
        del data[k]
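
A minimal sketch contrasting the two behaviours, assuming CPython 3 and sample data like the question's; the variable `error` is only there for illustration.

```python
data = {k: k % 2 for k in range(20)}

# Deleting through the live view: the first deletion succeeds, and the
# iterator raises on its next step because the dict changed size.
error = None
try:
    for k, v in data.items():
        if not v:
            del data[k]
except RuntimeError as exc:
    error = exc

# Deleting through a snapshot: list(data) holds references to the keys,
# so the dict can shrink freely while the list is walked.
for k in list(data):
    if not data[k]:
        del data[k]
```

The snapshot costs one pointer per key; the key objects themselves are not duplicated.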
Eugene Yarmash
  • But making a copy of the keys is just what I want to avoid; otherwise I would not ask this question. – Copperfield Apr 01 '16 at 23:12
  • @Copperfield. You can't safely modify the sequence you are iterating over. You *must* iterate over a copy of the keys. – chepner Apr 01 '16 at 23:14

The closest you could get to deleting the items on the fly with minimal memory usage would be to make the list of keys to delete during your first loop, and then delete them all afterwards. Then you're only copying those keys that will be deleted.

keys_to_del = []
for k, v in data.items():
    data[k] %= 2
    if data[k] == 0:
        keys_to_del.append(k)
for k in keys_to_del:
    del data[k]
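
Applied to the `MapDict` from the question, the same idea might look like the sketch below. It assumes that reassigning the value of an existing key during iteration is safe, which holds because the dict's size does not change; the name `dead` is illustrative.

```python
from operator import add, mod


class MapDict(dict):
    def map(self, func, *args):
        """Apply func to every value in place, then drop falsy results."""
        dead = []  # only the keys that will be removed are collected
        for k, v in self.items():
            new = func(v, *args)
            self[k] = new  # overwriting an existing key does not resize the dict
            if not new:
                dead.append(k)
        for k in dead:  # second loop touches only the doomed keys
            del self[k]


data = MapDict((k, k) for k in range(20))
data.map(mod, 2)        # even keys become 0 and are removed
after_mod = dict(data)
data.map(add, 10)       # nothing becomes falsy, so nothing is removed
```

The extra memory is proportional to the number of removed keys rather than to the size of the dict.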
Stuart
  • You could shorten it a bit by doing: for k in tuple(ik for ik, iv in data.items() if iv%2 == 0): ... Use iteritems for python 2.7 – SleepProgger Apr 01 '16 at 23:45
  • @SleepProgger No, that wouldn't change the values. And why use a tuple? – Stuart Apr 02 '16 at 01:05
  • With the ... replaced with "del data[k]" my code would do the same as yours. The creation of tuples should be a little bit faster. – SleepProgger Apr 02 '16 at 01:11
  • @SleepProgger No, your code wouldn't change the values in the dictionary, and so iterates twice through the dictionary items for no reason. If the OP didn't need to change the values first, they wouldn't need 2 loops in the first place (though they would still need to copy the keys before deleting anything). – Stuart Apr 02 '16 at 01:21
  • Oh, sorry, I see what you meant now. Never mind my comments. – SleepProgger Apr 02 '16 at 01:28

Is there a hard requirement to do it in place? If not:

def clean(data):
    return {k: v for k, v in data.items() if v}

If so:

def clean(data):
    remove_keys = tuple(k for k, v in data.items() if not v)
    for k in remove_keys:
        del data[k]
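
A small sketch of what each variant actually allocates, assuming list values so that object identity is easy to check. The names `clean_copy` and `clean_in_place` are hypothetical, used only to keep both versions side by side.

```python
def clean_copy(data):
    # Builds a new dict; values are shared with the original, not duplicated.
    return {k: v for k, v in data.items() if v}


def clean_in_place(data):
    # Collects only the keys with falsy values, then deletes them.
    remove_keys = tuple(k for k, v in data.items() if not v)
    for k in remove_keys:
        del data[k]


data = {"a": [1, 2], "b": [], "c": [3]}

cleaned = clean_copy(data)
same_object = cleaned["a"] is data["a"]  # the list itself was not copied

before = id(data)
clean_in_place(data)
same_dict = id(data) == before           # the original dict object survives
```

The first variant allocates a second hash table; the second allocates only a tuple of the keys being dropped.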
tacaswell
  • Yes, I want to be memory efficient, so making a copy of the items is a no-go. – Copperfield Apr 01 '16 at 23:08
  • The dict-comprehension does not make a copy of the elements, just makes another reference to the underlying objects. – tacaswell Apr 01 '16 at 23:13
  • @Copperfield Except that I would use a tuple instead of a list, this should be the best you can get. The second implementation only copies a key (that is, a reference to it) if the condition (`not v` in this case) is true. – SleepProgger Apr 01 '16 at 23:27