I have a large dictionary in Python (https://docs.google.com/document/d/1aNuwIJGRMQA2iSwdT2iliG9Q4Zu2bELNMrQVacEcSdc/edit?usp=sharing) and I want to remove all of the entries that have certain characteristics. For example, I want to remove all of the entries in which the second item in the tuple is equal to True but the third is not equal to block. I was trying to do this with a regular expression, but I couldn't get it working.

Edit: My basic idea was to do something like this:

regex="Regular Expression"

for entry in d:
    if len(re.findall(regex,str(entry)))!=0:
        del d[entry]
print(d)
David Greydanus
  • Does iterating not work? – Justin Jasmann Feb 26 '14 at 02:22
  • Have you looked at `filter`? – merlin2011 Feb 26 '14 at 02:22
  • When you say "it doesn't work", what exactly do you mean? Is it throwing an error, or just not producing the desired results? It looks to me like one problem is that you are iterating over the dictionary while deleting from it. You can fix this by changing `for entry in d:` to `for entry in list(d.keys()):` (in Python 2, plain `d.keys()` already returns a list), as mentioned in this answer: http://stackoverflow.com/a/5385075/391161. – merlin2011 Feb 26 '14 at 02:24
  • Definitely look up pandas. You can label these columns and your query will look something like `g[(~g.happy) & (g.action != 'block')]`. – U2EF1 Feb 26 '14 at 02:34

4 Answers

I don't think regular expressions are the way to go here, if your dictionary is actually in Python.

Here's a sample of your data (I'm assuming fire, block and reload are names defined elsewhere, for example as strings):

g = {
(0, False, None, 0, False, None):(False,False,True),
(0, True, fire, 0, True, fire):(1/1,1/1,1/1),
(0, True, fire, 0, True, block):(1/1,1/1,1/1),
(0, True, fire, 0, True, reload):(1/1,1/1,1/1),
(0, True, fire, 0, False, fire):(1/1,1/1,1/1),
(0, True, fire, 0, False, block):(1/1,1/1,1/1),
(0, True, fire, 0, False, reload):(1/1,1/1,1/1),
(0, True, fire, 1, True, fire):(1/1,1/1,1/1),
(0, True, fire, 1, True, block):(1/1,1/1,1/1),
(0, True, block, 2, False, reload):(1/1,1/1,1/1),
(0, True, block, 3, True, fire):(1/1,1/1,1/1),
(0, True, block, 3, True, block):(1/1,1/1,1/1),
(0, True, block, 3, True, reload):(1/1,1/1,1/1),
(6, False, reload, 6, True, reload):(1/1,1/1,1/1),
(6, False, reload, 6, False, fire):(1/1,1/1,1/1),
(6, False, reload, 6, False, block):(1/1,1/1,1/1),
(6, False, reload, 6, False, reload):(1/1,1/1,1/1),
}

List comprehensions and generator expressions have largely replaced map and filter in Python, so I would use one here to select the keys whose second element is True and whose third element is not equal to block:

selected_keys = [i for i in g.keys() if i[1] == True and i[2] != block]

Then you can access the dict by each key you've filtered for.

For example:

for key in selected_keys:
    print(g[key])

would print the value associated with each key.
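
Since the question asks to remove those entries rather than print them, the same list of keys can drive the deletion. A minimal sketch, assuming the bare names fire, block and reload stand for plain strings (the original data does not say what they are) and using a hypothetical four-entry dictionary named sample:

```python
# Assumption: the actions are plain strings; the source data leaves them as bare names.
fire, block, reload = "fire", "block", "reload"

# Hypothetical miniature of the dictionary, with the same key layout as the sample above.
sample = {
    (0, False, None, 0, False, None): (False, False, True),
    (0, True, fire, 0, True, fire): (1, 1, 1),
    (0, True, block, 3, True, fire): (1, 1, 1),
    (6, False, reload, 6, False, block): (1, 1, 1),
}

# Collect the matching keys first, so the dict is not modified while it is iterated.
selected_keys = [k for k in sample if k[1] is True and k[2] != block]

# Now it is safe to delete, because we loop over the list, not the dict.
for key in selected_keys:
    del sample[key]

print(len(sample))
```

Building the list before deleting is what keeps this safe: the loop runs over a separate list, so the dictionary can shrink freely.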

Russia Must Remove Putin
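
def remove(key, value):
    # True for entries to drop: second item is True, third is not block
    return key[1] == True and key[2] != block

# items() works on Python 2 and 3; iteritems() exists only on Python 2
g = {key: value for key, value in g.items() if not remove(key, value)}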
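
One thing to keep in mind with this approach: the comprehension builds a new dictionary and rebinds the name, so any other reference to the old dictionary still sees the removed entries. A small sketch of that behaviour, with fire and block assumed to be plain strings and other_ref a hypothetical second reference:

```python
# Assumption: the actions are plain strings.
fire, block = "fire", "block"

def remove(key, value):
    # True for entries to drop: second item is True, third is not block
    return key[1] is True and key[2] != block

g = {
    (0, True, fire, 0, True, fire): (1, 1, 1),
    (0, True, block, 3, True, fire): (1, 1, 1),
}
other_ref = g  # hypothetical second name pointing at the same dict

# Rebinds g to a brand-new, filtered dict; the old dict is left untouched.
g = {key: value for key, value in g.items() if not remove(key, value)}

print(len(g), len(other_ref))
```

If other parts of the program hold a reference to the original dictionary, deleting keys in place may be the better fit.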
Hugh Bothwell

A very simple way of doing that would be:

for entry in list(d.keys()):
    if entry[1] == True and entry[2] != block:
        del d[entry]

In this case the for loop needs to iterate over a copy of the keys (in Python 3, keys() returns a live view, hence the list() call), so that the change in the dictionary's size does not affect the loop.
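
To see why looping over a snapshot of the keys matters, the following sketch (Python 3, with fire and block assumed to be plain strings and a hypothetical two-entry dictionary) first tries to delete while iterating over the dictionary directly, then repeats the loop over list(d):

```python
# Assumption: the actions are plain strings.
fire, block = "fire", "block"

def make_dict():
    # Hypothetical two-entry dictionary; the first key matches the removal rule.
    return {
        (0, True, fire, 0, True, fire): (1, 1, 1),
        (0, True, block, 3, True, fire): (1, 1, 1),
    }

# Attempt 1: delete while iterating over the dict itself.
d = make_dict()
try:
    for entry in d:
        if entry[1] is True and entry[2] != block:
            del d[entry]
    failed = False
except RuntimeError:
    # Python 3 reports that the dictionary changed size during iteration.
    failed = True

# Attempt 2: iterate over a list copy of the keys, then delete freely.
d = make_dict()
for entry in list(d):
    if entry[1] is True and entry[2] != block:
        del d[entry]

print(failed, len(d))
```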

Hope it helps.

mrcl

Some basic text processing will get you a nice csv like this:

a,b,action1,c,d,action2,e,f,g,h,i,j
0,True,fire,0,True,fire,1,1,1,1,1,1
0,True,fire,0,True,block,1,1,1,1,1,1
0,True,fire,0,True,reload,1,1,1,1,1,1
0,True,fire,0,True,fire,1,1,1,1,1,1
0,True,fire,0,True,block,1,1,1,1,1,1
0,True,fire,0,True,reload,1,1,1,1,1,1
0,True,fire,1,True,fire,1,1,1,1,1,1
0,True,fire,1,True,block,1,1,1,1,1,1
0,True,fire,1,True,reload,1,1,1,1,1,1
0,True,fire,1,True,fire,1,1,1,1,1,1

This can be read in via pandas.read_csv. Obviously my column names are garbage. Once we have a dataframe, we can subdivide it by column values very easily. Your query is just df[df.b & (df.action1 != 'block')].
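
As a rough sketch of how that could look end to end, the snippet below reads a hypothetical three-row CSV (same placeholder column names, trimmed to six columns) from an in-memory string, builds the boolean mask, and then negates it with ~ to keep only the rows that should survive the removal. It assumes pandas is installed.

```python
import io

import pandas as pd

# Hypothetical miniature CSV using the placeholder column names from above.
csv_text = """a,b,action1,c,d,action2
0,True,fire,0,True,fire
0,True,block,3,True,fire
6,False,reload,6,False,block
"""

df = pd.read_csv(io.StringIO(csv_text))

# Rows where b is True and action1 is not 'block' (the rows to remove).
mask = df.b & (df.action1 != "block")

# ~ inverts the boolean mask, leaving the rows to keep.
kept = df[~mask]

print(len(df[mask]), len(kept))
```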

U2EF1