4

I want to remove values from a dictionary if they contain a particular string, and consequently remove any keys that are left with an empty list as their value.

mydict = {
    'Getting links from: https://www.foo.com/': 
    [
        '+---OK--- http://www.this.com/',
        '+---OK--- http://www.is.com/',
        '+-BROKEN- http://www.broken.com/',
        '+---OK--- http://www.set.com/',
        '+---OK--- http://www.one.com/'
    ],
    'Getting links from: https://www.bar.com/': 
    [
        '+---OK--- http://www.this.com/',
        '+---OK--- http://www.is.com/',
        '+-BROKEN- http://www.broken.com/'
    ],
    'Getting links from: https://www.boo.com/':
    [
        '+---OK--- http://www.this.com/',
        '+---OK--- http://www.is.com/'
    ]
}

val = "is"

for k, v in mydict.iteritems():
   if v.contains(val):
     del mydict[v]

The result I want is:

{
    'Getting links from: https://www.foo.com/':
    [
        '+-BROKEN- http://www.broken.com/',
        '+---OK--- http://www.set.com/',
        '+---OK--- http://www.one.com/'
    ], 
    'Getting links from: https://www.bar.com/': 
    [
        '+-BROKEN- http://www.broken.com/'
    ]
}

How can I remove all dictionary values that contain a string, and then any keys that have no values as a result?

j-i-l
lane
  • You should not `del` an item from a dictionary inside a loop that iterates over that same dictionary. You could use a dict comprehension to solve your issue. – j-i-l Dec 31 '18 at 12:33
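To illustrate the comment above, here is a minimal sketch, assuming Python 3, of what goes wrong when you delete during iteration and how iterating over a snapshot of the keys avoids it. The helper `make_links` and its sample data are hypothetical and not part of the question.

```python
def make_links():
    # Hypothetical sample data, much smaller than the question's dictionary.
    return {
        'a': ['http://www.is.com/'],
        'b': ['http://www.set.com/'],
    }

# Deleting from the dictionary while iterating over it directly:
# CPython notices the size change on the next iteration step.
links = make_links()
raised = False
try:
    for key in links:
        if any('is' in item for item in links[key]):
            del links[key]
except RuntimeError:
    raised = True

# Iterating over a list copy of the keys makes the deletion safe.
links = make_links()
for key in list(links):
    if any('is' in item for item in links[key]):
        del links[key]

print(raised)
print(links)
```

Building a new dictionary, as the answers below do, sidesteps the problem entirely and is usually easier to read.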

5 Answers

6

You can use a list comprehension within a dictionary comprehension. You shouldn't change the number of items in a dictionary while you iterate over that dictionary, so build a new one instead.

res = {k: [x for x in v if 'is' not in x] for k, v in mydict.items()}

# {'Getting links from: https://www.foo.com/': ['+-BROKEN- http://www.broken.com/',
#                                               '+---OK--- http://www.set.com/',
#                                               '+---OK--- http://www.one.com/'],
#  'Getting links from: https://www.bar.com/': ['+-BROKEN- http://www.broken.com/'],
#  'Getting links from: https://www.boo.com/': []}

If you wish to remove items with empty list values, you can do so in a subsequent step:

res = {k: v for k, v in res.items() if v}
jpp
  • Awesome. Thanks! One last question: if I wanted to remove multiple string values at once, could I add some sort of an 'or' argument? Such that, 'if 'is' or 'foo' or 'bar' not in x' – lane Dec 31 '18 at 13:02
  • `[x for x in v if not any(item in x for item in ['foo', 'bar'])]` would work – jpp Dec 31 '18 at 13:16
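The exchange above can be sketched as follows. Note that `'is' or 'foo' or 'bar' not in x` does not test all three strings: Python evaluates it as `'is' or 'foo' or ('bar' not in x)`, which is always the truthy string `'is'`, so nothing would be filtered. The names `unwanted`, `sample`, and `filtered` are illustrative assumptions, and the data is a reduced copy of the question's dictionary.

```python
# Substrings to filter out (illustrative choice).
unwanted = ['is', 'set']

sample = {
    'Getting links from: https://www.foo.com/': [
        '+---OK--- http://www.this.com/',
        '+---OK--- http://www.is.com/',
        '+-BROKEN- http://www.broken.com/',
        '+---OK--- http://www.set.com/',
        '+---OK--- http://www.one.com/',
    ],
    'Getting links from: https://www.boo.com/': [
        '+---OK--- http://www.this.com/',
        '+---OK--- http://www.is.com/',
    ],
}

filtered = {}
for k, v in sample.items():
    # Keep an entry only if it contains none of the unwanted substrings.
    kept = [x for x in v if not any(s in x for s in unwanted)]
    if kept:  # drop keys whose list ended up empty
        filtered[k] = kept

print(filtered)
```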
4

With a simple loop:

val = "is"

new_dict = dict()
for k, v in mydict.items():
    values = [i for i in v if val not in i]
    if values:  # keep the key only if some values survived
        new_dict[k] = values

print(new_dict)

The output:

{'Getting links from: https://www.foo.com/': ['+-BROKEN- http://www.broken.com/', '+---OK--- http://www.set.com/', '+---OK--- http://www.one.com/'], 'Getting links from: https://www.bar.com/': ['+-BROKEN- http://www.broken.com/']}
RomanPerekhrest
3

This is a one-liner:

{k: [e for e in v if val not in e] for k, v in mydict.items() if any(val not in e for e in v)}

The expected output:

Out[1]: {
    'Getting links from: https://www.bar.com/': 
    [
        '+-BROKEN- http://www.broken.com/'
    ],
    'Getting links from: https://www.foo.com/': 
    [
        '+-BROKEN- http://www.broken.com/',
        '+---OK--- http://www.set.com/',
        '+---OK--- http://www.one.com/'
    ]
}
j-i-l
  • The one-liner works, of course, but hides the fact that you are checking for `'is'` in your values twice. That double string check isn't required, so it's better to use 2 lines. – jpp Dec 31 '18 at 12:35
  • @jpp In a 2-liner, on the other hand, you create an additional intermediate dict object, which is also not strictly required. – j-i-l Dec 31 '18 at 12:38
  • Except the expensive part isn't checking if a list is non-empty, it's checking if any strings within a list contain a substring. I suggest timing this for a larger input to satisfy yourself this is the case. – jpp Dec 31 '18 at 12:40
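One way to follow the suggestion to time both variants is the rough sketch below. The synthetic `data`, its size, and the function names `one_liner` and `two_step` are assumptions made for illustration; actual timings depend on your data and machine, so none are quoted here.

```python
import random
import timeit

random.seed(0)  # reproducible synthetic data
val = 'is'
hosts = ['this', 'is', 'broken', 'set', 'one']

# Hypothetical larger input: 1000 keys with 20 URLs each.
data = {
    'page %d' % n: ['http://www.%s.com/' % random.choice(hosts) for _ in range(20)]
    for n in range(1000)
}

def one_liner(d):
    # Checks each string for val up to twice: once in any(), once in the filter.
    return {k: [e for e in v if val not in e] for k, v in d.items()
            if any(val not in e for e in v)}

def two_step(d):
    # Checks each string once, then drops empty lists via an intermediate dict.
    res = {k: [x for x in v if val not in x] for k, v in d.items()}
    return {k: v for k, v in res.items() if v}

# Both variants should produce the same result.
assert one_liner(data) == two_step(data)

for fn in (one_liner, two_step):
    seconds = timeit.timeit(lambda: fn(data), number=20)
    print(fn.__name__, round(seconds, 4))
```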
0

There are a couple of ways you could do it. One using regex and one without.

If you're not familiar with regex, you could try this:

for key in list(mydict):  # iterate over a copy of the keys so entries can be removed safely
    mydict[key] = [item for item in mydict[key] if val not in item]
    if not mydict[key]:
        mydict.pop(key)

The output would be:

{'Getting links from: https://www.foo.com/': ['+-BROKEN- http://www.broken.com/',
  '+---OK--- http://www.set.com/',
  '+---OK--- http://www.one.com/'],
 'Getting links from: https://www.bar.com/': ['+-BROKEN- http://www.broken.com/']}
Aaron_ab
Brandon Bailey
0

Using a dict comprehension with a regex, you can try the following:

import re

val = 'is'

# step 1 - drop every entry that contains val
mydict = {k: [x for x in v if not re.search(re.escape(val), x)] for k, v in mydict.items()}

# step 2 - filter out keys whose list is now empty
mydict = {k: v for k, v in mydict.items() if len(v) > 0}

print(mydict)

{'Getting links from: https://www.foo.com/': ['+-BROKEN- http://www.broken.com/',
  '+---OK--- http://www.set.com/',
  '+---OK--- http://www.one.com/'],
 'Getting links from: https://www.bar.com/': ['+-BROKEN- http://www.broken.com/']}
YOLO