
I have a dictionary:

d = {'<word>':1,'-word':12, 'word':1, '$(*#%&^#&@#':2, '!@**$12word*&':4, '::':10, '1230324':1, '+635':5}

I want to remove only the entries whose keys consist entirely of non-alphanumeric characters, i.e. , . ? ! : ; and so on.

I've tried the following

regex = re.compile('[\!\?\.\,\:\;\*\(\)\-\+\<\>]')
regex = re.compile('a-zA-Z0-9_')
regex = re.compile('\\W')
regex = re.compile('[\W_]+')  # from [1]

but none of them gives my desired result, which is:

new_dict = {'<word>':1,'-word':12, 'word':1, '!@**$12word*&':4, '1230324':1, '+635':5}

in which the entries '$(*#%&^#&@#' and '::' are removed.

Also, I use this code to remove the entries, in case it helps:

new_dict = {k:d[k] for k in d if re.match(regex, k)}

[1] Stripping everything but alphanumeric chars from a string in Python

CH123
  • Possible duplicate of [Stripping everything but alphanumeric chars from a string in Python](https://stackoverflow.com/questions/1276764/stripping-everything-but-alphanumeric-chars-from-a-string-in-python) – Gahan Sep 08 '17 at 04:18

1 Answer


You want the whole key to consist of \W (non-word) characters, so anchor the pattern at both ends: ^\W+$.
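To see why the anchors matter, here is a small sketch that compares an unanchored \W search with the anchored pattern on keys from the question. The helper name is_all_nonword is made up for this illustration.

```python
import re

def is_all_nonword(key):
    # True only when every character of the key is a non-word character.
    return re.search(r"^\W+$", key) is not None

# Unanchored: succeeds as soon as one non-word character appears anywhere.
print(bool(re.search(r"\W", "<word>")))  # True, because of '<'

# Anchored: the whole key has to be non-word characters.
print(is_all_nonword("<word>"))  # False
print(is_all_nonword("::"))      # True
```

Only the anchored check tells '::' apart from '<word>', which is what the filter needs.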

Something like this will do:

$ cat test.py
import re

# Matches only strings made up entirely of non-word characters.
pattern = r"^\W+$"

d = {'<word>':1,'-word':12, 'word':1, '$(*#%&^#&@#':2, '!@**$12word*&':4, '::':10, '1230324':1, '+635':5}

# Iterate over a copy of the keys so that entries can be deleted safely.
for k in list(d):
    if re.search(pattern, k):
        print('to remove: ' + k)
        del d[k]

for k in d:
    print(k)

EDIT: The question changed: the OP wants to build the new dict in one go. That can be done like this:

new_dict = {k:d[k] for k in d.keys() if not(re.search(pattern,k))}
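
A variant sketch, not part of the original answer: on Python 3.4+, re.fullmatch does the anchoring for you. Keep in mind that \W does not match the underscore, so a key such as '__' would be kept by ^\W+$. If underscores should count as symbols too, the [\W_]+ class from the question can be used instead. The name only_symbols is just for illustration.

```python
import re

d = {'<word>': 1, '-word': 12, 'word': 1, '$(*#%&^#&@#': 2,
     '!@**$12word*&': 4, '::': 10, '1230324': 1, '+635': 5}

# [\W_]+ : one or more characters that are neither letters nor digits
# (the underscore is listed explicitly because \W does not cover it).
only_symbols = re.compile(r"[\W_]+")

# Keep every entry whose key is NOT made up entirely of such characters.
new_dict = {k: v for k, v in d.items() if not only_symbols.fullmatch(k)}
print(new_dict)
```

Because this builds a new dict instead of deleting from d, the original dictionary is left untouched and nothing is modified while it is being iterated.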
Marc Lambrichs
  • Thank you so much! This worked. I just needed to make a copy of the dictionary first and use that for the loop because this code alone produces an error, saying that 'dictionary changed size during iteration'. – CH123 Sep 08 '17 at 04:51