Starting from a text file containing characters like "!" or "," (basically, the entire set of string.punctuation), I want to remove them and obtain a text containing only the words. I found a solution here: https://gomputor.wordpress.com/2008/09/27/search-replace-multiple-words-or-characters-with-python/ and wrote the script this way:

import string

dict={}
for elem in string.punctuation:
    dict[elem]=""

def replace_all(text, dic):
    for i, j in dic.items():
        text = text.replace(i, j)
    return text

with open ("text.txt","r") as f:
    file = f.read()
    f = replace_all(file,dict)

print(f)

OK, this works, but if I try this other solution, I don't get the same result:

with open ("text.txt","r") as f:
    file = f.read()
    for elem in string.punctuation:
        if elem in file:
            f=file.replace(elem,"")

In this case, if I run print(f), I get exactly the same file with all the punctuation still in it. Why?

Giacomo Ciampoli
  • In the second version every time you execute `f=file.replace(elem,"")` you're taking your original file and only replacing the current value of `elem`. If you look at your output more carefully I suspect that the last element from `string.punctuation` that is in your file has been removed. – Alain Apr 08 '17 at 15:58
  • Use `str.translate()`, it is by far the most efficient method. See the duplicate. – Martijn Pieters Apr 08 '17 at 15:59
  • Thanks to all. I've seen the duplicate post, but the question here is not which method works better (there are many, of course); I wanted to understand what happens inside the for loop in the second script. – Giacomo Ciampoli Apr 08 '17 at 16:06

1 Answer

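I'd use filter to drop every punctuation character in one pass:

import string
testString = "Hello, world!"
# In Python 3, filter() returns an iterator, so join the kept characters back into a string
print("".join(filter(lambda a: a not in string.punctuation, testString)))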

Regex

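If you want to remove every character that is not a word character or a space, a regex is preferable (note that \w also matches the underscore, so "_" is kept):

import re
testString = "Hello, world!"
# Raw string so that \w reaches the regex engine unchanged
print(re.sub(r"[^\w ]", "", testString))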

Why your code doesn't work

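Two major issues:

  1. You're assigning the result to f instead of file. Strings are immutable, so file.replace returns a new string and leaves file untouched; every iteration therefore starts again from the original text, and f ends up holding only the last replacement.
  2. You're not printing the result, so I added the line print(file).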
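
To make the first issue visible without a file, here is a minimal sketch that runs both loop patterns on an in-memory string. The sample text and the names buggy and fixed are assumptions chosen for illustration; they are not part of the original script.

```python
import string

text = "Hello, world!"  # hypothetical sample standing in for the file contents

# Buggy pattern: replace() is always called on the untouched `text`,
# so `buggy` only reflects the last punctuation mark that was found.
buggy = text
for elem in string.punctuation:
    if elem in text:
        buggy = text.replace(elem, "")

# Fixed pattern: reassign the same variable so the removals accumulate.
fixed = text
for elem in string.punctuation:
    fixed = fixed.replace(elem, "")

print(buggy)  # Hello world!
print(fixed)  # Hello world
```

In string.punctuation, "!" comes before ",", so the buggy loop first removes the "!" and then overwrites that result with a copy of the original in which only the "," is removed.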

New Code:

import string

with open ("text.txt","r") as f:
    file = f.read()
    for elem in string.punctuation:
        if elem in file:
            file=file.replace(elem,"")
    print(file)
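
The comments on the question also suggest str.translate(). A minimal sketch of that approach, assuming Python 3 (where str.maketrans accepts a third argument listing characters to delete); the helper name strip_punctuation and the sample sentence are hypothetical:

```python
import string

def strip_punctuation(text):
    """Return `text` with every character in string.punctuation removed."""
    # The third argument of str.maketrans lists characters to delete.
    table = str.maketrans("", "", string.punctuation)
    return text.translate(table)

cleaned = strip_punctuation("Hello, world! It's 5 o'clock.")
print(cleaned)  # Hello world Its 5 oclock
```

This removes all punctuation in a single pass over the text instead of one replace call per punctuation character.
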
Neil