my code:

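readfile = open("{}".format(file), "r")

# Read everything as one lowercase string so str.replace can be used.
text = readfile.read().lower()

elements = """,.:;|!@#$%^&*"\()`_+=[]{}<>?/~"""
for char in elements:
    text = text.replace(char, '')

# Split into a list of words only after the characters are removed.
lines = text.split()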

This works and removes the special characters, but I need help with stripping "-" and "'".

For example, "safety-dance" would be okay but not "-hi-", and "i'll" is okay but not "'hi".

I need to strip them only at the beginning and end of each word.

It's not a string, it is a list.

How do I do this?

guide
  • Possible duplicate: [Removing punctuation except intra-word dashes Python](https://stackoverflow.com/questions/35613990/removing-punctuation-except-intra-word-dashes-python) – jpp May 28 '18 at 21:54
  • Possible duplicate of [Stripping everything but alphanumeric chars from a string in Python](https://stackoverflow.com/questions/1276764/stripping-everything-but-alphanumeric-chars-from-a-string-in-python) – Ashlou May 28 '18 at 21:57
  • It's not a string, it's a list. – guide May 28 '18 at 21:58

2 Answers

Maybe you can try string.punctuation together with str.strip:

import string

my_string_list = ["-hello-", "safety-dance", "'hi", "I'll", "-hello"]

result = [item.strip(string.punctuation) for item in my_string_list]
print(result)

Result:

['hello', 'safety-dance', 'hi', "I'll", 'hello']
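Note that string.punctuation strips every punctuation character from the ends of each word, which is broader than what the question asks for. If only hyphens and apostrophes should be stripped at the edges, a minimal sketch is to pass just those two characters to strip. The `words` list here is made-up sample data, not the asker's file contents.

```python
# Hypothetical sample data standing in for the asker's word list.
words = ["-hi-", "safety-dance", "'hi", "i'll", "--it's--"]

# str.strip(chars) removes any of the given characters from both ends only;
# hyphens and apostrophes inside a word are left untouched.
cleaned = [word.strip("-'") for word in words]
print(cleaned)
```

Because strip treats its argument as a set of characters, repeated or mixed leading and trailing hyphens and apostrophes are all removed.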
niraj

First, using str.replace in a loop is inefficient. Since strings are immutable, you would be creating a new string on each iteration. You can use str.translate to remove the unwanted characters in a single pass.
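As a rough illustration of the difference, the sketch below runs both approaches on the character set from the question and checks that they agree. The sample `text` is invented for the example.

```python
# Characters to delete, taken from the question's `elements` string.
elements = ',.:;|!@#$%^&*"\\()`_+=[]{}<>?/~'

text = "hello, world! (safety-dance) <i'll>"  # hypothetical sample input

# Loop version: every replace call builds a new intermediate string.
looped = text
for char in elements:
    looped = looped.replace(char, '')

# translate version: build one deletion table, then make one pass over the text.
table = str.maketrans('', '', elements)
translated = text.translate(table)

print(translated)
```

Both versions produce the same string; the translate version simply avoids creating one intermediate string per character.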

As for removing a dash only when it sits at the start or end of a word, this is exactly what str.strip does.

It also seems the characters you want to remove correspond to string.punctuation, with a special case for '-'.

from string import punctuation

def remove_special_character(s):
    # Delete every punctuation character except '-', which is only stripped at word edges.
    translation = str.maketrans('', '', punctuation.replace('-', ''))
    return ' '.join([w.strip('-') for w in s.split()]).translate(translation)

polluted_string = '-This $string contain%s ill-desired characters!'
clean_string = remove_special_character(polluted_string)

print(clean_string)

# prints: 'This string contains ill-desired characters'
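One caveat: because the apostrophe is part of string.punctuation, remove_special_character deletes it everywhere, so "i'll" would become "ill". A possible variant that treats "'" like "-" (kept inside words, stripped at the edges) and returns a list of words might look like the sketch below. The names `KEEP_INSIDE`, `DELETE_TABLE` and `clean_words` are hypothetical, introduced only for this example.

```python
from string import punctuation

# Characters allowed inside a word but not at its edges (assumption based on the question).
KEEP_INSIDE = "-'"

# Deletion table for all other punctuation characters.
DELETE_TABLE = str.maketrans('', '', ''.join(c for c in punctuation if c not in KEEP_INSIDE))

def clean_words(s):
    # Delete the other punctuation first, so that edge characters hidden
    # behind it (as in "'hi!" or "(-hi-)") end up at the word boundary.
    words = s.translate(DELETE_TABLE).split()
    # Strip hyphens and apostrophes from both ends of each word.
    stripped = (w.strip(KEEP_INSIDE) for w in words)
    # Drop words that consisted only of stripped characters, such as "--".
    return [w for w in stripped if w]

print(clean_words("-hi- safety-dance i'll 'hi! -- $done"))
```

Deleting before stripping is a deliberate ordering choice here: stripping first would leave the hyphens in "(-hi-)" in place, since the parentheses are still at the edges at that point.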

If you want to apply this to multiple lines, you can do it with a list comprehension.

lines = [remove_special_character(line) for line in lines]

Finally, to read a file you should be using a with statement.

with open(file, "r") as f:
    lines = [remove_special_character(line) for line in f]
Olivier Melançon