I know this has been asked here before, but that question is really old, so the bug might have been fixed since. Here is my question:

I am writing a program to remove all symbols from a string. Here is my code:

import re

text = input("Enter some text ")
symbols = ['!', '@', '#', '$', '%', '^', '&', '*', '(', ')', '_', '+', '-', '=', '{', '}', '|', '[', ']', ':', '\\', '"', ';', "'", '<', '>', '?', ',', '.', '/']
text = re.sub("|".join(symbols), "", text)
print(text)

When I run this, I get `re.error: nothing to repeat at position 14`. Does anybody know how to fix this?

Retluoc3002
  • You need to regex-escape those characters; some of them (`*` is the first one causing the error in this specific case) have special meaning in a regex. You can use [`re.escape`](https://docs.python.org/3/library/re.html#re.escape), for example, although it's a touch over-zealous. I'd recommend trying https://regex101.com/r/JOjEAb/1 first when you have regex issues; this *isn't* a bug in Python. – jonrsharpe Mar 31 '20 at 15:47
  • @jonrsharpe How do I regex-escape them? – Retluoc3002 Mar 31 '20 at 15:49
  • Have you read e.g. https://docs.python.org/3/library/re.html? – jonrsharpe Mar 31 '20 at 15:49
  • As @jonrsharpe says, have a read of the docs, and use an online regex tester to see what's actually happening. In a regex, `*` means "zero or more times" and applies to the preceding expression. Your pattern ends up containing `|*|`, which says "or, zero or more times", but there is no expression for the repeat to apply to. You need to regex-escape all your symbols, like `text = re.sub("|".join([re.escape(symbol) for symbol in symbols]), "", text)` – Chris Doyle Mar 31 '20 at 15:55
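
For anyone landing here later, the suggestion in the comments above can be sketched roughly as follows. This is a minimal sketch, assuming Python 3 and the same `symbols` list as in the question; `strip_symbols` is a hypothetical helper name and the sample string is made up for illustration. It first shows where the unescaped pattern fails, then builds the pattern with `re.escape` so every symbol is matched literally.

```python
import re

# Same symbol list as in the question.
symbols = ['!', '@', '#', '$', '%', '^', '&', '*', '(', ')', '_', '+', '-', '=',
           '{', '}', '|', '[', ']', ':', '\\', '"', ';', "'", '<', '>', '?', ',',
           '.', '/']

# Joining the raw symbols produces an invalid pattern, because characters
# such as '*' are regex operators rather than literals.
raw_pattern = "|".join(symbols)
try:
    re.compile(raw_pattern)
except re.error as exc:
    # exc.pos is the index in the pattern where compilation failed.
    print("failed at", exc.pos, "on", repr(raw_pattern[exc.pos]))

# Escape each symbol so it is treated literally, then join with '|'.
escaped_pattern = "|".join(re.escape(symbol) for symbol in symbols)


def strip_symbols(text):
    """Hypothetical helper: remove every listed symbol from text."""
    return re.sub(escaped_pattern, "", text)


print(strip_symbols("Hello, World! (a*b) = [c]"))  # Hello World ab  c
```

In the question's program, this would mean replacing the `re.sub` line with `text = strip_symbols(text)`. Whitespace is left untouched, which is why two spaces remain where `=` was removed.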

0 Answers