I'm trying to remove backslashes and hyphens from text, using a regular expression to match the characters to remove.

import re

SPL_CHAR2 = r"[,(.;)@\#?\|!-+_*=~<>/&$]+"

def remove_special_chars(text: str) -> str:
    text = re.sub(SPL_CHAR2, ' ', text)
    return text

The issue is that these characters are not being removed.

user6235442

1 Answer

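Some of those characters have special meaning inside a regex character set: a dash between two characters defines a range (here !-+ is read as the range from ! to +, which does not include the dash itself), and a backslash escapes the character that follows it, so \| matches only a pipe. To make this work, rearrange the character set like so: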

SPL_CHAR2 = r"[-,(.;)@\#?\\|!+_*=~<>/&$]+"

I've moved the dash to the start (so that it's no longer defining a character range) and have doubled the backslash.
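
As a quick check, the two patterns can be compared side by side. This is only a sketch: the names BROKEN and FIXED and the sample string are made up for illustration.

```python
import re

# Pattern from the question: "!-+" is read as the range '!'..'+',
# and there is no literal backslash in the set.
BROKEN = r"[,(.;)@\#?\|!-+_*=~<>/&$]+"
# Rearranged pattern: dash first, backslash doubled.
FIXED = r"[-,(.;)@\#?\\|!+_*=~<>/&$]+"

sample = r"foo-bar\baz"  # hypothetical input with one dash and one backslash

broken_result = re.sub(BROKEN, " ", sample)
fixed_result = re.sub(FIXED, " ", sample)

print(broken_result)  # dash and backslash are left in place
print(fixed_result)   # both are replaced by a space
```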

Instead of moving the dash to the start, you can keep it where it is but add a backslash right before it.
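
A minimal sketch of that variant, assuming the backslash is still doubled as above (the name ESCAPED and the test strings are illustrative only):

```python
import re

# Dash kept in its original position but escaped with a backslash;
# the literal backslash is still written as a doubled backslash.
ESCAPED = r"[,(.;)@\#?\\|!\-+_*=~<>/&$]+"

def remove_special_chars(text: str) -> str:
    # Replace each run of special characters with a single space.
    return re.sub(ESCAPED, " ", text)

print(remove_special_chars(r"a-b\c"))
```

With the dash escaped, the set no longer contains the !-+ range, so characters that fell inside that range only by accident, such as double quotes, are no longer removed.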

The syntax is defined here: https://docs.python.org/2/library/re.html#regular-expression-syntax (search for []).

NPE