
I know this was covered elsewhere but my use case is causing me difficulty.

What if a string contains the dreaded "\3" sequence, as in this one:

new_data = r'C:\temp\3_times.csv'

...then re "thinks" that "\3" is a group reference, so if you use that data as the replacement in a sub, you get this error:

newfiledata = re.sub(old_data,new_data,filedata)

error: invalid group reference
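To reproduce the failure in isolation, here is a minimal sketch. The values of `filedata` and `old_data` are made up for illustration (the post does not show them); only `new_data` comes from the question. The replacement template is parsed before any substitution happens: `\t` is accepted as a tab, and `\3` is then read as a reference to a third group that the pattern does not have.

```python
import re

filedata = 'path = OLD_PATH'       # hypothetical file contents
old_data = 'OLD_PATH'              # hypothetical pattern with no groups
new_data = r'C:\temp\3_times.csv'  # replacement from the question

raised = False
try:
    # "\3" in the replacement is treated as a backreference to group 3
    re.sub(old_data, new_data, filedata)
except re.error as exc:
    raised = True
    print('re.error:', exc)
```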

Is there any way to avoid this error without searching the string for that case and modifying it before passing it in, which would take a lot of extra code?

Note: For my use case, modifying the string with escape characters isn't an option, because I need to write the string out with the sub function later on. So this is not a duplicate of the question on how to escape special characters.

sparrow
  • 4
    In any regex, literal backslashes have to be escaped. As a regex it would be `r'C:\\temp\\3_times\.csv'` What you should do is to escape metachars of your regex literals. You can do a sub using `r'([.^#|*+?()\[\]{}\\-])'` replace with `\\$1` –  Jul 19 '17 at 20:33
  • @sln doesn't `r` make it a raw string, and make it unnecessary to escape your slashes? or is that not the case? – jacoblaw Jul 19 '17 at 20:34
  • @jacoblaw - it's what you're passing to the regex engine. All literal metacharacters have to be escaped. –  Jul 19 '17 at 20:39
  • Sorry, that should be sub using `r'([.^$|*+?()\[\]{}\\-])'` replace with `\\$1` –  Jul 19 '17 at 20:40
  • Looks like re.escape() uses `\W` which is unfortunate. –  Jul 19 '17 at 21:05
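The `\\$1` replacement suggested in the comments is written in another regex flavor's syntax; a rough Python equivalent might look like the sketch below. The helper name `escape_metachars` is hypothetical, and the character class is the corrected one from the comments. Note that this escapes a string for use as a search pattern; it does not address backslashes in the replacement argument.

```python
import re

# Regex metacharacters, using the corrected class from the comment above
METACHARS = re.compile(r'([.^$|*+?()\[\]{}\\-])')

def escape_metachars(text):
    """Prefix each regex metacharacter in text with a backslash."""
    # r'\\\1' is a literal backslash followed by the captured character
    return METACHARS.sub(r'\\\1', text)

pattern = escape_metachars(r'C:\temp\3_times.csv')
print(pattern)  # C:\\temp\\3_times\.csv
```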

1 Answer

You could simply use re.escape():

import re
new_data = re.escape(r'C:\temp\3_times.csv')  # raw string, so "\t" and "\3" stay literal

... which escapes special characters, see https://docs.python.org/2/library/re.html for more information.

Jan
  • `Return string with all non-alphanumerics backslashed` I wouldn't use this. It's equivalent to `\W`, escapes all metachars, but also escapes control, whitespace, and probably a large swath of Unicode. –  Jul 19 '17 at 21:03
  • This returns: 'C\\:\\\temp\\\x03_times\\.csv'...which serves the purpose of escaping the meta characters, however since it modifies the string it defeats the purpose of using it with the sub method to write the data. – sparrow Jul 20 '17 at 14:19
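One way to keep the replacement text unmodified, sketched here under assumed sample data, is to pass a function as the replacement. `re.sub` inserts whatever the function returns without processing backslash escapes, so `new_data` is written exactly as given. The `filedata` and `old_data` values are invented for illustration; `re.escape` is applied only to the search pattern, where escaping is harmless.

```python
import re

filedata = r'path = C:\old\1_file.csv'  # hypothetical file contents
old_data = r'C:\old\1_file.csv'         # hypothetical literal text to find
new_data = r'C:\temp\3_times.csv'       # replacement from the question

# A callable replacement is inserted verbatim: no group references are parsed
newfiledata = re.sub(re.escape(old_data), lambda match: new_data, filedata)
print(newfiledata)  # path = C:\temp\3_times.csv

# Alternative: double only the backslashes in the replacement template
doubled = new_data.replace('\\', '\\\\')
newfiledata_alt = re.sub(re.escape(old_data), doubled, filedata)
```

If both the search text and the replacement are plain literals, `filedata.replace(old_data, new_data)` avoids the regex engine altogether.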