
I have a piece of Python code that worked on 3.5.4 but is now broken on 3.6.8. It looks related to how Python handles regular expressions. I found some threads about it, but I don't understand how to fix my problem.

Do I need to change the string passed to the regex, r'^\s*#\s*include\s+"%s"' % str(moc_cpp[0])?

Stack trace:

error: bad escape \m at position 37:
  File "c:\python\python3.6.8\lib\site-packages\SCons\Builder.py", line 353:
    target, source = e(target, source, env)
  File "C:\SVN\3rdParty\devTool\site_scons\site_tools\qt5\__init__.py", line 373:
    cpp, cpp_contents, out_sources)
  File "C:\SVN\3rdParty\devTool\site_scons\site_tools\qt5\__init__.py", line 229:
    if cpp and re.search(inc_moc_cpp, cpp_contents, re.M):
  File "c:\python\python3.6.8\lib\re.py", line 182:
    return _compile(pattern, flags).search(string)
  File "c:\python\python3.6.8\lib\re.py", line 301:
    p = sre_compile.compile(pattern, flags)
  File "c:\python\python3.6.8\lib\sre_compile.py", line 562:
    p = sre_parse.parse(p, flags)
  File "c:\python\python3.6.8\lib\sre_parse.py", line 855:
    p = _parse_sub(source, pattern, flags & SRE_FLAG_VERBOSE, 0)
  File "c:\python\python3.6.8\lib\sre_parse.py", line 416:
    not nested and not items))
  File "c:\python\python3.6.8\lib\sre_parse.py", line 502:
    code = _escape(source, this, state)
  File "c:\python\python3.6.8\lib\sre_parse.py", line 401:
    raise source.error("bad escape %s" % escape, len(escape))

Code:

        inc_moc_cpp = _contents_regex(r'^\s*#\s*include\s+"%s"' % str(moc_cpp[0]))
        if cpp and re.search(inc_moc_cpp, cpp_contents, re.M): 

def _contents_regex(e):
    # get_contents() of scons nodes returns a binary buffer, so we convert the regexes also to binary here
    # this won't work for specific encodings like UTF-16, but most of the time we will be fine here.
    # note that the regexes used here are always pure ascii, so we don't have an issue here.
    if sys.version_info.major >= 3:
        e = e.encode('ascii')
    return e
peterphonic

1 Answer


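moc_cpp[0] probably contains a backslash sequence that is not a valid regexp escape (the \m in the error message). Python 3.6 turned unknown escapes made of a backslash and an ASCII letter into errors, which is why the code only broke after the upgrade. If you intend to match the name literally rather than as a regexp, you should escape it before substituting it into the regexp.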

inc_moc_cpp = _contents_regex(r'^\s*#\s*include\s+"%s"' % re.escape(str(moc_cpp[0])))
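To see the difference outside of SCons, here is a minimal sketch. It assumes the file name reported in the comments as the value of `moc_cpp[0]`; `moc_name`, `template`, `build_pattern` and `contents` are hypothetical names used only for this illustration, and `build_pattern` merely mimics what `_contents_regex` does on Python 3.

```python
import re

# Assumed value of str(moc_cpp[0]), taken from the comments on this answer.
moc_name = r"MSVC14.1\x86\debug\moc_ASREngineSpawner.cc"

# Same template as in the question.
template = r'^\s*#\s*include\s+"%s"'


def build_pattern(name, escape=True):
    """Hypothetical helper: build the bytes regex, escaping the name by default."""
    if escape:
        name = re.escape(name)
    # Mirrors _contents_regex on Python 3: the pattern is encoded to bytes.
    return (template % name).encode("ascii")


# Without escaping, "\m" is an unknown escape and compilation fails on 3.6+.
try:
    re.compile(build_pattern(moc_name, escape=False))
    unescaped_error = None
except re.error as exc:
    unescaped_error = str(exc)

# With escaping, the backslashes and dots are matched literally.
contents = b'#include "MSVC14.1\\x86\\debug\\moc_ASREngineSpawner.cc"\n'
match = re.search(build_pattern(moc_name), contents, re.M)

print(unescaped_error)
print(match is not None)
```

The escaping happens on the `str` before it is encoded, so `_contents_regex` itself does not need to change.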
Barmar
  • `moc_cpp[0]` == `MSVC14.1\x86\debug\moc_ASREngineSpawner.cc` So, if I understand your answer, I should escape \ to have \\ instead in my string? – peterphonic Jul 28 '20 at 20:16
  • Yes. That's what `re.escape()` will do. – Barmar Jul 28 '20 at 20:17
  • Also, if the string contained other characters that have special meaning in a regexp, like `*` or `+`, it might not get an error, but it won't match correctly. – Barmar Jul 28 '20 at 20:19
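The last comment can be illustrated with a short sketch. The file name below is hypothetical and was picked only because it contains `+` and `.`; nothing in the question says such a name actually occurs in the build.

```python
import re

# Hypothetical file name containing regexp metacharacters.
name = "gen+moc.cc"

# Pattern built without and with re.escape().
unescaped = re.compile('"%s"' % name)
escaped = re.compile('"%s"' % re.escape(name))

literal_text = '#include "gen+moc.cc"'
other_text = '#include "gennmocXcc"'

# Unescaped: "n+" means "one or more n" and "." means "any character",
# so the literal name is missed and an unrelated name is accepted.
print(unescaped.search(literal_text) is not None)
print(unescaped.search(other_text) is not None)

# Escaped: only the literal name matches.
print(escaped.search(literal_text) is not None)
print(escaped.search(other_text) is not None)
```

No exception is raised in the unescaped case, so this kind of mismatch is easy to overlook.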