I have a regular expression (see this question) used to match C function definitions in a text file. In particular, I'm working on some git diff output.

import re

# Read the whole diff into one string; the with-block closes the file.
with open(input_file) as f:
    diff_txt = f.read()

re_flags = re.VERBOSE | re.MULTILINE
pattern = re.compile(r"""
                      (^[^-+]) # Problematic line: Want to ensure we do not match lines with +/-
                      (?<=[\s:~])
                      (\w+)
                      \s*
                      \(([\w\s,<>\[\].=&':/*]*?)\)
                      \s*
                      (const)?
                      \s*
                      (?={)
                      """,
                      re_flags)

The input file is raw git diff output generated in the usual way:

git diff <commit-sha-1> <commit-sha-2> > tmp.diff 

The first line, (^[^-+]), in my regex string is problematic. Without this line the regex successfully matches all C/C++ functions in input_file, but with it, nothing is matched. I need this line because I want to exclude functions that were added or removed between the two repository revisions. Added and removed lines are identified as

+ [added line]
- [removed line]

I've read the docs and I can't find where my error is. Some help would be much appreciated.

UnchartedWaters

2 Answers

- and + are special characters in regular expressions. Try escaping them with backslashes: [^\-\+]

Erin
    I'm on my phone now so I can't give you a link to the documentation, but if you read it, it is clearly stated that special characters lose their special meaning when placed inside square brackets, `[...]`. However, if `^` is the first character in the square brackets, all characters not in the square brackets should be matched. – UnchartedWaters Aug 08 '17 at 19:36
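
The point in the comment above can be checked directly. This is a minimal sketch, assuming Python's re module as in the question; the names plain, escaped and rejected are made up for illustration. It compares the unescaped class with the escaped one on a few single characters.

```python
import re

plain = re.compile(r"[^-+]")      # class from the question
escaped = re.compile(r"[^\-\+]")  # class suggested in the answer

samples = ["+", "-", " ", "a", "\n"]

# Both classes should accept and reject exactly the same characters.
for ch in samples:
    assert bool(plain.fullmatch(ch)) == bool(escaped.fullmatch(ch))

# Characters the unescaped class refuses to match.
rejected = [ch for ch in samples if not plain.fullmatch(ch)]
print(rejected)  # ['+', '-']
```

Since the two classes behave the same, escaping does not change what the problematic line matches. Note that a negated class also matches a newline.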

See this question

Simply change the problematic line

(^[^-+])

to

^(?!\+|\-).*

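The negative lookahead (?!...) is zero-width: it only asserts that the line does not start with + or - and consumes nothing. The original (^[^-+]) consumed exactly one character, so the function name had to start at the second character of the line, which leaves no room for a return type. That is why we have to include the .* here: it consumes the text between the start of the line and the function name. Without it, nothing will match.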
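
As a rough check of the change above, here is a minimal sketch that runs both versions of the first line against a small diff. The diff text is invented for this example, and naming the function-name group is my own assumption, made so the two variants can be compared without counting group numbers. The rest of the pattern is copied from the question.

```python
import re

# Hypothetical diff fragment, invented for this example.
SAMPLE_DIFF = """\
@@ -1,9 +1,12 @@
 int unchanged(int a, int b) {
     return a + b;
 }
+int added(int x) {
+    return x;
+}
-void removed(void) {
-}
 void Foo::bar(const char *s) const {
 }
"""

# Tail of the pattern from the question. The (?P<name>...) group name is
# an addition for readability, not part of the original regex.
TAIL = r"(?<=[\s:~])(?P<name>\w+)\s*\(([\w\s,<>\[\].=&':/*]*?)\)\s*(const)?\s*(?={)"

# Original first line: consumes one character at the start of the line.
ORIGINAL = re.compile(r"(^[^-+])" + TAIL, re.MULTILINE)
# Replacement first line: zero-width check, then skip ahead within the line.
FIXED = re.compile(r"^(?!\+|\-).*" + TAIL, re.MULTILINE)

original_names = [m.group("name") for m in ORIGINAL.finditer(SAMPLE_DIFF)]
fixed_names = [m.group("name") for m in FIXED.finditer(SAMPLE_DIFF)]

print(original_names)  # []
print(fixed_names)     # ['unchanged', 'bar']
```

With the replacement, the whole match now starts at the beginning of the line, so read the function name from the group rather than from the full match.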

UnchartedWaters