I'm trying to use Python regular expressions to search through lines of VBA code for a certain string pattern, but only where the pattern is not commented out.

For example, let's say the string pattern of interest is "aaa bbb ccc" and the lines of code I have are:

'aaa bbb ccc
'   aaa bbb ccc
' xxx aaa bbb ccc
    'xxx aaa bbb ccc xxx 
xxx ' xxx aaa bbb ccc xxx 
aaa bbb ccc 'xxx 
xxx aaa bbb ccc

These lines are assigned to the Python variable "vbaLines". I want my code to identify only line 6 and line 7.

Here's my code:

import re
re.findall(r"(?i)(?<!\S)aaa bbb ccc(?!\S)", vbaLines)

The problem is that my code finds 6 occurrences of the pattern (every line except line 1). I want it to identify only line 6 and line 7, because those are the only two lines where "aaa bbb ccc" is not commented out.

Also, I am unfamiliar with VBA, so I do not know if I am missing additional ways in which code is commented out.
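For reference, one possible first pass is to cut each line at the first apostrophe that is not inside a double-quoted string literal, and only then run the search. The sketch below is a rough illustration, not a full VBA parser; the helper names `strip_vba_comment` and `find_uncommented` are made up for this example, and it assumes apostrophe comments only (no `Rem`, no line continuations).

```python
import re

def strip_vba_comment(line):
    """Return the line with any trailing apostrophe comment removed.

    An apostrophe inside a double-quoted string literal is kept.
    (Hypothetical helper; handles apostrophe comments only.)
    """
    in_string = False
    for i, ch in enumerate(line):
        if ch == '"':
            # VBA escapes a quote as "", which simply toggles twice here.
            in_string = not in_string
        elif ch == "'" and not in_string:
            return line[:i]
    return line

# Same pattern as in the question, written as a raw string.
PATTERN = re.compile(r"(?<!\S)aaa bbb ccc(?!\S)", re.IGNORECASE)

def find_uncommented(vba_text, pattern=PATTERN):
    """Return 1-based numbers of lines whose code part matches the pattern."""
    hits = []
    for number, line in enumerate(vba_text.splitlines(), start=1):
        if pattern.search(strip_vba_comment(line)):
            hits.append(number)
    return hits

vbaLines = "\n".join([
    "'aaa bbb ccc",
    "'   aaa bbb ccc",
    "' xxx aaa bbb ccc",
    "    'xxx aaa bbb ccc xxx ",
    "xxx ' xxx aaa bbb ccc xxx ",
    "aaa bbb ccc 'xxx ",
    "xxx aaa bbb ccc",
])

print(find_uncommented(vbaLines))  # [6, 7]
```

Scanning character by character, rather than splitting on the first apostrophe, keeps a line such as `MsgBox "it's here"` from being truncated in the middle of its string.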

  • You should do a first pass over the code to replace comments with empty strings. – thebjorn Mar 28 '18 at 15:32
  • I think you should also consider line 5, to handle lines like `DoEvents ' allow user interaction` – Gary's Student Mar 28 '18 at 15:38
  • You're missing line continuations, for one. Consider [this answer](https://stackoverflow.com/a/41616800/1188513); if you're trying to make a tool that supports all legal code, regular expressions won't cut it. – Mathieu Guindon Mar 28 '18 at 15:43
  • Also `Rem this is a comment using a deprecated but legal syntax` ;-) – Mathieu Guindon Mar 28 '18 at 15:54
  • I'm very curious about what you're *actually* trying to achieve here. Consider using an actual lexer and parser. Regex will only give you nightmares. Been there, done that. – Mathieu Guindon Mar 28 '18 at 15:57
  • @MathieuGuindon thank you. Ultimately, I am trying to make a tool that will output a simple high-level summary of what the VBA code is doing and identify poor programming practices along the way (so it's good to know Rem is a deprecated comment syntax). I posted this question because I am starting by identifying key statements and functions that are not commented out. Maybe you can point me to a tool that already does this or something similar? Or point me in a better direction. Also, what do you recommend for a VBA lexer/parser? Appreciate your time. – Chen Mar 28 '18 at 16:42
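
The comments above name three cases a plain regex search misses: trailing comments after code, `Rem` comments, and line continuations (a trailing ` _`). A rough sketch of a pre-processing pass that covers them might look like the following. The function names are hypothetical, it is not a substitute for a real lexer, and it assumes `Rem` only starts a comment at the beginning of a line or after a `:` statement separator.

```python
import re

# "Rem" starts a comment at the start of a line or right after a ":" separator.
REM = re.compile(r"(?:^|:)\s*Rem(?:\s|$)", re.IGNORECASE)
PATTERN = re.compile(r"(?<!\S)aaa bbb ccc(?!\S)", re.IGNORECASE)

def join_continuations(vba_text):
    """Merge physical lines ending in ' _' into single logical lines."""
    logical, buffer = [], ""
    for line in vba_text.splitlines():
        stripped = line.rstrip()
        if stripped.endswith(" _"):
            buffer += stripped[:-1]  # drop the underscore, keep the space
        else:
            logical.append(buffer + line)
            buffer = ""
    if buffer:
        logical.append(buffer)
    return logical

def strip_comment(line):
    """Cut the line at the first apostrophe or Rem comment outside a string."""
    in_string = False
    for i, ch in enumerate(line):
        if ch == '"':
            in_string = not in_string
        elif not in_string and (ch == "'" or REM.match(line, i)):
            return line[:i]
    return line

def uncommented_matches(vba_text, pattern=PATTERN):
    """Return the code part of each logical line that matches the pattern."""
    code_lines = (strip_comment(l) for l in join_continuations(vba_text))
    return [code for code in code_lines if pattern.search(code)]

sample = "\n".join([
    "Rem aaa bbb ccc",
    "x = 1 : Rem aaa bbb ccc",
    "' note _",
    "aaa bbb ccc",
    "Call Foo( _",
    "    aaa bbb ccc )",
])

print(uncommented_matches(sample))
```

Joining continuations before stripping comments matters because a comment that ends in ` _` continues onto the next physical line, so the fourth line of `sample` is still part of a comment. Line numbers are lost in the join; a tool that needs them would have to carry them along.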
