I am trying to write a regular expression that matches several multi-line blocks, but it keeps matching everything as one block. For instance, I want two matches in this code:

STDMETHOD(MyFunc)(
D2D1_SIZE_U size,
_In_opt_ CONST void *srcData,
) PURE;

STDMETHOD(MyFunc2)(
_In_opt_ CONST void *srcData2,
UINT32 pitch2, 
) PURE;

I followed this link:

How do I match any character across multiple lines in a regular expression?

and came up with this pattern:

\bSTDMETHOD\b((.|\n|\r)*)\bPURE\b

However, it does not work. The ((.|\n|\r)*) group matches everything up to the last "PURE". I want it to stop at the first "PURE" it finds. In other words, a correct pattern would give me two matches in the code above, but my expression only stops at the last "PURE" keyword and produces a single match.

Let me know if you see why it does not work.

gmmo

3 Answers

Use a lazy quantifier. It matches as few characters as possible, so the match ends at the first "PURE" instead of the last one:

\bSTDMETHOD\b((.|\n|\r)*?)\bPURE\b
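
To see the difference between the two quantifiers, here is a minimal sketch in Python (an assumption, since the question does not name a regex flavor). The count_matches helper and the constant names are hypothetical and exist only for illustration.

```python
import re

# Sample text from the question.
SAMPLE = """STDMETHOD(MyFunc)(
D2D1_SIZE_U size,
_In_opt_ CONST void *srcData,
) PURE;

STDMETHOD(MyFunc2)(
_In_opt_ CONST void *srcData2,
UINT32 pitch2,
) PURE;"""

# The question's greedy pattern and the lazy variant from this answer.
GREEDY = r"\bSTDMETHOD\b((.|\n|\r)*)\bPURE\b"
LAZY = r"\bSTDMETHOD\b((.|\n|\r)*?)\bPURE\b"


def count_matches(pattern, text):
    """Return how many non-overlapping matches the pattern finds in text."""
    return sum(1 for _ in re.finditer(pattern, text))


greedy_count = count_matches(GREEDY, SAMPLE)  # runs to the last PURE
lazy_count = count_matches(LAZY, SAMPLE)      # stops at each first PURE
print(greedy_count, lazy_count)
```

The greedy pattern should report one match spanning both declarations, while the lazy pattern should report one match per declaration.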

Tested on Regexr.com

Rodrigo López

Instead of (.|\n|\r)*, use .*? (dot with the non-greedy modifier ?) and add the s and g flags, like this:

/\bSTDMETHOD\b(.*?)\bPURE\b/sg

The s flag makes . match \r and \n as well, and the g flag makes the search return every match in the subject text rather than only the first.
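
For readers working in Python, the effect of the s flag can be sketched as follows. The mapping is an assumption about the closest equivalents: re.DOTALL stands in for s, and re.findall returns every match, much like g. The shortened TEXT sample is hypothetical.

```python
import re

# Shortened, hypothetical sample with two multi-line declarations.
TEXT = "STDMETHOD(A)(\nint x,\n) PURE;\n\nSTDMETHOD(B)(\nint y,\n) PURE;"
PATTERN = r"\bSTDMETHOD\b(.*?)\bPURE\b"

# Without DOTALL, "." stops at a newline, so no match can span lines.
without_s = re.findall(PATTERN, TEXT)

# With DOTALL (the "s" flag), "." also matches newlines.
with_s = re.findall(PATTERN, TEXT, flags=re.DOTALL)

print(len(without_s), len(with_s))
```

Because the pattern has one capture group, re.findall returns the captured text of each match rather than the whole match.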

elixenide

In Python, try:

txt='''\
STDMETHOD(MyFunc)(
D2D1_SIZE_U size,
_In_opt_ CONST void *srcData,
) PURE;

STDMETHOD(MyFunc2)(
_In_opt_ CONST void *srcData2,
UINT32 pitch2, 
) PURE;'''

import re

for i, m in enumerate(re.finditer(r'\bSTDMETHOD\b(.*?)\bPURE\b', txt, flags=re.S | re.M)):
    print('Match {}:\n{}\n==='.format(i, m.group(1)))

Prints:

Match 0:
(MyFunc)(
D2D1_SIZE_U size,
_In_opt_ CONST void *srcData,
) 
===
Match 1:
(MyFunc2)(
_In_opt_ CONST void *srcData2,
UINT32 pitch2, 
) 
===

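Note the regex \bSTDMETHOD\b(.*?)\bPURE\b with the flags re.S | re.M

The re.S flag makes . match any character, including newlines.

If you anchor your matches, you also want re.M so that ^ and $ match at the beginning and end of each line rather than only at the start and end of the whole string. The pattern above has no anchors, so re.M has no effect on it.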
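
As a rough illustration of the anchoring point, the sketch below uses a hypothetical anchored variant of the pattern (^STDMETHOD ... PURE;$) on a shortened sample. Neither the anchored pattern nor the sample comes from the question; they are assumptions made only to show what re.M changes.

```python
import re

# Shortened, hypothetical sample with two multi-line declarations.
TEXT = "STDMETHOD(A)(\nint x,\n) PURE;\n\nSTDMETHOD(B)(\nint y,\n) PURE;"

# Hypothetical anchored variant: start of line ... "PURE;" at end of line.
ANCHORED = r"^STDMETHOD\b(.*?)\bPURE;$"

# re.S alone: "$" matches only at the end of the string, so the lazy
# group is forced to stretch across both declarations.
only_s = re.findall(ANCHORED, TEXT, flags=re.S)

# re.S | re.M: "^" and "$" also match at line boundaries, so each
# declaration is matched separately.
both = re.findall(ANCHORED, TEXT, flags=re.S | re.M)

print(len(only_s), len(both))
```

With re.S alone the lazy quantifier does not help, because the trailing $ can only succeed at the very end of the text.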

dawg