
Given the following input:

$ cat liltester
      if ((ret = utMemAlloc(
                   pManeuverObj->util.hMemory,
                   1,
                   (usTxtLen + 1),
                   (void **)&pMnvr->Context.pDestinationString
                 )) < 0)

The following produces the expected output (it strips out everything outside the outer parens):

$ perl -0 -ne 'print $1 if /((?:\((?>[^()]|(?R))*\)))/g' liltester

I grabbed that from https://www.regular-expressions.info/recurse.html , by the way. However, I modified it to 1) capture the match and 2) put the "balanced" portion inside a non-capturing group. The idea is that I can then do this:

$ perl -0 -ne 'print $1 if /(utMemAlloc(?:\((?>[^()]|(?R))*\)))/g' liltester

so that a bare ( is still treated as my opening paren. (Obviously, trying to use utMemAlloc( as the opening delimiter and ) as the closing one is not going to work well.)

However, the output is a blank line. Expected output is:

utMemAlloc(
                   pManeuverObj->util.hMemory,
                   1,
                   (usTxtLen + 1),
                   (void **)&pMnvr->Context.pDestinationString
                 )

My end goal, for what it's worth, is to find instances of utMemAlloc that use pDestinationString in the parameter list.

The following produces the expected output, by the way, but I'd prefer to avoid it for several reasons (one of which is that $RE{balanced} seems to blow up perl for an entire shell instance whenever I use it wrong):

perl -MRegexp::Common -0 -ne 'print $1 if /(utMemAlloc$RE{balanced}{-parens=>'"'"'()'"'"'})/g' liltester

Optional Reading

The other reason I prefer to avoid Regexp::Common is that I often use perl in a mingw terminal provided by a git UI, basically to avoid having to push code through git to a Linux box. The actual code I ended up with (thanks to the current answer) is:

$ git grep -l 'pDestinationString' | 
xargs perl -0 -lne 'print for /(utMemAlloc\s*(\((?>[^()]|(?-1))*\)))/g' | 
perl -0 -ne 'print "$_\n\n\n" if /utMemAlloc[\s\S]*pDestinationString/'

The 2nd test for utMemAlloc was necessary because there are two capture groups in the first expression, and when I tried to make the inner one a non-capturing group, the whole expression stopped working again. This works, but it's damn ugly.

zzxyz
  • Did you consider the core [Text::Balanced](https://perldoc.perl.org/Text/Balanced.html)? It won't blow up anything. See for example [this post](https://stackoverflow.com/a/46121634/4653379) – zdim Mar 23 '18 at 00:17
  • @zdim - I'll check it out next time. I'm not sure learning this syntax is worthwhile. (also see my edit at the bottom of my question) – zzxyz Mar 23 '18 at 00:37
  • @zdim - sorry, by "this", I mean the syntax *I'm* trying to use, not what you're suggesting. – zzxyz Mar 23 '18 at 00:46
  • Aaand, I found the memory error. All hail perl. – zzxyz Mar 23 '18 at 01:50

1 Answer


With (?R) you recurse to the beginning of the whole pattern, which is apparently not what you want.
If you recurse to the group that starts at the paren character instead, you will get the desired result:

perl -0 -ne 'print $1 if /(utMemAlloc(\((?>[^()]|(?-1))*\)))/g' liltester


utMemAlloc(
               pManeuverObj->util.hMemory,
               1,
               (usTxtLen + 1),
               (void **)&pMnvr->Context.pDestinationString
             )
wolfrevokcats
  • This answers my question. Curious if you have any thoughts on the edit (bottom). Also wondering why this stops working if I change `utMemAlloc(` to `utMemAlloc(?:` (non-capturing) – zzxyz Mar 23 '18 at 00:40
  • Also, blessings be upon you `curl -sN https://www.kingjamesbibleonline.org/Psalms-23-4/ | perl -0 -ne 'print $1 if /(

    .*?<\/p>)/' | perl -lpe 's/\<.*?\>//g; s/\b\d?\K(the lord|thou\b|he\b)/Perl/gi; s/thy/Perl'"'"'s/gi; s/his/its/gi; s/(\d)/\n$1 /g'`

    – zzxyz Mar 23 '18 at 00:46
  • @zzxyz, thanks for the blessing, very funny. As to the update, you probably already know that `(?-1)` refers to the previously declared capturing group, and with `(?:` you make the subgroup non-capturing. – wolfrevokcats Mar 23 '18 at 11:24
  • @wolfrevokcats - I actually didn't know about `(?-1)`. I couldn't find any reference material that mentioned it. I finally started just playing with it and figured out that it refers to capture groups, which is why using a non-capturing group doesn't work. – zzxyz Mar 23 '18 at 17:06
  • Finding related docs is as easy as running `perldoc perlre` and searching for `?R`. You can even find it through the table of contents: `perldoc perltoc`. – wolfrevokcats Mar 23 '18 at 17:23
  • Yeah, I really need to remember that perldoc is usually better than google for finding...perl documentation :) – zzxyz Mar 23 '18 at 17:34