How can I use capturing groups inside lookbehind assertions?

I tried to use the same formula as in this answer. But that does not seem to work with lookbehinds.

Conceptually, this is what I was trying to do.

say "133" ~~ m/ <?after $0+> (\d) $ /

I know this can be easily achieved without lookbehinds, but ignore that just for now :)

For this I tried with these options:

Use :var syntax:

say "133" ~~ m/ <?after $look-behind+> (\d):my $look-behind; $ /;
# Variable '$look-behind' is not declared

Use code block syntax defining the variable outside:

my $look-behind;
say "133" ~~ m/ <?after $look-behind+> (\d) {$look-behind=$0} $ /;
# False

It seems the problem is that the lookbehind is evaluated before the code block (or the :my $var declaration) runs, so the variable is still empty when the lookbehind is checked.

Is there a way to use capturing groups inside lookbehinds?

Julio
  • A, er, curiosity: `say . given foo ~~ m/ $ /` will work for any string in `foo` that ends with the pattern `bar`, which is fair enough... and displays a capture that's the string in `foo` *in its entirety* (not just `bar`) and entirely *flipped* (backwards)! – raiph Dec 26 '20 at 00:15
  • "I know this can be easily achieved without lookbehinds, but ignore that just for now :)" I've ignored that for a day, but now's a new now. :) So, what is it that you're *really* trying to do? – raiph Dec 26 '20 at 00:17
  • Regexes in Raku are code. You wouldn't expect a variable to have data in it before you set it, so why would you expect the same from a Regex? – Brad Gilbert Dec 26 '20 at 17:59
  • Hi @raiph I tried to solve Day 15 of Advent of Code using regexes (and inline code in them). I finally reversed the input numbers and used lookaheads instead of lookbehinds. But I figured there would be a way to reference capturing groups in lookbehinds using Raku, so I asked... :) – Julio Jan 16 '21 at 17:29
  • @BradGilbert Well, I was not expecting anything xD, that was something I tried just in case... – Julio Jan 16 '21 at 17:30
  • @Julio Thanks. I figured out what you were asking after I posted my comments. In case you haven't yet figured this out, using jnthn's answer to your previous Q about *lookahead* (`/ (a) :my $lookahead; /`), a simple translation of that to a *lookbehind* version solving this Q would be `/ (a) :my $lookbehind; { $lookbehind = $/ } $ /;`. (Which is more or less the same as .@WiktorStribiżew's answer.) – raiph Jan 16 '21 at 19:21

1 Answer

When you reference a captured value before it is actually captured, it is not initialized, hence you can't get a match. You need to define the capturing group before actually using the backreference to the captured value.

Next, you need to add a code block and assign the captured value to a variable that the rest of the pattern can use; otherwise the value is not visible to the lookbehind pattern. See this Capturing Raku reference:

This code block publishes the capture inside the regex, so that it can be assigned to other variables or used for subsequent matches

You can use something like

say "133" ~~ m/ (\d) {} :my $c=$0; <?after $c ** 2> $ /;

Here, (\d) matches and captures a digit, the empty code block {} publishes that capture so that $0 is set, and :my $c=$0; assigns it to the $c variable. The <?after $c ** 2> lookbehind then checks that two consecutive copies of $c end at the current location. Because the lookbehind is placed after the capture, the digit just captured counts as one of the two. Finally, the $ anchor checks that the current position is the end of the string.

See this online Raku demo.
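
As a minimal sketch, the same pattern can be wrapped in a small helper and run against a matching and a non-matching input. The sub name ends-in-doubled-digit is hypothetical (it is not part of the question or of Raku itself); the regex is the one shown above, only with extra whitespace around the assignment.

```raku
# Hypothetical helper: True when the string ends with the same digit twice in a row.
sub ends-in-doubled-digit(Str $s --> Bool) {
    # (\d)             capture a digit
    # {}               empty block, so $/ (and therefore $0) is updated
    # :my $c = $0;     copy the capture into a variable scoped to the regex
    # <?after $c ** 2> look back from here: two copies of $c must end at this position
    # $                the current position must be the end of the string
    so $s ~~ m/ (\d) {} :my $c = $0; <?after $c ** 2> $ /;
}

say ends-in-doubled-digit("133");   # True
say ends-in-doubled-digit("123");   # False
```

"133" passes because the captured 3 plus the 3 before it form the two copies the lookbehind asks for; "123" fails because the two characters ending at the last position are 23.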

Wiktor Stribiżew
  • Superb! Nice trick to postpone the lookbehind and read again the captured group on it! – Julio Jan 16 '21 at 17:30
  • It would be worth pointing out that the reason `{}` is needed is so that `$/` gets updated. ( `$0` is really a shortcut for `$/[0]`) – Brad Gilbert Jan 17 '21 at 18:08