Based on the documentation for Raku's lookaround assertions, I read the regex / <?[abc]> <alpha> / as saying "starting from the left, match but do not consume one character that is a, b, or c and, once you have found a match, match and consume one alphabetic character."

Thus, this output makes sense:

'abc' ~~ / <?[abc]> <alpha> /     # OUTPUT: «「a」␤ alpha => 「a」»

Even though that regex has two one-character terms, one of them does not consume anything, so the overall match is only one character long.

But the next expression confuses me:

'abc' ~~ / <?[abc\s]> <alpha> /     # OUTPUT: «「ab」␤ alpha => 「b」»

Now the overall match is two characters long, and one of those characters isn't captured by <alpha>. So is the lookaround consuming something after all? Or am I misunderstanding something else about how the lookaround works?

Wiktor Stribiżew
codesections
  • At first glance, that looks like a compiler bug to me. – Jonathan Worthington Aug 31 '21 at 19:53
  • What does it mean that your first example with a *negative lookaround* gives `Nil` return, i.e. `'abc' ~~ / <![abc]> /; #OUTPUT: Nil`, however, your second example with a _negative lookaround_ gives the same result as a _positive lookaround_: `'abc' ~~ / <![abc\s]> /; # OUTPUT: «「ab」␤ alpha => 「b」»` ? – jubilatious1 Sep 26 '21 at 02:13

1 Answer

<?[ ]> and <![ ]> do not seem to support some backslashed character classes. \n, \s, \d and \w all show similar results.

When \n, \s, \d or \w is added, <?[abc\s]> behaves the same as <[abc\s]>: it consumes the character it matches instead of only looking ahead.

\t, \h, \v, \c[NAME] and \x61 seem to work as normal.
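A possible workaround, offered only as a sketch and not verified against every Rakudo release, is to write the lookahead explicitly with `<?before ...>`, which the Raku regex documentation describes as a zero-width assertion. The variable names below are illustrative:

```raku
# Shorthand lookahead without a backslashed class (the case reported as working).
my $plain  = 'abc' ~~ / <?[abc]> <alpha> /;

# Explicit lookahead: the character class is wrapped in <?before ...>,
# so the assertion should stay zero-width even with \s in the class.
my $before = 'abc' ~~ / <?before <[abc\s]>> <alpha> /;

# Compare the overall match with the <alpha> capture in each case.
say $plain.Str,  ' / ', $plain<alpha>.Str;
say $before.Str, ' / ', $before<alpha>.Str;
```

If the explicit form behaves as documented, both overall matches are one character long and equal to their `<alpha>` capture.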

Markus Jarderot
  • Do you mean to say, "`<?[abc]>` behaves the same whether or not `\n`, `\s`, `\d` or `\w` are added"? – jubilatious1 Jan 05 '22 at 19:55
  • @jubilatious1 No. Without `\n`, `\s`, `\d` or `\w`, it works as it is supposed to. When you add `\n`, `\s`, `\d` or `\w`, it turns into `<[...]>`. – Markus Jarderot Jan 06 '22 at 16:50