I am trying to use the following regex pattern with java.util.regex.Pattern:

\B@(@?\w+(?:::\w+)?)([ \t]*)(\( ( (?>[^()]+) | (?3) )* \))?

but I keep getting the error:

Unknown inline modifier near index 49
\B@(@?\w+(?:::\w+)?)([ \t]*)(\( ( (?>[^()]+) | (?3) )* \))?
                                                 ^

I have tried escaping the pattern with the \ character at the index it is complaining about, but it still fails. Hoping someone here can help me get this working.

This is the test string I am trying to use it against:

Value @if(blah == 1) 'assigned' @else 'reassigned' @endif from boom to blah

If I put the pattern into the website regex101, it works fine.

Vadim Kotov
Yamaha32088
  • Following NahuelFouilleul and SebastianProske's comments it seems likely that the pattern you're trying to use really means to use `(?3)`, by which it would reference the third capturing group. That's a valid syntax in some PCRE implementations, but isn't implemented in Java. Replacing it with `3` will make the regex compile, but it won't match what you expect. – Aaron Jan 10 '18 at 15:01

1 Answer

(?3) isn't valid in Java. It is parsed as an "inline modifier" as mentioned in the error message, which is a way to activate a flag for the remainder of the regex (or until an opposite (?-X) is encountered), for example (?i) to enable case-insensitive search. There's no flag named 3, hence the error.
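To make the difference concrete, here is a minimal sketch contrasting a real inline modifier with `(?3)`. The class name `InlineModifierDemo` and the helper `compiles` are made up for illustration; only `Pattern.compile`, `Pattern.matcher`, `Matcher.matches` and `PatternSyntaxException` come from `java.util.regex`.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Hypothetical demo class, not part of the question's code.
class InlineModifierDemo {

    // Returns true if java.util.regex accepts the pattern, false if it rejects it.
    static boolean compiles(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            // Print the parser's own description of the problem.
            System.out.println("Rejected: " + e.getDescription());
            return false;
        }
    }

    public static void main(String[] args) {
        // (?i) is a genuine inline modifier: it switches on case-insensitive matching.
        System.out.println(Pattern.compile("(?i)abc").matcher("ABC").matches());

        // (?3) is read as an inline modifier too, but no flag is named "3".
        System.out.println(compiles("(?3)"));

        // The pattern from the question fails for the same reason.
        System.out.println(compiles(
            "\\B@(@?\\w+(?:::\\w+)?)([ \\t]*)(\\( ( (?>[^()]+) | (?3) )* \\))?"));
    }
}
```

Escaping with `\` does not help here, because the problem is not an unescaped metacharacter but a construct the Java parser does not know.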

It is however valid in PCRE and in Perl (whose regex syntax PCRE emulates), where it does not merely refer to what the third capturing group matched: it re-runs that group's subpattern at the current position, which makes recursive patterns possible. This is how it is used in this regex, where the third group calls itself to match nested parentheses.

Rewriting the regex to be compatible with Java would require some non-trivial work, since java.util.regex has no recursion. It is worth considering whether a pure regex is still preferable to a bit of ordinary code that handles the nested parentheses itself.
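As one possible direction, here is a rough sketch that keeps the non-recursive head of the pattern as a regex and replaces the recursive part with a simple depth counter. The class `DirectiveScanner` and its methods are invented for this example, and it assumes parentheses inside the arguments are never quoted or escaped, so treat it as a starting point rather than a drop-in replacement.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the question's code.
class DirectiveScanner {

    // Non-recursive head of the original pattern: the @name and any trailing blanks.
    private static final Pattern HEAD =
        Pattern.compile("\\B@(@?\\w+(?:::\\w+)?)([ \\t]*)");

    // Returns the index just past the ')' balancing the '(' at index open, or -1 if unbalanced.
    static int endOfBalanced(String s, int open) {
        int depth = 0;
        for (int i = open; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '(') {
                depth++;
            } else if (c == ')' && --depth == 0) {
                return i + 1;
            }
        }
        return -1;
    }

    // Collects each directive, including its balanced argument list when one follows.
    static List<String> directives(String input) {
        List<String> found = new ArrayList<>();
        Matcher m = HEAD.matcher(input);
        int pos = 0;
        while (m.find(pos)) {
            int end = m.end();
            if (end < input.length() && input.charAt(end) == '(') {
                int close = endOfBalanced(input, end);
                if (close != -1) {
                    end = close; // extend the match over the argument list
                }
            }
            found.add(input.substring(m.start(), end));
            pos = end; // resume scanning after this directive
        }
        return found;
    }

    public static void main(String[] args) {
        String test = "Value @if(blah == 1) 'assigned' @else 'reassigned' @endif from boom to blah";
        for (String d : directives(test)) {
            System.out.println("[" + d + "]");
        }
    }
}
```

On the test string from the question this yields `@if(blah == 1)`, `@else ` and `@endif `. The trailing blank on the last two comes from the `[ \t]*` group, which the original pattern also includes in its match.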

Aaron
  • That was it! Thank you so much – Yamaha32088 Jan 10 '18 at 14:37
  • `(?3)` is valid in engines that support recursive regexes, which seems to be what the question intends, because the third capturing group matches balanced parentheses – Nahuel Fouilleul Jan 10 '18 at 14:44
  • `(?3)` in PCRE is a recursion to the third subpattern; this looks very much like a pattern to match nested parentheses. Java regex patterns do not support recursion. Regex101 uses PCRE by default, which is why it works over there. – Sebastian Proske Jan 10 '18 at 14:44
  • @NahuelFouilleul & SebastianProske Java's regex engine is PCRE-based but doesn't implement all features and can differ from Perl's reference implementation on some points. I don't think it implements recursive patterns, have you got any evidence it does? (I will test it, but if you have documentation that would be faster and better :) ) – Aaron Jan 10 '18 at 14:47
  • no, it's not supported in Java, but the pattern in the question suggests a recursive regex – Nahuel Fouilleul Jan 10 '18 at 14:53
  • @Aaron it is not supported, see e.g. https://stackoverflow.com/questions/8659764/how-can-i-recursively-match-a-pattern-using-regular-expressions - however the regex tester OP used is PCRE based and thus does support it. – Sebastian Proske Jan 10 '18 at 14:58
  • Thank you both, I'm currently writing a comment on OP's question to raise this concern in case he misses those comments. I guess I'll rewrite my answer to add what meaning is expected for the regex – Aaron Jan 10 '18 at 15:00