3

I have a pattern with opening tags and closing tags
e.g. /*tag1_START*/ some content /*tag1_END*/ other text /*tag2_START*/ some content /*tag2_END*/

and I use the regex \/\*([a-zA-Z0-9]+)_START\*\/(.*?)\/\*\1_END\*

You can see it at regex101.

However, in one case the tags were interleaved by mistake:
e.g. /*tag3_START*/ some /*tag4_START*/ content /*tag3_END*/ other /*tag4_END*/ content

I can easily check the matches for overlap myself, but the regex does not return both tags, because after a match the search resumes from the end of that match.
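To make the behaviour concrete, here is a minimal sketch in Python (the language is an assumption, since the question does not name one; `PATTERN`, `text` and `found` are illustrative names). It runs the regex from above, verbatim, against the interleaved example and collects the tag names that come back.

```python
import re

# Pattern from the question, kept verbatim (including the missing final "/").
PATTERN = re.compile(r"\/\*([a-zA-Z0-9]+)_START\*\/(.*?)\/\*\1_END\*")

text = ("/*tag3_START*/ some /*tag4_START*/ content "
        "/*tag3_END*/ other /*tag4_END*/ content")

# finditer resumes scanning at the end of the previous match, so the
# tag4 opening marker, which lies inside the tag3 match, is never tried.
found = [m.group(1) for m in PATTERN.finditer(text)]
print(found)
```

Only tag3 is reported; tag4 is skipped because its opening marker was already consumed as part of the tag3 content.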

Can I use a regex to find overlapping matches, or do I need to write my own code?

Tomer W
  • Sounds like [recursion in Regex](https://stackoverflow.com/questions/26385984/recursive-pattern-in-regex). – Uwe Keim Feb 07 '18 at 11:41
  • Use lookarounds here [`\/\*([a-zA-Z0-9]+)_START\*\/(?=(.*?)\/\*\1_END\*)`](https://regex101.com/r/fGKYSD/3) – revo Feb 07 '18 at 11:46
  • @WiktorStribiżew I just need it to find them (I'll check for the actual overlap myself); all I need is the index and length of every match. – Tomer W Feb 07 '18 at 11:50
  • revo, great. Put it as an answer. – Tomer W Feb 07 '18 at 12:31
  • @WiktorStribiżew you are partially correct: the match doesn't include the entire expression. But I traverse the capture groups anyway, so I do have the positions I need. – Tomer W Feb 07 '18 at 14:23

2 Answers

2

Lookarounds assert rather than consume characters. However, capturing groups inside them still store the text they matched. Just put the overlapping part inside a positive lookahead:

\/\*([a-zA-Z0-9]+)_START\*\/(?=(.*?)\/\*\1_END\*)

[Live demo](https://regex101.com/r/fGKYSD/3)
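As a rough illustration of how this could be used, the sketch below applies the lookahead pattern in Python (an assumption, since the thread does not name a language) and reads the positions from the capture groups, as the asker describes in the comments. The helpers `tag_spans` and `interleaved` are hypothetical names introduced only for this example, and the overlap check is one possible approach, not something from the answer itself.

```python
import re

# The lookahead pattern from this answer: only the opening marker is
# consumed; content and closing marker are captured inside the lookahead.
PATTERN = re.compile(r"\/\*([a-zA-Z0-9]+)_START\*\/(?=(.*?)\/\*\1_END\*)")


def tag_spans(text):
    """Return (name, content_start, content_end) for every tag pair found."""
    return [(m.group(1), m.start(2), m.end(2)) for m in PATTERN.finditer(text)]


def interleaved(spans):
    """Return (first, second) name pairs whose ranges partially overlap."""
    ordered = sorted(spans, key=lambda s: s[1])
    pairs = []
    for i, (a, _a_start, a_end) in enumerate(ordered):
        for b, b_start, b_end in ordered[i + 1:]:
            # b opens before a closes, but closes after a has closed.
            if b_start < a_end < b_end:
                pairs.append((a, b))
    return pairs


text = ("/*tag3_START*/ some /*tag4_START*/ content "
        "/*tag3_END*/ other /*tag4_END*/ content")
spans = tag_spans(text)
print(spans)
print(interleaved(spans))
```

Because the full match covers only the opening marker, `m.start(2)` and `m.end(2)` are what give the extent of each tag's content; properly nested tags are not reported by `interleaved`, only partially overlapping ones.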

revo

0

(?=\*([a-zA-Z0-9]+)_START\*\/(.*?)\/\*(\1)_END\*)

You will have to use a lookahead so that nothing is consumed. See demo.

https://regex101.com/r/vsA3ZU/1
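A small Python sketch (the language and the names `PATTERN`, `text` and `results` are assumptions made for illustration) shows what this variant returns. Since the whole expression sits inside the lookahead, every match is empty and all useful data comes from the groups. Note that the pattern as posted starts with `\*` rather than `\/\*`, so each match position falls on the `*` of the opening marker.

```python
import re

# Pattern from this answer, verbatim: everything is inside the lookahead.
PATTERN = re.compile(r"(?=\*([a-zA-Z0-9]+)_START\*\/(.*?)\/\*(\1)_END\*)")

text = ("/*tag3_START*/ some /*tag4_START*/ content "
        "/*tag3_END*/ other /*tag4_END*/ content")

results = []
for m in PATTERN.finditer(text):
    # m.group(0) is always the empty string; positions and content
    # have to be read from the match start and the capture groups.
    results.append((m.group(1), m.start(), m.group(0)))
    print(m.group(1), m.start(), m.span(2))
```

Both tags are found even though they are interleaved, because a zero-width match never moves the scan position past the next opening marker.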

vks