Is there any regex engine that would let me match multiple heredoc strings in a single expression? For example, as one would write in Ruby:
f <<FOO, 10, <<BAR, 20
some text
FOO
some more text
BAR
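To make the ordering explicit, here is a minimal runnable sketch of the call above. The method f is hypothetical and simply returns its arguments; the point is only that the body of FOO is read before the body of BAR, in the same order in which the tags appear on the first line.

```ruby
# Hypothetical stand-in for f: it just hands back whatever it receives.
def f(*args)
  args
end

# Two heredocs opened on one line; their bodies follow in the same order.
result = f <<FOO, 10, <<BAR, 20
some text
FOO
some more text
BAR

p result  # prints ["some text\n", 10, "some more text\n", 20]
```

The tags are opened as FOO, BAR and closed as FOO, BAR, which is a first-in, first-out order rather than the nested order a stack would give.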
I thought of using backreferences and recursion in Perl's flavor, but couldn't get the cross-serial dependencies to work (i.e., I couldn't reverse the order of the captured backreferences, since FOO should match before BAR). I also thought of balancing groups in .NET, where I can reverse the stack by using lookaheads (I know, this is a terrible hack), like this:
(?:(?<x>foo|bar|baz)|\s)+(?(x)|(?!))\s*(?(x)(?=(.*?)(?<-x>(?<y>\k<x>)))){3}(?(x)(?!))(?:(?(y)(?<-y>\k<y>))|\s)+(?(x)(?!))(?(y)(?!))
That matches foo bar baz foo bar baz, but I have to add a manual counter (the {3}), since the lookahead won't repeat under +, presumably because it doesn't consume any input. So this won't work for arbitrary cases (but it was close!). I could, of course, replace the counter with {1000} or some other big number, and that would answer my question, but I wonder if there are other ways.
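To be precise about what I am after, here is a non-regex sketch of the simplified language from the test case: a sequence of words followed by the same words in the same order. The helper name cross_serial? and the default word list are made up for illustration, and this is only my reading of what the pattern is aiming for, not a translation of the regex itself.

```ruby
# Hypothetical helper: true when the second half of the tokens repeats the
# first half in the same order (FIFO), e.g. "foo bar baz foo bar baz".
def cross_serial?(input, words = %w[foo bar baz])
  tokens = input.split
  return false if tokens.empty? || tokens.length.odd?
  return false unless tokens.all? { |t| words.include?(t) }

  half = tokens.length / 2
  queue = tokens.first(half)  # tags in the order they were opened
  # Each closing token must equal the oldest tag still waiting in the queue.
  tokens.drop(half).all? { |t| queue.shift == t }
end

puts cross_serial?("foo bar baz foo bar baz")
puts cross_serial?("foo bar baz baz bar foo")
```

The first call accepts and the second rejects: the mirrored order is what a plain stack would match, which is why the captures need to be reversed in the pattern above.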
Acknowledgment: I do understand that it is not a good idea to match this kind of construct with regexes. I am doing research on this, and I want to find out whether it is possible. If it is, please do not use it in production code.