My question is fairly straightforward, even if the purpose it will serve is pretty complicated. I will use a simple example:
AzzAyyAxxxxByyBzzB
So normally I would want to get everything between A and B. However, because some of the content between the first A and the last B (one pair) contains additional AB pairs, I need to push back the end of the match. (Not sure if that last part made sense.)
So what I'm looking for is some RegEx that would allow me to have the following output:
Match 1
Group 1: AzzAyyAxxxxByyBzzB
Group 2: zzAyyAxxxxByyBzz
Then I would match it again to get:
Match 2
Group 1: AyyAxxxxByyB
Group 2: yyAxxxxByy
Then finally again to get:
Match 3
Group 1: AxxxxB
Group 2: xxxx
Obviously if I try (A(.*?)B) on the whole input I get:
Match x
Group 1: AzzAyyAxxxxB
Group 2: zzAyyAxxxx
Which is not what I'm looking for :)
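For reference, here is a minimal Java reproduction of that result. The class and method names (LazyMatchDemo, firstLazyMatch) are just placeholders I made up for this post, not anything from my real project:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LazyMatchDemo {
    // Returns {group 1, group 2} of the first match of (A(.*?)B), or null if nothing matches.
    // The lazy quantifier stops at the first B it can reach, so the match ends too early.
    static String[] firstLazyMatch(String input) {
        Matcher m = Pattern.compile("(A(.*?)B)").matcher(input);
        if (!m.find()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2) };
    }

    public static void main(String[] args) {
        String[] groups = firstLazyMatch("AzzAyyAxxxxByyBzzB");
        System.out.println("Group 1: " + groups[0]); // AzzAyyAxxxxB
        System.out.println("Group 2: " + groups[1]); // zzAyyAxxxx
    }
}
```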
I hope this makes sense. I understand if this can't be done in RegEx, but I thought I would ask some of you regex wizards before I give up on it and try something else. Thanks!
Additional Info:
The project I'm working on is written in Java.
One other problem is that I'm parsing a document which could contain something like this:
AzzAyyAxxxxByyBzzB
Here is some unrelated stuff
AzzAyyAxxxxByyBzzB
AzzzBxxArrrBAssssB
And the top AB pairs need to be separate from the bottom AB pairs.
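To make the output I'm after concrete, here is a rough non-regex sketch of the "something else" I would fall back on: a single pass that keeps a stack of open A positions. It assumes the delimiters really are the single characters A and B (in my real document they are not, so this is only an illustration), that an unmatched B or a dangling A can simply be ignored, and the names NestedPairs and balancedSpans are made up for this post:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class NestedPairs {
    // Returns every balanced A...B span in the input, ordered by where it starts,
    // so each outer pair comes before the pairs nested inside it.
    static List<String> balancedSpans(String input) {
        Deque<Integer> openStarts = new ArrayDeque<>(); // indices of A's not yet closed
        List<int[]> spans = new ArrayList<>();          // {start, endExclusive}
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c == 'A') {
                openStarts.push(i);
            } else if (c == 'B' && !openStarts.isEmpty()) {
                // A B closes the most recently opened A; a stray B is skipped.
                spans.add(new int[] { openStarts.pop(), i + 1 });
            }
        }
        spans.sort((x, y) -> Integer.compare(x[0], y[0]));
        List<String> result = new ArrayList<>();
        for (int[] s : spans) {
            result.add(input.substring(s[0], s[1]));
        }
        return result;
    }

    public static void main(String[] args) {
        String document = String.join("\n",
                "AzzAyyAxxxxByyBzzB",
                "Here is some unrelated stuff",
                "AzzAyyAxxxxByyBzzB",
                "AzzzBxxArrrBAssssB");
        for (String span : balancedSpans(document)) {
            // Group 1 is the span itself; group 2 drops the outer A and B.
            System.out.println(span + " -> " + span.substring(1, span.length() - 1));
        }
    }
}
```

With this, each top-level pair is reported on its own, followed by whatever is nested inside it, and the three side-by-side pairs on the last line come out as three separate spans. I would still prefer a regex if one exists.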