I'm having some trouble defining my question, sorry for the bad title.
I know regex is a bad way to parse HTML, but I don't have another option. I'll just tell you what I've got.
I have the following string:
<span id=pink>some short text</span>
more text that can be a few lines
<span id=pink>again a short text</span>
More text that's abiding the same logic
<span id=pink>Repeat</span>...(more of these)
And this repeats itself multiple times.
Now I want to extract the text between each closing </span> and the next <span id=pink>. For the above example, I'd like to return:
- more text that can be a few lines
- More text that's abiding the same logic
Now I've tried the following regex:
preg_match_all('/<span id=pink>.*?<\/span>(.*?)<span id=pink>.*?<\/span>/s', $data, $content);
This partially works. The problem is that after finding the first match, it doesn't treat the <span id=pink> that closed the previous group as the opener of the next group. With the above example it only finds the first group, and with more "rows" in the string it skips every other group.
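To make the problem concrete, here is a minimal reproduction. The sample string and the T1..T4 placeholders are made up for illustration (they are not my real data), and the pattern is the one above with the final </span> escaped.

```php
<?php
// Hypothetical sample data: five markers with T1..T4 between them, no newlines.
$data = '<span id=pink>a</span>T1'
      . '<span id=pink>b</span>T2'
      . '<span id=pink>c</span>T3'
      . '<span id=pink>d</span>T4'
      . '<span id=pink>e</span>';

// Same pattern as in the question, with both closing tags escaped.
$pattern = '/<span id=pink>.*?<\/span>(.*?)<span id=pink>.*?<\/span>/s';

preg_match_all($pattern, $data, $content);

// $content[1] holds the text captured by the first group in each match.
print_r($content[1]); // contains T1 and T3, but not T2 or T4
```

As far as I can tell, T2 and T4 are missing because each match consumes the whole <span id=pink>...</span> that follows the captured text, so the next search only starts after that span.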
EDIT:
- There are no newlines in the actual string; I added them here only to make the example easier to read.
- I know HTML parsing is better using a parser instead of regex, but sadly I need to solve this using regex.
How can I solve this? It feels like I'm missing some simple solution, is there a modifier perhaps that achieves this?
Thanks, Eric