Regular expression quantifiers such as * and + are "greedy" by default - they take the longest possible match that still lets the rest of the pattern succeed, not the shortest.
In this case, your problem appears to be the .* token, which basically translates to "match anything at all". This will operate by immediately matching the entire remainder of the string, and then backtracking until the subsequent part of the regex can be satisfied. The result is that everything up to the last {% something %} tag is considered your final match.
The simplest solution to this is to just use .*?, which means "match anything, but don't be greedy about it". This will start by matching nothing, and then extend the match one character at a time until the rest of the pattern can be matched, likely giving you the result you wanted.
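To make the difference concrete, here is a minimal sketch in Python. The language of the original question isn't shown here, so Python is an assumption, and both the sample string and the if/endif tag names are made up for illustration. It runs the same pattern once with .* and once with .*? over a string containing two tag pairs.

```python
import re

# Hypothetical input: two separate template blocks in one string.
text = "{% if a %}A{% endif %} and {% if b %}B{% endif %}"

# Greedy: .* swallows the rest of the string, then backtracks
# only as far as the LAST {% endif %}.
greedy = re.findall(r"\{% if \w+ %\}.*\{% endif %\}", text)

# Lazy: .*? starts empty and grows one character at a time,
# so it stops at the FIRST {% endif %} it can reach.
lazy = re.findall(r"\{% if \w+ %\}.*?\{% endif %\}", text)

print(len(greedy))  # one match spanning both blocks
print(len(lazy))    # one match per block
```

Note that in Python . does not match newlines unless re.DOTALL is passed, so a multi-line template would need that flag with either form.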
However, as noted in comments, a tokenising parser might be more appropriate for this kind of task: track through the string, dividing it up into a sequence of tag, not-tag, tag, not-tag, then match up the tags afterwards. This will allow you more flexibility in your syntax, and less head-scratching with complexities like nested tags, or detecting incorrectly formatted input.
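As a rough illustration of that tokenising approach, the sketch below (Python again, as an assumption) splits a string into tag and not-tag pieces and then pairs the tags up with a stack. The names tokenize and check_nesting are hypothetical, and it assumes a simplified syntax in which every {% name %} tag is closed by a matching {% endname %}; a real template language will have more cases than this.

```python
import re

# Assumed tag syntax: {% name ... %}; anything else is plain text.
TAG = re.compile(r"\{%\s*(\w+).*?%\}", re.DOTALL)

def tokenize(source):
    """Return ("text", str) and ("tag", name) tokens in source order."""
    tokens = []
    pos = 0
    for m in TAG.finditer(source):
        if m.start() > pos:
            # Plain text sitting between the previous tag and this one.
            tokens.append(("text", source[pos:m.start()]))
        tokens.append(("tag", m.group(1)))
        pos = m.end()
    if pos < len(source):
        tokens.append(("text", source[pos:]))
    return tokens

def check_nesting(tokens):
    """Match opening tags to end tags with a stack; raise on a mismatch."""
    stack = []
    for kind, value in tokens:
        if kind != "tag":
            continue
        if value.startswith("end"):
            # Assumption: "endfoo" closes the most recent open "foo".
            if not stack or stack[-1] != value[3:]:
                raise ValueError(f"unexpected {value}")
            stack.pop()
        else:
            stack.append(value)
    if stack:
        raise ValueError(f"unclosed {stack[-1]}")

tokens = tokenize("a{% if x %}b{% for y %}c{% endfor %}{% endif %}d")
check_nesting(tokens)  # passes silently: the tags are properly nested
```

Because the matching happens after tokenising, nested tags and malformed input are handled by the stack logic rather than by an increasingly complicated regex.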