I am writing a simple templating language, but I have a problem with nested statements. For example, for foreach I use this regular expression:

preg_match('/\{foreach +\$(.*?)\}(.*?){\/foreach\}/sui', $this->content, $matches);

Everything works fine, but when I nest a foreach inside another foreach, I get an error, because the regular expression pairs the outer opening tag with the closing tag of the nested foreach.

{foreach $XY}

{foreach $YX} {/foreach}

{/foreach}
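
The mismatch can be reproduced with a short script. This is only a sketch: the `$content` string is a made-up sample standing in for `$this->content`, and the pattern is the one quoted above.

```php
<?php
// The pattern from the question, applied to a made-up nested template.
$content = '{foreach $XY} {foreach $YX} inner {/foreach} outer {/foreach}';

preg_match('/\{foreach +\$(.*?)\}(.*?){\/foreach\}/sui', $content, $matches);

// The lazy (.*?) stops at the first {/foreach} it can reach,
// so the outer block is cut short at the inner closing tag.
var_dump($matches[0]);
```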

How can I resolve this? Thank you!

Martin
  • Regular expressions are typically not good at handling nested constructs. – Explosion Pills Mar 14 '13 at 15:10
  • You should read this: http://stackoverflow.com/a/1732454/1225541 – alestanis Mar 14 '13 at 15:11
  • Regular expressions are not appropriate for parsing a language. You have just discovered the reason why. Also, I would discourage you from writing your own templating language, when many good solutions already exist. –  Mar 14 '13 at 15:11
  • OK, but I cannot stop now. What is the preferred way to do that? Parsing line by line? – Martin Mar 14 '13 at 15:16
  • @MartinPernica Parsing token by token is more appropriate, because you can have nested elements on a single line too. – Carlos Campderrós Mar 14 '13 at 15:19
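
The token-by-token approach suggested in the comments could look roughly like the sketch below. It is not taken from the thread: the function name `parseTemplate`, the node layout (`type`, `var`, `children`, `value`) and the error messages are all hypothetical, and only the `{foreach $name}` / `{/foreach}` tags from the question are handled.

```php
<?php
// Hypothetical sketch: split the template into tag and text tokens,
// then use a stack to attach each closed block to its parent.
function parseTemplate(string $src): array
{
    // Split on the two tags and keep the tags themselves as tokens.
    $tokens = preg_split(
        '~(\{foreach\s+\$\w+\}|\{/foreach\})~i',
        $src, -1, PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY
    );

    // Each stack entry is a node still waiting for its closing tag.
    $stack = [['type' => 'root', 'children' => []]];

    foreach ($tokens as $token) {
        if (preg_match('~^\{foreach\s+\$(\w+)\}$~i', $token, $m)) {
            // Opening tag: start a new node and descend into it.
            $stack[] = ['type' => 'foreach', 'var' => $m[1], 'children' => []];
        } elseif (strcasecmp($token, '{/foreach}') === 0) {
            if (count($stack) < 2) {
                throw new RuntimeException('Unexpected {/foreach}');
            }
            // Closing tag: finish the innermost node and attach it to its parent.
            $node = array_pop($stack);
            $stack[count($stack) - 1]['children'][] = $node;
        } else {
            // Plain text between tags.
            $stack[count($stack) - 1]['children'][] = ['type' => 'text', 'value' => $token];
        }
    }

    if (count($stack) !== 1) {
        throw new RuntimeException('Missing {/foreach}');
    }
    return $stack[0];
}

$tree = parseTemplate('{foreach $XY} a {foreach $YX} b {/foreach} c {/foreach}');
print_r($tree);
```

Because the nesting depth lives in the stack rather than in the pattern, the same loop would extend to other tags by adding token types, at the cost of more code than a single regular expression.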

1 Answer

As said in the comments, regexes are not ideal for parsing nested constructs (especially if you want to build a parse tree from the result).

Nevertheless, you can match it with a recursive pattern:

(?xi) {foreach \s++ \$\w++ } (?: [^{}]++ | (?R) )*+ {/foreach}

It could be used as:

preg_match_all(',{foreach \s++ \$\w++ } (?: [^{}]++ | (?R) )*+ {/foreach},xi', $str, $matches);

This will give you the outermost matches.
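
A small run shows what "outermost" means here. The pattern is the one above; the `$str` value is a made-up sample, so treat this as an illustration rather than a tested recipe for a real template.

```php
<?php
// Recursive pattern from the answer; (?R) re-enters the whole pattern
// whenever a nested {foreach ...} block starts.
$pattern = ',{foreach \s++ \$\w++ } (?: [^{}]++ | (?R) )*+ {/foreach},xi';

// Made-up sample template with one nested block.
$str = 'before {foreach $XY} a {foreach $YX} b {/foreach} c {/foreach} after';

preg_match_all($pattern, $str, $matches);

// $matches[0] holds one entry per outermost block; the nested block
// is contained inside that entry instead of being listed separately.
print_r($matches[0]);
```

To process the inner blocks as well, the same pattern would have to be applied again to the body of each match.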

Qtax
  • This will break if the templating language has any other constructs using braces, like `{if}`. Advanced regex features like recursion are cool, and you could technically even build a parser with them, but to do so would be to force non-regex paradigms into regex syntax. –  Mar 14 '13 at 15:38
  • @dan1111, regex aren't regular. This isn't formal language theory. – Qtax Mar 14 '13 at 15:40
  • I'm not talking about theory. My point is that if you try to do something very complicated with regexes, you are going beyond their strength. –  Mar 14 '13 at 15:44
  • @dan1111, define "going beyond their strength". But I disagree. This isn't complicated. ;) – Qtax Mar 14 '13 at 15:49