I'm using a PHP template engine I wrote some time ago. It relies on regexes to create a cached PHP file. Some examples of the syntax:

{$foo} - regular variable
{$foo.bar} - variable foo that uses the array key 'bar'
{$foo|uppercase} - modifier 'uppercase' that takes 'foo' and applies some method to it

{iteration:users}
    Hi there {$users.name}
{/iteration: users}

The list goes on... There are quite a number of nasty regexes involved to parse all this. Note that an iteration can be nested inside another iteration, and so on.

Recently I've been seeing template engines like Twig and Smarty 3 that use a template lexer. I have a few questions about this:

- In general, isn't the lexer way slower than using a few regexes to create a cached PHP template?
- Are there good resources on how to write your own lexer to interpret some sort of (template) language? (I couldn't find anything I understand on Google.)
- Should I keep using regexes, or is a lexer something worth exploring?

tshepang
Bauffman
  • I know this isn't on topic, but what's the point of your templating engine if it employs logic? You have iterations, you have variables, you have arrays, you have modifiers that make variables uppercase... why add another layer on top of PHP? Even the syntax looks similar to PHP's. – N.B. Aug 18 '11 at 12:07
  • PHP is the best template engine for PHP. In PHP you have iterations, variables, arrays, functions, conditionals, third-party libraries, connections to the DB, etc. The syntax matches PHP's exactly :) Speaking seriously, all those template engines are good, but a properly implemented MVC is much more maintainable and scalable. – J0HN Aug 18 '11 at 18:23
  • At the company where I work, templates are written by designers. At first we tried using PHP itself, but trust me, that's mostly not an option. Most designers can write HTML/CSS and some robust template language. – Bauffman Aug 18 '11 at 19:59

1 Answer

I suggest writing a parsing expression grammar (PEG); see this answer for a PEG library in PHP.

PEGs are very much like regular expressions: they are greedy by nature and never ambiguous, which makes them a great fit for a domain-specific language (DSL).
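To make that a bit more concrete, here is a minimal hand-rolled sketch in the PEG style, not the output of any particular PEG library. It only covers the `{$foo.bar|uppercase}` syntax from the question. The function names, the node layout, and the simplified `Name` rule (letters, digits, underscore) are my own assumptions, and iterations are left out to keep it short. Each rule returns `[node, newPosition]` on success or `null` on failure.

```php
<?php
// Name <- [A-Za-z0-9_]+   (greedy: consumes as many characters as it can)
function parseName(string $s, int $i): ?array
{
    $start = $i;
    while ($i < strlen($s) && (ctype_alnum($s[$i]) || $s[$i] === '_')) {
        $i++;
    }
    return $i > $start ? [substr($s, $start, $i - $start), $i] : null;
}

// Suffix <- sep Name      e.g. '.bar' or '|uppercase'
function parseSuffix(string $s, int $i, string $sep): ?array
{
    if (($s[$i] ?? '') !== $sep) {
        return null;
    }
    return parseName($s, $i + 1);
}

// Variable <- '{$' Name ('.' Name)* ('|' Name)* '}'
function parseVariable(string $s, int $i): ?array
{
    if (substr($s, $i, 2) !== '{$') {
        return null;
    }
    $r = parseName($s, $i + 2);
    if ($r === null) {
        return null;
    }
    [$name, $i] = $r;
    $node = ['var' => $name, 'keys' => [], 'modifiers' => []];
    while (($r = parseSuffix($s, $i, '.')) !== null) {
        $node['keys'][] = $r[0];
        $i = $r[1];
    }
    while (($r = parseSuffix($s, $i, '|')) !== null) {
        $node['modifiers'][] = $r[0];
        $i = $r[1];
    }
    if (($s[$i] ?? '') !== '}') {
        return null;
    }
    return [$node, $i + 1];
}

// Template <- (Variable / Text)*
// Ordered choice: Variable is always tried first, Text is the fallback,
// so there is never more than one way to read the input.
function parseTemplate(string $s): array
{
    $nodes = [];
    $i = 0;
    while ($i < strlen($s)) {
        $r = parseVariable($s, $i);
        if ($r !== null) {
            $nodes[] = $r[0];
            $i = $r[1];
            continue;
        }
        // Text: take one character and merge it into the previous text node.
        $last = count($nodes) - 1;
        if ($last >= 0 && is_string($nodes[$last])) {
            $nodes[$last] .= $s[$i];
        } else {
            $nodes[] = $s[$i];
        }
        $i++;
    }
    return $nodes;
}
```

For example, `parseTemplate('Hi {$foo.bar|uppercase}!')` gives a text node `'Hi '`, a variable node with key `bar` and modifier `uppercase`, and a text node `'!'`. Anything that does not parse as a variable simply falls through to text.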

In general isn't the lexer way slower than using a few regexes to create a cached php template?

No: the speed of regular expressions depends on the implementation of the regular expression engine. In general, every time you use a regular expression, the pattern itself has to be parsed first, and the resulting model is then run by a general-purpose matcher that must work for every possible regular expression.

Given a lexer, you fine-tune the matcher: you get a specific matcher, which only works for your predefined grammar. One gain is at startup: there is no regular expression to compile. Another gain is its lower complexity: because the matcher is specific to your grammar, it tends to run faster.
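As a rough illustration of such a specific matcher, here is a small hand-written lexer for the syntax in the question. This is only a sketch under my own assumptions: the token names (`TEXT`, `VAR`, `ITER_OPEN`, `ITER_CLOSE`) are made up, it is not how Twig or Smarty do it, and it only knows the tags shown in the question. Note that it uses plain string functions and no regular expressions at all.

```php
<?php
// Split a template into [type, value] tokens (token names are hypothetical).
function lex(string $src): array
{
    $tokens = [];
    $len = strlen($src);
    $pos = 0;
    while ($pos < $len) {
        $open = strpos($src, '{', $pos);
        if ($open === false) {
            // No more tags: the rest of the input is plain text.
            $tokens[] = ['TEXT', substr($src, $pos)];
            break;
        }
        if ($open > $pos) {
            // Plain text in front of the tag.
            $tokens[] = ['TEXT', substr($src, $pos, $open - $pos)];
        }
        $close = strpos($src, '}', $open);
        if ($close === false) {
            throw new RuntimeException("Unclosed tag at offset $open");
        }
        // Everything between '{' and '}' decides the token type.
        $body = substr($src, $open + 1, $close - $open - 1);
        if ($body !== '' && $body[0] === '$') {
            $tokens[] = ['VAR', substr($body, 1)];
        } elseif (strncmp($body, 'iteration:', 10) === 0) {
            $tokens[] = ['ITER_OPEN', trim(substr($body, 10))];
        } elseif (strncmp($body, '/iteration:', 11) === 0) {
            $tokens[] = ['ITER_CLOSE', trim(substr($body, 11))];
        } else {
            throw new RuntimeException('Unknown tag {' . $body . '} at offset ' . $open);
        }
        $pos = $close + 1;
    }
    return $tokens;
}
```

Calling `lex('Hi {$users.name}!')` yields a `TEXT` token `'Hi '`, a `VAR` token `'users.name'`, and a `TEXT` token `'!'`. A later stage can walk that token list, keep a stack for nested iterations, and emit the cached PHP file.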

Are there good resources on how to write your own lexer to interpret some sort of (template) language (I couldn't find anything I understand on google)?

Lexers are quite complex. To write your own, you will need to know about state machines, regular grammars, context-free and non-context-free grammars, etc.

It requires some fundamental computer science knowledge before it's easy to grasp though.

Should I keep using regexes or is a lexer something worth exploring?

Worth noting are the error-reporting capabilities of well-engineered lexers, e.g. an error message such as "expected ;, but found ), on line 64:38".
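A sketch of how that kind of message can be produced for the iteration tags in the question: because a hand-written scanner always knows its current offset, turning that offset into a line and column is cheap. The function names `position` and `checkIterations` are hypothetical, and the check is deliberately limited to matching `{iteration:x}` with `{/iteration:x}`.

```php
<?php
// Convert a byte offset into a 1-based "line:column" string.
function position(string $src, int $offset): string
{
    $before = substr($src, 0, $offset);
    $line = substr_count($before, "\n") + 1;
    $lastNl = strrpos($before, "\n");
    $col = $lastNl === false ? $offset + 1 : $offset - $lastNl;
    return $line . ':' . $col;
}

// Check that every {iteration:x} is closed by a matching {/iteration:x}.
function checkIterations(string $src): void
{
    $stack = []; // names of the iterations that are currently open
    $pos = 0;
    while (($open = strpos($src, '{', $pos)) !== false) {
        $close = strpos($src, '}', $open);
        if ($close === false) {
            throw new RuntimeException(
                'expected }, but found end of input, on line ' . position($src, strlen($src))
            );
        }
        $body = substr($src, $open + 1, $close - $open - 1);
        $pos = $close + 1;
        if (strncmp($body, 'iteration:', 10) === 0) {
            $stack[] = trim(substr($body, 10));
        } elseif (strncmp($body, '/iteration:', 11) === 0) {
            $name = trim(substr($body, 11));
            $expected = array_pop($stack); // null when nothing is open
            if ($expected !== $name) {
                $want = $expected === null
                    ? 'no closing tag'
                    : '{/iteration:' . $expected . '}';
                throw new RuntimeException(
                    'expected ' . $want . ', but found {/iteration:' . $name
                    . '}, on line ' . position($src, $open)
                );
            }
        }
    }
    if ($stack !== []) {
        throw new RuntimeException(
            'expected {/iteration:' . end($stack) . '}, but found end of input'
        );
    }
}
```

With the input `"{iteration:users}\n  Hi\n{/iteration:groups}"`, this throws "expected {/iteration:users}, but found {/iteration:groups}, on line 3:1". Getting the same quality of message out of a pile of independent regexes is much harder, since each regex only sees its own match.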

Community
Pindatjuh