So I am building a parser for stylesheets. And in my CSS file, I will parse something like this:

body {
   background-image: &gradient(#1073b8, &saturate(#1073b8, -20), 40);
}

The result would be something like this:

body {
   background-image: url(/cache/gradient12345.png);
}

That is the output of my &gradient() function, which I catch and execute. But one of the arguments to &gradient() is the result of &saturate(), so the calls are nested and I need preg_replace() to handle them from the inside out. How do I do that? The parser should first catch and evaluate &saturate(), which returns a hex color value; that value is then already in the string when &gradient() is parsed. My current lookup code looks like this:

if (preg_match("/\&(gradient|saturate)\((.*?)\)/", $value, $m)){
   $arg = explode(", ", $m[2]); # splits the argument list; indexes start at 0
   if ($m[1] == "saturate"){
       # replace the matched call with the value returned by saturate()
       $value = str_replace($m[0], saturate($arg[0], $arg[1]), $value);
   }
}

I'll add more functions as I make them available. Alternatively, I could capture any function name in the regexp instead of listing them.
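
To make the problem concrete, here is a small sketch of what the lookup pattern above captures on the nested example. The wrapper name first_call_match() is made up for this illustration; only preg_match() and the pattern come from the code above.

```php
<?php
// Hypothetical helper: runs the lookup pattern from the question once
// and returns the raw match array so the captures can be inspected.
function first_call_match(string $value): array {
    preg_match('/&(gradient|saturate)\((.*?)\)/', $value, $m);
    return $m;
}

$m = first_call_match('background-image: &gradient(#1073b8, &saturate(#1073b8, -20), 40);');
// $m[1] is "gradient": the leftmost call wins, not the innermost one.
// $m[2] is "#1073b8, &saturate(#1073b8, -20": the lazy .*? stops at the
// first ")", which actually belongs to &saturate().
```

So the outer call is matched first and its argument list is cut off in the middle of the inner call, which is why the order of evaluation comes out wrong.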

Do I make myself clear? Any ideas? :)

Alan Moore
Sandman
  • That would require full parsing using tokens, more than the regex can handle. You can use recursive patterns if you put unambiguous delimiters on either side of the function to achieve the nesting, e.g. `«gradient(#1073b8, «saturate(#1073b8, -20)», 40)»` – Orbling Sep 18 '12 at 16:18
  • Uh oh, looks like the pumping lemma [http://en.wikipedia.org/wiki/Pumping_lemma] rears its head again! You cannot find matching parentheses in regular language notation. You are going to have to hack around this or use a grammar. – Bob Fincheimer Sep 18 '12 at 16:18
  • @BobFincheimer: Bear in mind PCRE supports recursive patterns `(?R)`, which go beyond *regular* language expressions, despite the name. – Orbling Sep 18 '12 at 16:20
  • http://php.net/manual/en/function.preg-replace-callback.php – desimusxvii Sep 18 '12 at 16:21
  • @Orbling: Yup, I tried to say regular language notation to be clear, but there are facilities in Perl regex for higher languages in the Chomsky hierarchy. You can also use PHP to bridge the gap. – Bob Fincheimer Sep 18 '12 at 16:22
  • @BobFincheimer: Aye, still fairly limited in scope - the suggested grammar would be beyond it. Just quickly shoving in the "PCRE is not actually regular" argument before the anti-regex gang arrive. ;-) – Orbling Sep 18 '12 at 16:26
  • @Orbling: haha, so true! – Bob Fincheimer Sep 18 '12 at 16:49
  • Just curious, why not use an existing CSS Preprocessor? I get that you're also generating images in this example... but writing an extension for Sass to generate images seems a lot easier than writing a brand new preprocessor from scratch. – cimmanon Sep 18 '12 at 17:18
  • Basically because I have thousands of CSS files out there that are parsed by my parser, which started its life long before sass (or similar preprocessors) existed. I would love to use a stock preprocessor in conjunction with my own parser, but I haven't found a way to do it, yet. – Sandman Sep 18 '12 at 18:48
  • @Orbling: How would those help? There would still be the matter of inner/outer, no? – Sandman Sep 18 '12 at 18:51
  • @Sandman: Have a read about nested matching with recursive regular expressions in PHP (ie. with PCRE). There are plenty of good questions/answers on here about it. eg. [(i)](http://stackoverflow.com/questions/8440911/recursive-php-regex) [(ii)](http://stackoverflow.com/a/1896668/438971) - or you can [read the manual](http://php.net/manual/en/regexp.reference.recursive.php). Parentheses are used elsewhere, I was introducing alternate characters that can be uniquely matched in pairs, enabling recursion - with the `(?R)` pattern in PCRE. A common use is bbcode tags, if you're searching. – Orbling Sep 18 '12 at 18:59
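
As a rough sketch of how the recursive-pattern and preg_replace_callback() suggestions in these comments could fit together: the pattern below uses the PCRE subroutine call `(?2)`, a close relative of `(?R)`, to match one balanced pair of ordinary parentheses, so no alternate delimiters are needed in this variant. The names expand_nested(), demo_saturate() and demo_gradient() are invented for the example, and the two helpers return placeholder values rather than doing real colour or image work.

```php
<?php
// Hypothetical stand-ins for the real helpers; they only make the sketch runnable.
function demo_saturate(string $hex, string $amount): string {
    return '#0f6aa8'; // placeholder colour, not a real saturation calculation
}
function demo_gradient(string $from, string $to, string $size): string {
    return sprintf('url(/cache/gradient-%s-%s-%s.png)', ltrim($from, '#'), ltrim($to, '#'), $size);
}

function expand_nested(string $value): string {
    // Group 2 matches one balanced pair of parentheses; (?2) recurses into
    // that same group, so nested pairs are consumed as a single unit.
    $pattern = '/&(\w+)(\((?:[^()]++|(?2))*\))/';
    $result = preg_replace_callback($pattern, function (array $m): string {
        // Strip the outer parentheses, then resolve any inner calls first.
        $inner = expand_nested(substr($m[2], 1, -1));
        $args = array_map('trim', explode(',', $inner));
        switch ($m[1]) {
            case 'saturate': return demo_saturate(...$args);
            case 'gradient': return demo_gradient(...$args);
            default: return $m[0]; // unknown function: leave the call untouched
        }
    }, $value);
    return $result ?? $value; // preg_replace_callback() returns null on a PCRE error
}

echo expand_nested('background-image: &gradient(#1073b8, &saturate(#1073b8, -20), 40);');
// prints: background-image: url(/cache/gradient-1073b8-0f6aa8-40.png);
```

The regex matches the outermost call, and the inner-to-outer order comes from the callback recursing into the arguments before it dispatches. Splitting on commas assumes no argument contains a comma of its own.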

1 Answer

You can do some interesting parsing using regex in combination with a recursive function. I have mocked up something that will work for your scenario, feel free to tweak it to your needs.

http://ideone.com/66tV0

If you need any explanation of what is going on, let me know. It basically builds a recursive structure through recursive calls to a function that uses a regex to find the pattern &functionname([args...]).
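
For readers who cannot open the link, here is a minimal sketch of one way to combine a regex with a recursive function. It is not the code behind the ideone link. The names resolve_calls() and $handlers are made up, and the two handlers return placeholder values instead of doing real colour or image work.

```php
<?php
// Hypothetical registry of stylesheet functions; the bodies are placeholders.
$handlers = [
    'saturate' => function (string $hex, string $amount): string {
        return '#0f6aa8'; // placeholder, no real colour math
    },
    'gradient' => function (string $from, string $to, string $size): string {
        return sprintf('url(/cache/gradient-%s-%s-%s.png)', ltrim($from, '#'), ltrim($to, '#'), $size);
    },
];

function resolve_calls(string $value, array $handlers): string {
    // [^()&]* forbids parentheses and "&" in the argument list, so only a
    // call that contains no further call (an innermost one) can match.
    $pattern = '/&(\w+)\(([^()&]*)\)/';
    $next = preg_replace_callback($pattern, function (array $m) use ($handlers): string {
        if (!isset($handlers[$m[1]])) {
            return $m[0]; // unknown function: leave the call untouched
        }
        $args = array_map('trim', explode(',', $m[2]));
        return $handlers[$m[1]](...$args);
    }, $value);
    // Recurse until a pass changes nothing (or PCRE reports an error).
    if ($next === null || $next === $value) {
        return $value;
    }
    return resolve_calls($next, $handlers);
}

echo resolve_calls('&gradient(#1073b8, &saturate(#1073b8, -20), 40)', $handlers);
// prints: url(/cache/gradient-1073b8-0f6aa8-40.png)
```

Each pass replaces the innermost calls with their results, so by the time &gradient() matches, its argument list already holds the colour returned for &saturate().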

Bob Fincheimer