
I am stuck parsing a string that contains key-value pairs with operators between them (like the one below) in PHP. I am planning to use regex to parse it (though I am not good at it).

key: "value" & key2 : "value2" | title: "something \"here\"..." &( key: "this value in paranthesis" | key: "another value")

Basically, the units in the above block are as follows:

  1. key - Anything that qualifies as a JavaScript variable name.
  2. value - Any string, long or short, enclosed in double quotes ("").
  3. pair - (key:value) A key and a value joined by a colon, just like in JavaScript objects.
  4. operator - (& or |) Simply indicating 'AND' or 'OR'.

Multiple blocks can be nested within parentheses ( and ).
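The units listed above can be split into tokens with a single alternation. The following is only a sketch under some assumptions: keys are treated as ASCII identifiers (a simplification of JavaScript variable names), values may contain backslash-escaped characters, and the function name tokenize and the token labels are made up for illustration. It only tokenizes; it does not check nesting or operator placement.

```php
<?php
// Hypothetical tokenizer for the query format described above.
function tokenize(string $query): array
{
    // "~" is the delimiter, "x" allows whitespace and comments in the pattern,
    // "A" anchors each match at the current offset.
    $pattern = '~\s*(?:
          (?<key>[A-Za-z_$][A-Za-z0-9_$]*)   # identifier, ASCII-only simplification
        | (?<value>"(?:[^"\\\\]|\\\\.)*")      # double-quoted string with backslash escapes
        | (?<colon>:)
        | (?<operator>[&|])
        | (?<paren>[()])
    )~xA';

    $tokens = [];
    $offset = 0;
    $length = strlen($query);

    while ($offset < $length) {
        if (!preg_match($pattern, $query, $m, 0, $offset)) {
            // Trailing whitespace is fine; anything else is unexpected input.
            if (trim(substr($query, $offset)) === '') {
                break;
            }
            throw new InvalidArgumentException("Unexpected input at offset $offset");
        }
        // Exactly one named group is non-empty for each match.
        foreach (['key', 'value', 'colon', 'operator', 'paren'] as $type) {
            if (isset($m[$type]) && $m[$type] !== '') {
                $tokens[] = [$type, $m[$type]];
                break;
            }
        }
        $offset += strlen($m[0]);
    }

    return $tokens;
}

print_r(tokenize('key: "value" & ( k2 : "a \"b\"" | k3:"c")'));
```

A token list like this could then be fed to a small recursive routine that handles the parentheses, which keeps the nesting logic out of the regular expressions.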

Inspired by Matt's answer (http://stackoverflow.com/questions/2467955/convert-javascript-regular-expression-to-php-pcre-expression), I have used the following regular expressions.

$regs[':number'] = '(?:-?\\b(?:0|[1-9][0-9]*)(?:\\.[0-9]+)?(?:[eE][+-]?[0-9]+)?\\b)';
$regs[':oneChar'] = '(?:[^\\0-\\x08\\x0a-\\x1f\"\\\\]|\\\\(?:[\"/\\\\bfnrt]|u[0-9A-Fa-f]{4}))';
$regs[':string'] = '(?:\"'.$regs[':oneChar'].'*\")';
$regs[':varName'] = '\\$(?:'.$regs[':oneChar'].'[^ ,]*)';
$regs[':func'] = '(?:{[ ]*'.$regs[':oneChar'].'[^ ]*)';
$regs[':key'] = "({$regs[':varName']})";
$regs[':value'] = "({$regs[':string']})";
$regs[':operator'] = "(&|\|)";
$regs[':pair'] = "(({$regs[':key']}\s*:)?\s*{$regs[':value']})";

if(preg_match("/^{$regs[':value']}/", $query, $matches))
{
  print_r($matches);
}

When executing the above, PHP emits a warning at the if condition:

Warning: preg_match() [function.preg-match]: Unknown modifier '\' in /home/xxxx/test.xxxx.com/experiments/regex/index.php on line 23

I have tried preg_match with :string and :oneChar, but I still get the same error. Therefore I feel there is something wrong in the :oneChar regex. Kindly help me resolve this issue.

Goje87
  • Never use regular expressions for parsing! – SK-logic Mar 31 '11 at 16:38
  • because you could [have a breakdown like this SO user](http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454). – gideon Mar 31 '11 at 16:42
  • Hi @SK-logic and @giddy, As this is the first time I am working with parsing, I am not aware of different methods used for it. I would love to know what are the other ways that are good for parsing. – Goje87 Mar 31 '11 at 18:17
  • there are many parser generators for PHP, including https://github.com/maetl/php-peg – SK-logic Mar 31 '11 at 19:23

1 Answer


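I see at least one problem in the second regular expression ($regs[':oneChar']): it contains an unescaped forward slash, which conflicts with the forward slashes used as delimiters in the preg_match call. The first / inside the fragment ends the pattern early, and the characters after it are parsed as modifiers, hence "Unknown modifier '\'". Use a delimiter that does not occur in the pattern, for example preg_match("@^{$regs[':value']}@", $query, $matches).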
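A minimal sketch of the conflict, using a shortened stand-in for the escape branch of :oneChar rather than the full expression (the variable names $fragment, $subject, $broken and $fixed are illustrative, not from the question):

```php
<?php
// Shortened stand-in for the escape branch of $regs[':oneChar'] (illustrative only).
// It contains an unescaped "/" inside the character class.
$fragment = '\\\\(?:["/\\\\bfnrt])';

// A backslash followed by "n", as it could appear inside a quoted value.
$subject = '\\n';

// "/" as delimiter: the "/" inside the fragment terminates the pattern early,
// so the rest is parsed as modifiers and preg_match() returns false with a warning.
// The @ operator only silences that warning for this demonstration.
$broken = @preg_match("/^{$fragment}/", $subject);

// "@" as delimiter: the fragment contains no "@", so the pattern compiles.
$fixed = preg_match("@^{$fragment}@", $subject, $matches);

var_dump($broken); // bool(false)
var_dump($fixed);  // int(1)
```

Any character that never occurs in the fragments would do as a delimiter; "@" is simply the one used here.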

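preg_quote will not help with this error: it escapes literal text that is going to be embedded in a pattern, and the subject string ($query) passed to preg_match never needs quoting. If you do embed literal text in one of the patterns, quote it with the same delimiter you use:

$literal = preg_quote($literal, '@');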

Beyond that, I would run each of your regular expressions one at a time to see which one is throwing the error.
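One way to do that check, sketched with a hypothetical helper (fragmentCompiles is not a PHP built-in) and two illustrative fragments; in practice you would loop over the $regs array from the question:

```php
<?php
// Hypothetical helper: returns true if $fragment compiles with the given delimiter.
function fragmentCompiles(string $fragment, string $delimiter = '@'): bool
{
    // preg_match() returns false (and emits a warning) when the pattern is invalid;
    // 0 or 1 both mean the pattern compiled. @ silences the warning.
    return @preg_match($delimiter . $fragment . $delimiter, '') !== false;
}

// Two illustrative fragments, one of them deliberately broken.
$fragments = [
    ':operator'   => '(&|\|)',
    ':unbalanced' => '(?:"[^"]*"',   // missing closing ")"
];

foreach ($fragments as $name => $fragment) {
    echo $name, ': ', fragmentCompiles($fragment) ? 'ok' : 'invalid', PHP_EOL;
}
```

Passing '/' as the second argument shows which fragments break only because of the delimiter.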

Nick Clark
  • Hey Nick, Good find... :) The forward slash is the culprit. But the problem is not that it is not escaped. Instead it is the string that was being passed to the preg_match. It also had '/' as delimiters and so it conflicted with the '/' in $regs[':oneChar']. I changed the delimiters to '@' and it worked fine. – Goje87 Mar 31 '11 at 18:44