
I have to parse this template file ($html):

{$myFirstVariable}
{$myMainVar:MYF1,"x\:x\,x",2:MYF2:MYF3,false}
{$myLastVariable:trim}

Here is my PHP parser:

$regexp = '#{\$(?<name>.+?)(\:(?<modifiers>.+?))?}#';

preg_replace_callback($regexp, 'separateVariable', $html);

function separateVariable($matches) {
    $varname = $matches['name'];

    print $varname."\n";

    if (isset($matches['modifiers'])) {
        $modifiers = $matches['modifiers'];

        $modifiers = preg_split('#(?<!\\\):#', $modifiers);
        $parsed = array();

        foreach ($modifiers as $modifier) {
            $modifier = preg_split('#(?<!\\\),#', $modifier);
            $parsed[array_shift($modifier)] = $modifier;
        }

        // parsed[myFuncName] = Array(2ndArg, 3rdArg)

        print_r($parsed);
    }

    print "\n";
}

Everything works, except that I have to escape ':' and ',' in {$myMainVar:...} with a '\'.

Is there a way to get rid of the '\'?

Thanks.

bernedef
  • Regex is not for parsing languages. Let go of the idea that you can do this with an elaborate regex that just currently escapes you. It will not be possible. Write an actual parser. – Tomalak Sep 05 '10 at 13:43
  • See also http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags – John Carter Sep 05 '10 at 18:22
  • @therefromhere: This question is not about parsing HTML. – Gumbo Sep 06 '10 at 10:19
  • Yeah, I know, but I thought it was worth referring to the canonical "don't use regexes for this" answer. – John Carter Sep 06 '10 at 10:46

3 Answers


Regex won't help you much here because the data is nested on multiple levels. It would probably be easier to split the data on : first and then parse each piece (e.g. split substr,1,2 on ,). With regexes alone you would need several of them: a single match does not give you nested arrays or multidimensional results, and regexes are best suited to pulling fields out of data whose format is known ahead of time.
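
The two-pass split described above could look roughly like the sketch below. The function name splitTag is made up for illustration and is not part of the original post. It assumes the tag has the form {$name:modifier,arg,...} and that ':' and ',' never appear inside an argument, so it does not yet handle the quoted "x:x,x" case from the question.

```php
<?php
// Hypothetical helper (not from the original post): parse a tag in two passes.
// Assumes ':' and ',' never occur inside an argument.
function splitTag(string $tag): array
{
    // Strip the leading "{$" and the trailing "}".
    $body = substr($tag, 2, -1);

    // First level: the variable name, then one segment per modifier.
    $segments = explode(':', $body);
    $name = array_shift($segments);

    $modifiers = [];
    foreach ($segments as $segment) {
        // Second level: the modifier name followed by its arguments.
        $args = explode(',', $segment);
        $modifiers[array_shift($args)] = $args;
    }

    return [$name, $modifiers];
}

print_r(splitTag('{$myVariable:trim:substr,1,2}'));
```

For the tag above this yields 'myVariable' together with a modifier map in which trim has no arguments and substr has the arguments '1' and '2' (as strings).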

Chris Laplante

A regular expression can't return a nested array. Besides, what you are trying to do looks more like a job for text processing (substr, explode, ...) than for regular expressions. Also, your example doesn't make it clear how the input is supposed to be processed in the general case.

I suggest building a recursive function that handles the logic of your unserializing process; that function would use switch cases and string manipulation functions.
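
As a rough illustration of the string-manipulation approach, here is a minimal sketch, not the answerer's code. It is iterative rather than recursive, since the template in the question only has two levels. The names splitOutsideQuotes and parseModifiers are hypothetical, and the sketch assumes arguments are quoted with double quotes that contain no nested or escaped quotes. Delimiters inside quotes are left alone, so they would not need a '\'.

```php
<?php
// Hypothetical helper: split $input on $delimiter, but ignore any
// delimiter that appears between double quotes.
function splitOutsideQuotes(string $input, string $delimiter): array
{
    $parts = [];
    $current = '';
    $inQuotes = false;
    $length = strlen($input);

    for ($i = 0; $i < $length; $i++) {
        $char = $input[$i];
        if ($char === '"') {
            // Toggle the quoted state; the quote itself is kept for now.
            $inQuotes = !$inQuotes;
            $current .= $char;
        } elseif ($char === $delimiter && !$inQuotes) {
            // An unquoted delimiter ends the current part.
            $parts[] = $current;
            $current = '';
        } else {
            $current .= $char;
        }
    }
    $parts[] = $current;

    return $parts;
}

// Hypothetical helper: turn "F1,a,b:F2" into ['F1' => ['a', 'b'], 'F2' => []].
function parseModifiers(string $modifiers): array
{
    $parsed = [];
    foreach (splitOutsideQuotes($modifiers, ':') as $modifier) {
        $args = splitOutsideQuotes($modifier, ',');
        $name = array_shift($args);
        // Remove the surrounding double quotes from each argument.
        $parsed[$name] = array_map(function ($arg) {
            return trim($arg, '"');
        }, $args);
    }

    return $parsed;
}

print_r(parseModifiers('MYF1,"x:x,x",2:MYF2:MYF3,false'));
```

With the unescaped modifier string from the question, MYF1 receives the arguments 'x:x,x' and '2', MYF2 receives none, and MYF3 receives 'false'. Something like parseModifiers could presumably stand in for the two preg_split calls inside separateVariable, as long as a '}' never appears inside a quoted argument.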

aularon

If it helps you:

$string = '{$myVariable:trim:substr,1,2}';

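// Matches only this exact shape: one variable, one bare modifier,
// and one modifier with two numeric arguments.
if (preg_match('#^\{\$([a-zA-Z]+):([a-z]+):([a-z]+),([0-9]+),([0-9]+)\}$#', $string, $m)) {
    $result = <<<RESULT
Array (
    "{$m[1]}",
    Array (
        "{$m[2]}" => Array(),
        "{$m[3]}" => Array(
            {$m[4]},
            {$m[5]}
        )
    )
)
RESULT;
    // Echo inside the if so $result is never read when the match fails.
    echo $result;
}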
Ion Br.