I'm trying to use Casimir et Hippolyte's pattern (here) to wrap HTML tags in a string.

$html = <<<EOD
    $str
EOD;

    $pattern = <<<'EOD'
    ~
    (?(DEFINE)
        (?<self>    < [^\W_]++ [^>]* > )
        (?<comment> <!-- (?>[^-]++|-(?!->))* -->)
        (?<cdata>   \Q<![CDATA[\E (?>[^]]++|](?!]>))* ]]> )
        (?<text>    [^<]++ )
        (?<tag>
            < ([^\W_]++) [^>]* >
            (?> \g<text> | \g<tag> | \g<self> | \g<comment> | \g<cdata> )*
            </ \g{-1} >
        )
    )
    # main pattern
    (?: \g<tag> | \g<self> | \g<comment> | \g<cdata> )+
    ~x
EOD;

After implementing this, I get the error "Compilation failed: assertion expected after (?( at offset 6". What is wrong with this pattern?

Lewis

1 Answer


After some research, it seems that PCRE versions older than 7.2 have this kind of bug with the DEFINE syntax.
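To see whether this applies to a given installation, the PCRE version that PHP was built against can be read from the built-in PCRE_VERSION constant. The following is only a minimal sketch: the helper names pcreVersionNumber and pcreIsOlderThan are made up for illustration, and the 7.2 threshold is simply the one suggested above, not a verified cutoff.

```php
<?php
// Sketch: compare the PCRE version against a threshold.
// PCRE_VERSION is a built-in constant that holds a version number
// followed by a release date, separated by a space.

// Hypothetical helper: keep only the leading version number.
function pcreVersionNumber($raw)
{
    $parts = explode(' ', trim($raw));
    return $parts[0];
}

// Hypothetical helper: true if the version in $raw is older than $threshold.
function pcreIsOlderThan($raw, $threshold)
{
    return version_compare(pcreVersionNumber($raw), $threshold, '<');
}

$version = pcreVersionNumber(PCRE_VERSION);

// Assumption: 7.2 is the threshold mentioned in this answer.
if (pcreIsOlderThan(PCRE_VERSION, '7.2')) {
    echo "PCRE $version: prefer a pattern without DEFINE\n";
} else {
    echo "PCRE $version: the DEFINE version should compile\n";
}
```

The helpers avoid type declarations and array dereferencing on purpose, since an installation with such an old PCRE is likely to run an old PHP as well.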

You can write the same pattern without the DEFINE block, like this:

$pattern = <<<'EOD'
~
(?:
    (?<tag>
            < ([^\W_]++) [^>]* >
            (?> (?<text> [^<]++ )
              | \g<tag>
              | (?<self> < [^\W_]++ [^>]* > )
              | (?<comment> <!-- (?>[^-]++|-(?!->))* -->)
              | (?<cdata> \Q<![CDATA[\E (?>[^]]++|](?!]>))* ]]>)
            )*
            </ \g{2} > # second group from pattern start (<tag> is 1st)
    )
  | \g<self> | \g<comment> | \g<cdata>
)+
~x
EOD;
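As a usage sketch, the rewritten pattern can be passed to preg_replace to wrap each run of matched tags. The sample input and the [[ ]] wrapper markers below are assumptions chosen for illustration; they are not taken from the question.

```php
<?php
// Same pattern as above, repeated so the example is self-contained.
$pattern = <<<'EOD'
~
(?:
    (?<tag>
        < ([^\W_]++) [^>]* >
        (?> (?<text> [^<]++ )
          | \g<tag>
          | (?<self> < [^\W_]++ [^>]* > )
          | (?<comment> <!-- (?>[^-]++|-(?!->))* -->)
          | (?<cdata> \Q<![CDATA[\E (?>[^]]++|](?!]>))* ]]>)
        )*
        </ \g{2} >
    )
  | \g<self> | \g<comment> | \g<cdata>
)+
~x
EOD;

// Hypothetical sample input.
$html = 'intro <b>bold</b><br> outro';

// $0 is the whole match; [[ and ]] are placeholder wrapper markers.
// preg_replace returns null when an error occurs.
$wrapped = preg_replace($pattern, '[[$0]]', $html);

if ($wrapped === null) {
    echo "preg_replace failed\n";
} else {
    echo $wrapped, "\n";
}
```

Checking the result for null before using it makes a compilation problem like the one in the question show up immediately, instead of silently producing an empty string later on.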
Casimir et Hippolyte