8

I have an XML file:

<?xml version="1.0" encoding="utf-8"?>
<xml>
    <events date="01-10-2009" color="0x99CC00" selected="true"> 
       <event>
            <title>You can use HTML and CSS</title>
            <description><![CDATA[This is the description ]]></description>
        </event>
    </events>
</xml>

I used XPath and XQuery to parse the XML:

$xml_str = file_get_contents('xmlfile');
$xml = simplexml_load_string($xml_str);
if(!empty($xml))
{
    $nodes = $xml->xpath('//xml/events');
}

I am getting the title properly, but I am not getting the description. How can I get the data inside the CDATA section?

hakre
Warrior

3 Answers

11

SimpleXML has a bit of a problem with CDATA, so use:

// LIBXML_NOCDATA merges CDATA sections into ordinary text nodes
$xml = simplexml_load_file('xmlfile', 'SimpleXMLElement', LIBXML_NOCDATA);
$nodes = array();
if ($xml !== false)
{
    $nodes = $xml->xpath('//xml/events');
}
print_r( $nodes );

This will give you:

Array
(
    [0] => SimpleXMLElement Object
        (
            [@attributes] => Array
                (
                    [date] => 01-10-2009
                    [color] => 0x99CC00
                    [selected] => true
                )

            [event] => SimpleXMLElement Object
                (
                    [title] => You can use HTML and CSS
                    [description] => This is the description 
                )

        )

)
ocodo
    Wrong! SimpleXML has no problem with CDATA, and this is a persistent myth which should not be perpetuated. It is only `print_r` which cannot see the CDATA, because SimpleXML does not actually store its data as a "real" PHP object, it just coughs it up on demand. – IMSoP Dec 11 '12 at 23:54
10

You are probably being misled into thinking that the CDATA is missing by using print_r or one of the other "normal" PHP debugging functions. These cannot see the full content of a SimpleXML object, as it is not a "real" PHP object.

If you run echo $nodes[0]->event->description, you'll find your CDATA comes out fine. What's happening is that PHP knows that echo expects a string, so it asks SimpleXML for one; SimpleXML responds with all the string content, including CDATA.

To get at the full string content reliably, simply tell PHP that what you want is a string by using the (string) cast operator, e.g. $description = (string)$nodes[0]->event->description. Note that element names are case-sensitive, so the name must match the XML exactly.
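As a minimal sketch of the cast approach, the following inlines the question's XML as a string (an assumption made so the example is self-contained; the question loads it from a file) and reads the description without LIBXML_NOCDATA:

```php
<?php
// The XML from the question, inlined so the example runs on its own.
$xml_str = <<<'XML'
<?xml version="1.0" encoding="utf-8"?>
<xml>
    <events date="01-10-2009" color="0x99CC00" selected="true">
        <event>
            <title>You can use HTML and CSS</title>
            <description><![CDATA[This is the description ]]></description>
        </event>
    </events>
</xml>
XML;

$xml = simplexml_load_string($xml_str);
if ($xml === false) {
    exit("Could not parse the XML\n");
}

$nodes = $xml->xpath('//xml/events');

// Casting to string asks SimpleXML for the element's text content,
// which includes anything wrapped in a CDATA section.
$title       = (string) $nodes[0]->event->title;
$description = (string) $nodes[0]->event->description;

echo $title, "\n";
echo $description, "\n";
```

The cast is what matters here: without it, $nodes[0]->event->description is still a SimpleXMLElement, which is why print_r can make it look empty.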

To debug SimpleXML objects and not be fooled by quirks like this, use a dedicated debugging function such as one of these: https://github.com/IMSoP/simplexml_debug

IMSoP
2

Another option is to strip the CDATA markers from the raw XML string before parsing it, which makes the content plain text and can make life a little easier.

$xml_str = str_replace("<![CDATA[", "", $xml_str);
$xml_str = str_replace("]]>", "", $xml_str);
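One caveat with stripping the markers: CDATA exists so that characters like & and < can appear unescaped, so removing the markers may leave a document that is no longer well-formed. The sketch below uses a hypothetical one-element document (not the question's file) to illustrate the difference:

```php
<?php
// Hypothetical input: the CDATA section contains "&" and "<b>" markup.
$xml_str = '<root><description><![CDATA[Fish & chips <b>today</b>]]></description></root>';

// Collect libxml errors internally instead of emitting PHP warnings.
libxml_use_internal_errors(true);

// Parsed as-is, the CDATA content is returned verbatim by the string cast.
$parsed = simplexml_load_string($xml_str);
$text = (string) $parsed->description;

// With the markers stripped, the bare "&" makes the XML ill-formed,
// so simplexml_load_string() fails and returns false.
$stripped = str_replace(array('<![CDATA[', ']]>'), '', $xml_str);
$broken = simplexml_load_string($stripped);

var_dump($text);
var_dump($broken);
```

For content like the question's plain-text description, stripping happens to work, but casting to string avoids the risk altogether.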
vr_driver