2

I'm reading an RSS feed using this simple code:

 <?php
$homepage = file_get_contents('http://www.forbes.com/news/index.xml');
$movies = new SimpleXMLElement($homepage);
echo '<pre>';
print_r($movies);
?>

and the output looks like this:

SimpleXMLElement Object ( [@attributes] => Array ( [version] => 2.0 )

[channel] => SimpleXMLElement Object
    (
        [title] => SimpleXMLElement Object
            (
            )

        [link] => SimpleXMLElement Object
            (
            )

        [description] => SimpleXMLElement Object
            (
            )

        [language] => en-us
        [copyright] => Copyright 2009 Forbes.com LLC
        [item] => Array
            (
                [0] => SimpleXMLElement Object
                    (
                        [title] => SimpleXMLElement Object
                            (
                            )

                        [link] => SimpleXMLElement Object
                            (
                            )

                        [author] => SimpleXMLElement Object
                            (
                            )

                        [pubDate] => Sat, 05 Nov 2011 07:17:21 GMT
                        [description] => SimpleXMLElement Object
                            (
                            )

                    )

and so on. But when I view the source of the feed, I see data like this:

<rss version="2.0"><channel><title><![CDATA[Forbes.com: News]]></title><link><![CDATA[http://www.forbes.com]]></link><description><![CDATA[News and reports from Forbes.com]]></description><language>en-us</language><copyright>Copyright 2009 Forbes.com LLC</copyright><item><title><![CDATA[Benicio Del Toro Offered Villain Role In "Star Trek" Sequel - Is It Khan?]]></title><link><![CDATA[http://www.forbes.com/sites/markhughes/2011/11/05/benicio-del-toro-offered-villain-role-in-star-trek-sequel-is-it-khan/?feed=rss_home]]></link><author><![CDATA[Mark Hughes]]></author><pubDate>Sat, 05 Nov 2011 07:17:21 GMT</pubDate><description><![CDATA[Variety reports that actor Benicio del Toro is being offered the role of villain in the upcoming sequel to director J.J. Abram's 2009 blockbuster franchise-reboot movie Star Trek. So far, Abrams and crew have kept a tight lid on details about the new Paramount film, and the identity of the main villain is a closely ...]]></description>

How can I read the CDATA values so that I can store them in my database?

omnath

2 Answers

13

Tell SimpleXML to convert CDATA into normal text:

$homepage = 'http://www.forbes.com/news/index.xml';
$movies = simplexml_load_file($homepage, "SimpleXMLElement", LIBXML_NOCDATA);

That should do it for you, using simplexml_load_file instead of file_get_contents.
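
As a minimal, self-contained sketch of what the flag changes: the inline `$rss` string below is a shortened, made-up stand-in for the Forbes feed (an assumption, so the example runs without network access), parsed once with the defaults and once with LIBXML_NOCDATA.

```php
<?php
// Shortened stand-in for the feed; an assumption for illustration only.
$rss = <<<XML
<rss version="2.0"><channel>
<title><![CDATA[Forbes.com: News]]></title>
<language>en-us</language>
</channel></rss>
XML;

// Default parse: CDATA sections stay separate CDATA nodes,
// which print_r() does not display.
$plain = simplexml_load_string($rss);

// LIBXML_NOCDATA merges CDATA sections into ordinary text nodes while parsing.
$merged = simplexml_load_string($rss, 'SimpleXMLElement', LIBXML_NOCDATA);

print_r($plain->channel);   // title is listed as an empty SimpleXMLElement Object
print_r($merged->channel);  // title is listed with its text

// The text itself is the same either way once the element is cast to a string.
var_dump((string) $plain->channel->title === (string) $merged->channel->title);
```

The flag only changes how the parsed tree is represented; the string values read from the elements are the same in both cases.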

Related Answer: Removing cdata in simplehtmldom.

Community
hakre
  • Sir, it shows an error: Warning: simplexml_load_file() expects parameter 2 to be a class name derived from SimpleXMLElement, '16384' given in D:\wamp\www\test_om\store-feed\cdata.php on line 3 – omnath Nov 05 '11 at 12:51
  • '; print_r($movies); ?> – omnath Nov 05 '11 at 12:52
  • @omnath: That was a mistake in my answer; one parameter was missing. I updated the answer to fix that. Besides that, it no longer uses `file_get_contents`. – hakre Nov 05 '11 at 12:54
  • Thanks, sir, now it's working. We can still use file_get_contents, though, if we write: $homepage = file_get_contents('http://www.forbes.com/news/index.xml'); $xml = simplexml_load_string($homepage, 'SimpleXMLElement', LIBXML_NOCDATA); – omnath Nov 05 '11 at 13:00
  • Sure, that works as well. I just thought it was better to have it in one function call. There are often multiple ways that lead to the same result :) – hakre Nov 05 '11 at 13:01
4

The above "fix" will work, but is entirely unnecessary.

SimpleXML objects contain a lot of "magic", and are not designed to be viewed using print_r; the CDATA is safely in your object, but won't show up unless you ask for it in the right way.

If you run echo (string)$movies->channel->title; you should get "Forbes.com: News" as you would expect.

Note the (string), which tells PHP to explicitly convert the "magic" SimpleXMLElement into a string. If you don't do this, you'll actually be getting another SimpleXMLElement object back - otherwise my example wouldn't work because $movies->channel would be a string.

It's good practice to always use (string) when accessing elements or attributes from SimpleXML, as some functions will choke if they are expecting a string and you give them a SimpleXML object instead, and serializing or session storage will certainly fail.
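
As a rough sketch of that practice applied to the feed in the question: the inline `$rss` string and its headlines and example.com links are made-up stand-ins (assumptions, so the example runs offline), and `$rows` is a hypothetical name for the plain-string data you would hand to your database code.

```php
<?php
// Shortened, made-up stand-in for the feed; an assumption for illustration only.
$rss = <<<XML
<rss version="2.0"><channel>
<title><![CDATA[Forbes.com: News]]></title>
<item>
<title><![CDATA[First headline]]></title>
<link><![CDATA[http://www.example.com/first]]></link>
<pubDate>Sat, 05 Nov 2011 07:17:21 GMT</pubDate>
</item>
<item>
<title><![CDATA[Second headline]]></title>
<link><![CDATA[http://www.example.com/second]]></link>
<pubDate>Sat, 05 Nov 2011 08:00:00 GMT</pubDate>
</item>
</channel></rss>
XML;

// No LIBXML_NOCDATA here: the string casts below read the CDATA text directly.
$movies = new SimpleXMLElement($rss);

$rows = [];
foreach ($movies->channel->item as $item) {
    // Cast every value so $rows holds plain strings, not SimpleXMLElement objects.
    $rows[] = [
        'title'   => (string) $item->title,
        'link'    => (string) $item->link,
        'pubDate' => (string) $item->pubDate,
    ];
}

// $rows is an ordinary array of strings, so it can be serialized safely.
echo serialize($rows[0]), "\n";
print_r($rows);
```

Each entry in `$rows` can then be passed on as ordinary string values, for example as the bound parameters of a prepared INSERT statement in whatever database layer you use.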

IMSoP
  • If you run `json_encode` on SimpleXMLElement or you cast the element to array, then this still makes a difference. – hakre Oct 12 '14 at 00:22
  • @hakre That's true, but most of the time you probably don't need or want to do that. SimpleXML is intended as an interactive API for traversing XML structures; since neither PHP nor JSON's native structures can easily represent XML's structures, the best is usually to pluck out the parts you want with `(string)`, or reserialize a section with `->asXML()`. – IMSoP Oct 12 '14 at 00:50
  • Most of the time you don't need this. It's merely for the "I have XML, I want an array" type of folks. I just stumbled over one or another of your posts, which reminded me of that detail. BTW, I've used a slightly different way to explain it here: http://stackoverflow.com/a/26316558/367456 - we should create a reference question for the topic that explains this nicely and shows the different ways to deal with it. You can explain it very well. – hakre Oct 12 '14 at 08:27
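
The `json_encode` difference raised in the comments, and the two alternatives IMSoP suggests, can be sketched like this. The small `$xml` fragment is a made-up stand-in, and the note about the default parse encoding the title as an empty value is my expectation of typical behaviour rather than something stated in the thread.

```php
<?php
// Tiny stand-in document; an assumption for illustration only.
$xml = '<channel><title><![CDATA[Forbes.com: News]]></title></channel>';

$plain  = simplexml_load_string($xml);
$merged = simplexml_load_string($xml, 'SimpleXMLElement', LIBXML_NOCDATA);

// Encoding the whole object: the CDATA text is expected to be missing
// from the first result and present in the second.
echo json_encode($plain), "\n";
echo json_encode($merged), "\n";

// Plucking out the part you want with (string) works without the flag.
echo json_encode(['title' => (string) $plain->title]), "\n";

// Reserializing a section with asXML() keeps the CDATA markup intact.
echo $plain->title->asXML(), "\n";
```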