
I'm writing a PHP script that reads a lot of RSS sources and stores the feed items in a database. A thumbnail/image can be put in an XML file in multiple ways, and I have a problem with one specific way.

When an XML file contains something like this (source):

...
    <media:content type="image/jpeg" url="http://static3.hln.be/static/photo/2015/1/8/6/20150405151722/crop_7613016.jpg">
    <media:thumbnail url="http://static3.hln.be/static/photo/2015/1/8/6/20150405151722/crop_7613016.jpg"/>
    </media:content>
...

For some reason it looks like MagPie doesn't get that element. If I var_dump that item it looks like this:

...
    ["media"]=> array(1) { ["content"]=> string(1) " " } ["content"]=> string(1) " "
...

Does anyone have an idea how I can extract the thumbnail element? Thanks in advance.

UPDATE - (some example code):

    $rss = fetch_rss('http://www.hln.be/rss.xml');
    foreach ($rss->items as $item) { // loop through the RSS feed
        var_dump($item); // the var_dump without media elements

        // Build the row for this item. No return here: returning inside the
        // loop would stop after the first item and skip the database call.
        $row = array(
            'title' => $item['title'],
            'url' => $item['link'],
            ...);

        putDataInDatabase($row); // put everything in the database (my own function)
    }

A fuller look at the var_dump output:

array(10) { ["title"]=> string(62) "Isinbayeva ..."" ["link"]=> string(23) "http://s.hln.be/2276986" ["description"]=> string(141) "Yelena Isinbayeva ..." ["pubdate"]=> string(29) "Sun, 05 Apr 2015 13:06:00 GMT" ["author"]=> string(8) "redactie" ["guid"]=> string(23) "http://s.hln.be/2276986" ["media"]=> array(1) { ["content"]=> string(1) " " } ["content"]=> string(1) " " ["summary"]=> string(141) "Yelena Isinbayeva wil... " ["date_timestamp"]=> int(1428239160) }
control-panel
  • Which magpie version are you using? – hakre Apr 05 '15 at 14:06
  • I'm using magpierss-0.72. The most recent one I think. – control-panel Apr 05 '15 at 14:11
  • Could you please add a bit more context, like how you create the var_dump? Are you using any specific methods to obtain the data you dump? If yes, which ones? If not, which common one do you use? – hakre Apr 05 '15 at 14:35
  • I updated the question. I var_dump the output of the fetch_rss function that comes with Magpie. I can get all other data out of the $item. It only fails with this particular XML syntax. – control-panel Apr 05 '15 at 14:50

1 Answer


The elements are inside that array, e.g. you can access the URL of the content thumbnail like this:

    $item['media']['content_thumbnail@url']

To put that into the perspective of your example:

    $rss = fetch_rss('http://www.hln.be/rss.xml');
    foreach ($rss->items as $item) { // loop through the RSS feed
        var_dump($item['media']['content_thumbnail@url']);
    }

Gives the following output:

string(79) "http://static3.hln.be/static/photo/2015/2/10/14/20150405204108/crop_7613834.jpg"
string(78) "http://static0.hln.be/static/photo/2015/1/9/13/20150405201321/crop_7613833.jpg"
string(77) "http://static2.hln.be/static/photo/2015/7/0/8/20150405203748/crop_7613858.jpg"
string(77) "http://static2.hln.be/static/photo/2015/0/6/8/20150405200321/crop_7613813.jpg"
string(79) "http://static2.hln.be/static/photo/2015/17/6/10/20150405200509/crop_7613830.jpg"
string(77) "http://static1.hln.be/static/photo/2015/7/9/7/20150405195208/crop_7613782.jpg"
string(78) "http://static2.hln.be/static/photo/2015/0/15/7/20150405193052/crop_7613737.jpg"
...

This is the overall structure of the media element:

Array
(
    [content#] => 1
    [content@] => type,url
    [content@type] => image/jpeg
    [content@url] => http://static3.hln.be/static/photo/2015/2/10/14/20150405204108/crop_7613834.jpg
    [content] => 

    [content_thumbnail#] => 1
    [content_thumbnail@] => url
    [content_thumbnail@url] => http://static3.hln.be/static/photo/2015/2/10/14/20150405204108/crop_7613834.jpg
)
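
Since some setups apparently return only the bare `content` key (as in the question), it may be safer to read the key defensively rather than index it directly. A minimal sketch, assuming the flattened key names shown in the structure above; `magpie_thumbnail_url` is a hypothetical helper name and not part of MagpieRSS:

```php
<?php
// Hypothetical helper (not part of MagpieRSS): returns the thumbnail URL of a
// parsed item, falls back to the media:content URL, and returns null when
// neither key is present. Assumes the flattened key names shown above.
function magpie_thumbnail_url(array $item)
{
    if (!isset($item['media']) || !is_array($item['media'])) {
        return null;
    }
    foreach (array('content_thumbnail@url', 'content@url') as $key) {
        if (!empty($item['media'][$key])) {
            return $item['media'][$key];
        }
    }
    return null;
}

// Example with an item shaped like the structure above.
$exampleItem = array(
    'media' => array(
        'content@url' => 'http://static3.hln.be/static/photo/2015/2/10/14/20150405204108/crop_7613834.jpg',
        'content_thumbnail@url' => 'http://static3.hln.be/static/photo/2015/2/10/14/20150405204108/crop_7613834.jpg',
    ),
);
var_dump(magpie_thumbnail_url($exampleItem));
```

Inside the `foreach` loop this would be called as `magpie_thumbnail_url($item)`. A `null` result then marks the items for which the media keys were not populated.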
hakre
  • I would love for that to be the case. But $item['media'] just returns: array(1) { ["content"]=> string(1) " " }. $item['media']['content'] returns string(1) " ", and all the other elements that should be there are NULL. I found that more people are having trouble with this. It's even an open ticket on the magpierss page on SourceForge (https://sourceforge.net/p/magpierss/support-requests/23/). So I use Simplepie for XML with that specific syntax. It's definitely not a fix, more like a workaround with a lot of overhead. If I have the time I will try to take a look at Magpie's source code myself. – control-panel Apr 06 '15 at 09:36
  • Which PHP version are you using? I was taking your example verbatim here, so there must be some difference causing this. – hakre Apr 06 '15 at 09:37
  • currently version 5.6.2 – control-panel Apr 06 '15 at 09:41
  • I have no problems in PHP 5.4, 5.5 and 5.6. Just tried. There must be another issue. I'm using the ZIP package from Github (master). Perhaps that makes a difference? – hakre Apr 06 '15 at 10:02
  • I could debug this a little with a step-debugger. Magpie internally uses the [XML Parser Functions](https://php.net/manual/en/ref.xml.php). It's perhaps something with that extension's configuration. – hakre Apr 06 '15 at 10:18
  • I'm having the same problems with the github master. I uploaded the full code here: http://www.codeshare.io/FITDL – control-panel Apr 06 '15 at 10:28
  • It's in the end the same code as I use it here. Perhaps a different feed? – hakre Apr 06 '15 at 14:23
  • In the end I might use 200+ feeds, so I need every one of them to work. I have a workaround now by using the (slower) Simplepie library. I might come back to this problem later. Thanks for all the help @hakre. – control-panel Apr 06 '15 at 17:00
  • As said, I don't have the problem here with Magpie. So perhaps you won't need the fallback on another system. – hakre Apr 06 '15 at 17:39
  • I can also recommend looking into it with a step-debugger like xdebug. The XML parser MagpieRSS uses works with callbacks, so you can easily see whether the callback fires for the media element and then why the element isn't added in your case. That should make it clear within minutes. I can't reproduce it here. – hakre Apr 06 '15 at 18:48
  • It works! I don't really know what I did, but I played around a bit with the php.ini file while installing xdebug. I will mark this answer as accepted because it might help other people. – control-panel Apr 08 '15 at 10:14
  • now well that's nice to read! - is it perhaps possible to change the parser backend? something like a different library than expat? – hakre Apr 08 '15 at 11:53
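
The callback check suggested in the comments can be approximated without a step-debugger. The sketch below calls PHP's XML Parser Functions directly, independent of MagpieRSS, and lists the element names the parser reports for a snippet shaped like the feed in the question. `list_xml_elements` is a hypothetical name, and the `<rss>` wrapper with its `xmlns:media` declaration is an assumption added to make the snippet well-formed:

```php
<?php
// Hypothetical diagnostic (not part of MagpieRSS): collects every element
// name that PHP's XML parser reports for the given XML string.
function list_xml_elements($xml)
{
    $names = array();
    $parser = xml_parser_create();
    // Report names as written instead of upper-casing them.
    xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);
    xml_set_element_handler(
        $parser,
        function ($parser, $name, $attrs) use (&$names) {
            $names[] = $name; // start-tag callback
        },
        function ($parser, $name) {
            // End tags are not needed for this check.
        }
    );
    $ok = xml_parse($parser, $xml, true);
    $error = $ok ? null : xml_error_string(xml_get_error_code($parser));
    return array($names, $error);
}

// Assumed wrapper around the media:content snippet from the question.
$sample = '<rss xmlns:media="http://search.yahoo.com/mrss/"><channel><item>'
    . '<media:content type="image/jpeg" url="http://static3.hln.be/static/photo/2015/1/8/6/20150405151722/crop_7613016.jpg">'
    . '<media:thumbnail url="http://static3.hln.be/static/photo/2015/1/8/6/20150405151722/crop_7613016.jpg"/>'
    . '</media:content></item></channel></rss>';

list($names, $error) = list_xml_elements($sample);
var_dump($names, $error);
```

If `media:thumbnail` is missing from the list, or `$error` is not `null`, the problem probably sits in the XML extension or its configuration rather than in Magpie. If the element is listed, Magpie's own handling of the callbacks is the more likely place to look.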