I've found a great tutorial on how to accomplish most of the work at:

https://www.developphp.com/video/PHP/simpleXML-Tutorial-Learn-to-Parse-XML-Files-and-RSS-Feeds

but I can't understand how to extract media:content images from the feeds. I've read as much as I can find, but I'm still stuck.

For example, the question "How to get media:content with SimpleXML" suggests using:

foreach ($xml->channel->item as $news){
    $ns_media = $news->children('http://search.yahoo.com/mrss/');
    echo $ns_media->content; // displays "<media:content>"
}

but I can't get it to work.

Here's my script and the feed I'm trying to parse:

<?php
$html = "";
$url = "http://rssfeeds.webmd.com/rss/rss.aspx?RSSSource=RSS_PUBLIC";
$xml = simplexml_load_file($url);
for($i = 0; $i < 10; $i++){
    $title = $xml->channel->item[$i]->title;
    $link = $xml->channel->item[$i]->link;
    $description = $xml->channel->item[$i]->description;
    $pubDate = $xml->channel->item[$i]->pubDate;

    $html .= "<a href='$link'><h3>$title</h3></a>";
    $html .= "$description";
    $html .= "<br />$pubDate<hr />";
}
echo $html;
?>

I don't know where to add this code in the script to make it work. Honestly, I've browsed for hours but couldn't find a working script that parses media:content.

Can someone help with this?

========================

UPDATE:

Thanks to fusion3k, I got the final code working:

<?php
$html = "";
$url = "http://rssfeeds.webmd.com/rss/rss.aspx?RSSSource=RSS_PUBLIC";
$xml = simplexml_load_file($url);
for($i = 0; $i < 5; $i++){

    $image = $xml->channel->item[$i]->children('media', True)->content->attributes();
    $title = $xml->channel->item[$i]->title;
    $link = $xml->channel->item[$i]->link;
    $description = $xml->channel->item[$i]->description;
    $pubDate = $xml->channel->item[$i]->pubDate;

    $html .= "<img src='$image' alt='$title'>";
    $html .= "<a href='$link'><h3>$title</h3></a>";
    $html .= "$description";
    $html .= "<br />$pubDate<hr />";
}
echo $html;
?>

Basically all I needed was this simple line:

$image = $xml->channel->item[$i]->children('media', True)->content->attributes();

I can't believe it was so hard for a non-techie to find this info online, even after reading dozens of posts and articles. Well, I hope this serves other folks like me :)

reizer

2 Answers

To get the 'url' attribute, use the ->attributes() syntax:

$ns_media = $news->children('http://search.yahoo.com/mrss/');

/* Echoes the 'url' attribute: */
echo $ns_media->content->attributes()['url'];
// in PHP < 5.4: $attr = $ns_media->content->attributes(); echo $attr['url'];

/* Stores the 'url' attribute as a string: */
$url = $ns_media->content->attributes()['url']->__toString();
// in PHP < 5.4: $attr = $ns_media->content->attributes(); $url = $attr['url']->__toString();

Namespaces explanation:

The ->children() argument is not the URL of your XML feed; it is a Namespace URI.

XML namespaces are used for providing uniquely named elements and attributes in an XML document:

<xxx>       Standard XML tag
<yyy:zzz>   Namespaced tag
 └┬┘ └┬┘
  │   └──── Element Name
  └──────── Element Prefix (Namespace Identifier)

So, in your case, <media:content> is the “content” element of the namespace with prefix “media”. A namespaced element must have an associated Namespace URI, declared as an attribute of a parent node or, most commonly, of the root element. This attribute has the form xmlns:yyy="NamespaceURI" (in your case xmlns:media="http://search.yahoo.com/mrss/", declared on the root node <rss>).
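To check which prefixes and Namespace URIs a document declares, SimpleXML can list them. The following is only a minimal sketch: the inline `$sample` feed is made up for illustration and is not the WebMD feed.

```php
<?php
// Hypothetical sample document; only the xmlns:media declaration matters here.
$sample = <<<XML
<rss version="2.0" xmlns:media="http://search.yahoo.com/mrss/">
  <channel>
    <item>
      <title>Example</title>
      <media:content url="http://example.com/thumb.jpg" />
    </item>
  </channel>
</rss>
XML;

$xml = simplexml_load_string($sample);

// getDocNamespaces() returns an array of prefix => Namespace URI
// for the namespaces declared on the root element.
$namespaces = $xml->getDocNamespaces();

foreach ($namespaces as $prefix => $uri) {
    echo $prefix . ' => ' . $uri . "\n";
}
// prints: media => http://search.yahoo.com/mrss/
```

The URI printed here is the value to pass to ->children() when selecting by Namespace URI.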

Ultimately, the above $news->children( 'http://search.yahoo.com/mrss/' ) means “retrieve all child elements that have http://search.yahoo.com/mrss/ as their Namespace URI”. An alternative, more readable syntax is $news->children( 'media', True ), where True means “treat the first argument as a prefix”.

Returning to the code in your example, the generic syntax to retrieve all children of the first item that have the prefix media is:

$xml = simplexml_load_file( 'http://rssfeeds.webmd.com/rss/rss.aspx?RSSSource=RSS_PUBLIC' );
$xml->channel->item[0]->children( 'http://search.yahoo.com/mrss/' );

or (identical result):

$xml = simplexml_load_file( 'http://rssfeeds.webmd.com/rss/rss.aspx?RSSSource=RSS_PUBLIC' );
$xml->channel->item[0]->children( 'media', True );
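The equivalence of the two forms can be checked offline. This is a sketch under assumptions: the inline `$sample` feed and the variable names are hypothetical, chosen only to mirror the structure of the feed in the question.

```php
<?php
// Hypothetical sample feed with one namespaced child element.
$sample = <<<XML
<rss version="2.0" xmlns:media="http://search.yahoo.com/mrss/">
  <channel>
    <item>
      <title>Example</title>
      <media:content url="http://example.com/thumb.jpg" />
    </item>
  </channel>
</rss>
XML;

$xml  = simplexml_load_string($sample);
$item = $xml->channel->item[0];

// Select children by Namespace URI (the default meaning of the first argument).
$byUri = $item->children('http://search.yahoo.com/mrss/');

// Select children by prefix: the second argument true marks 'media' as a prefix.
$byPrefix = $item->children('media', true);

// Read the un-prefixed 'url' attribute from <media:content> in both cases.
$attrFromUri    = $byUri->content->attributes();
$attrFromPrefix = $byPrefix->content->attributes();

$urlFromUri    = (string) $attrFromUri['url'];
$urlFromPrefix = (string) $attrFromPrefix['url'];

echo $urlFromUri . "\n";
echo $urlFromPrefix . "\n";
```

Both echo statements print the same URL. The prefix form is shorter, but it depends on the feed keeping the prefix "media"; the URI form keeps working if the producer renames the prefix.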

Your new code:

If you want to show the <media:content> thumbnail (its url attribute) for each item on your page, modify the original code in this way:

(...)
$pubDate = $xml->channel->item[$i]->pubDate;
$image   = $xml->channel->item[$i]->children( 'media', True )->content->attributes()['url'];
// in PHP < 5.4:
// $attr  = $xml->channel->item[$i]->children( 'media', True )->content->attributes();
// $image = $attr['url'];

$html   .= "<a href='$link'><h3>$title</h3></a>";
$html   .= "<img src='$image' alt='$title'>";
(...)
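For a more defensive variant, here is a sketch that skips the image when an item has no <media:content> and escapes the values before writing them into HTML. The function name `renderItems` and the inline `$sample` feed are hypothetical, introduced only for illustration.

```php
<?php
// Hypothetical helper: builds HTML for up to $limit items of an RSS feed.
function renderItems(SimpleXMLElement $xml, $limit = 5)
{
    $html  = '';
    $count = 0;

    foreach ($xml->channel->item as $item) {
        if ($count++ >= $limit) {
            break;
        }

        // Escape text before placing it inside HTML attributes and tags.
        $title = htmlspecialchars((string) $item->title, ENT_QUOTES);
        $link  = htmlspecialchars((string) $item->link, ENT_QUOTES);

        // Children in the "media" namespace, selected by prefix.
        $media = $item->children('media', true);

        // Not every item is guaranteed to carry <media:content>.
        if (isset($media->content)) {
            $attr  = $media->content->attributes();
            $image = htmlspecialchars((string) $attr['url'], ENT_QUOTES);
            $html .= "<img src='$image' alt='$title'>";
        }

        $html .= "<a href='$link'><h3>$title</h3></a>";
    }

    return $html;
}

// Made-up feed: the second item has no thumbnail.
$sample = <<<XML
<rss version="2.0" xmlns:media="http://search.yahoo.com/mrss/">
  <channel>
    <item>
      <title>First</title>
      <link>http://example.com/1</link>
      <media:content url="http://example.com/a.jpg" />
    </item>
    <item>
      <title>Second</title>
      <link>http://example.com/2</link>
    </item>
  </channel>
</rss>
XML;

$html = renderItems(simplexml_load_string($sample));
echo $html;
```

Iterating with foreach also avoids indexing past the end of the feed when it contains fewer items than the loop limit.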
fusion3k
  • Thanks for guiding me. I'm not a coder, so it's a bit hard for me to get this working. If I put in these lines: 16. $ns_media = $xml->children('http://rssfeeds.webmd.com/rss/rss.aspx?RSSSource=RSS_PUBLIC'); 17. $imgurl = $ns_media->content->attributes()['url']->__toString(); it gives me an undefined variable error on line 17. – reizer Mar 09 '16 at 08:24
  • @reizer Where do you want to output the url (note that it is an image thumbnail)? – fusion3k Mar 09 '16 at 08:33
  • Hi @fusion3k, thank you, your explanation is really deep and informative. It seems much easier now. I don't know why I couldn't find such clearly laid out info on the internet before. This code finally works. One quick question though: in your line of code you have $image = $xml->channel->item[$i]->children( 'media', True )->content->attributes()['url']; It was giving me an error in DW although it worked. I've removed the ['url'] part from the end, and it gives no error and still works. So that part is not necessary, I assume? – reizer Mar 09 '16 at 10:46
  • This issue is probably due to the PHP version. Try this: `$attr = $xml->channel->item[$i]->children( 'media', True )->content->attributes(); $image = $attr['url'];`. Without `['url']` the command works, but it picks up the first attribute. So if sooner or later the XML producer adds another attribute before 'url' (unlikely but not impossible), you will retrieve that new attribute instead of 'url'. Is that clear? – fusion3k Mar 09 '16 at 10:56
  • Fabulous, this works just fine! thanx for the help! – reizer Mar 09 '16 at 11:00

Simple example for newbs like me:

$url = "https://www.youtube.com/feeds/videos.xml?channel_id=UCwNPPl_oX8oUtKVMLxL13jg";
$rss = simplexml_load_file($url);

foreach($rss->entry as $item) {

  $time = (string) $item->published;
  $time = date('Y-m-d \ H:i', strtotime($time));

  // Children of <entry> in the "media" namespace
  $media_group = $item->children( 'media', true );
  $title = $media_group->group->title;
  $description = $media_group->group->description;
  $views = $media_group->group->community->statistics->attributes()['views'];

  // Print inside the loop so every entry is shown, not only the last one
  echo $time . ' :: ' . $title . '<br>' . $description . '<br>' . $views . '<br>';
}
Pastuh