PHP getting page title of XML Feed

Question

I'm trying to get the page title from XML Feeds.

I'm using http://feeds.gawker.com/lifehacker/full as an example. The code below works with other sites, but for Lifehacker it seems to ignore the closing </title> tag: console.log shows the entire content of the XML feed after the opening <title>.

function getTitle($Url){
    $str = file_get_contents($Url);
    if(strlen($str)>0){
        preg_match("/\<title\>(.*)<\/title\>/",$str,$title);
        return $title[1];
    }
}

$feed = 'http://feeds.gawker.com/lifehacker/full';
$pagetitle = getTitle($feed);

Thanks

score 1 · Accepted Answer · answered Sep 09 '13 at 10:53

Don't use regex for parsing XML or HTML pages. Try this instead. Simpler and neater:

$feed = simplexml_load_file('feed.xml');

var_dump((string)$feed->channel->title);
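
A slightly fuller sketch of the same idea, with basic error handling. It assumes an RSS 2.0 layout (rss > channel > title), as in the snippet above. The function name getFeedTitle and the inline sample feed are made up for illustration. It uses simplexml_load_string so it runs without a network call; for a real feed, simplexml_load_file($url) should be the equivalent call.

```php
<?php
// Hypothetical helper: returns the channel title of an RSS 2.0 document,
// or null if the XML is malformed or has no <channel><title>.
function getFeedTitle(string $xml): ?string
{
    // Collect libxml parse errors internally instead of emitting warnings.
    $previous = libxml_use_internal_errors(true);
    $feed = simplexml_load_string($xml);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if ($feed === false || !isset($feed->channel->title)) {
        return null;
    }
    return (string) $feed->channel->title;
}

// Made-up sample feed, for illustration only.
$sampleFeed = <<<XML
<rss version="2.0">
  <channel>
    <title>Example Feed</title>
    <item><title>First post</title></item>
  </channel>
</rss>
XML;

echo getFeedTitle($sampleFeed), PHP_EOL; // Example Feed
```

Returning null on failure lets the caller tell "no title" apart from an empty title, which the original getTitle() could not do.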

silkfire

@Beardy Accept if you liked the answer ;) – silkfire Sep 09 '13 at 11:06
I had to wait some time to accept ;) done now – ngplayground Sep 09 '13 at 11:57

score 0 · Answer 2 · answered Sep 09 '13 at 10:56

Personally, I would recommend against using regular expressions for parsing XML documents. They're simply not suited for that.

Instead, have a look at SimpleXML or DOM.
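
For the DOM route, a minimal sketch might look like the following. The helper name firstTitleWithDom and the sample XML are made up for illustration, and it simply takes the first <title> in document order, which in a typical RSS feed is the channel title. For a real feed, $doc->load($url) would replace loadXML().

```php
<?php
// Hypothetical helper: returns the text of the first <title> element
// in document order, or null if parsing fails or no <title> exists.
function firstTitleWithDom(string $xml): ?string
{
    // Collect libxml parse errors internally instead of emitting warnings.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $loaded = $doc->loadXML($xml);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if (!$loaded) {
        return null;
    }

    $titles = $doc->getElementsByTagName('title');
    if ($titles->length === 0) {
        return null;
    }
    return $titles->item(0)->textContent;
}

// Made-up sample feed, for illustration only.
$domSample = '<rss version="2.0"><channel><title>Example Feed</title>'
           . '<item><title>First post</title></item></channel></rss>';

echo firstTitleWithDom($domSample), PHP_EOL; // Example Feed
```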

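Now, the actual problem with your regular expression is that the star is greedy by default, so (.*) runs on to the last </title> it can reach instead of stopping at the first one. The lazy version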

preg_match("/\<title\>(.*?)<\/title\>/",$str,$title);

will get you what you are after. But keep in mind that your code will only return the first title element in the document.
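
To see the difference between greedy and lazy matching, here is a small sketch on a made-up one-line string with two title elements (the input is an assumption, not the actual Lifehacker feed). It also shows preg_match_all in case you want every title rather than only the first. Note that without the s modifier the dot does not match newlines, so a title spanning several lines would still be missed.

```php
<?php
// Made-up single-line input with two <title> elements.
$str = '<title>Feed title</title><item><title>Post title</title></item>';

// Greedy: (.*) extends to the last </title> on the line.
preg_match('/<title>(.*)<\/title>/', $str, $greedy);

// Lazy: (.*?) stops at the first </title>.
preg_match('/<title>(.*?)<\/title>/', $str, $lazy);

// Every title, not only the first one.
preg_match_all('/<title>(.*?)<\/title>/', $str, $all);

echo $greedy[1], PHP_EOL; // Feed title</title><item><title>Post title
echo $lazy[1], PHP_EOL;   // Feed title
print_r($all[1]);         // Feed title, Post title
```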

More on regular expressions at this excellent reference site:

http://www.regular-expressions.info/
