Consider XSLT, the sibling to XPath, since you are essentially transforming the original XML rather than parsing out select values. With XSLT you need no foreach loop, and namespaces are handled adequately.
In fact, as shown below, XSLT is the fastest of the aforementioned methods (SimpleXML querying and XPath evaluating) when run on the posted XML wrapped in a <feed ...> root:
SimpleXML (from @IMSoP)
$time_start = microtime(true);
$xml = file_get_contents('YoutubeFeed.xml');
$document = new SimpleXMLElement($xml);
define('NS_ATOM', 'http://www.w3.org/2005/Atom');
define('NS_MEDIA', 'http://search.yahoo.com/mrss/');
foreach ($document->children(NS_ATOM)->entry as $entry) {
echo "<item>".PHP_EOL;
echo "<title>".$entry->title."</title>".PHP_EOL;
echo "<link>".$entry->link->attributes(null)->href."</link>".PHP_EOL;
echo "<image>".$entry->children(NS_MEDIA)->thumbnail->attributes()->url."</image>".PHP_EOL;
echo "<description>".$entry->children(NS_MEDIA)->description."</description>".PHP_EOL;
echo "<guid>".$entry->children(NS_MEDIA)->guid."</guid>".PHP_EOL;
echo "<views>".$entry->children(NS_MEDIA)->statistics->attributes()->views."</views>".PHP_EOL;
echo "<pubDate>".$entry->published."</pubDate>".PHP_EOL;
echo "</item>".PHP_EOL;
}
Timing
echo "SimpleXML: " . (microtime(true) - $time_start) ."\n";
# SimpleXML: 0.0014688968658447
XPATH (from @ThW)
$time_start = microtime(true);
$xml = file_get_contents('YoutubeFeed.xml');
$document = new DOMDocument();
$document->loadXml($xml);
$xpath = new DOMXpath($document);
$xpath->registerNamespace('atom', 'http://www.w3.org/2005/Atom');
$xpath->registerNamespace('media', 'http://search.yahoo.com/mrss/');
foreach ($xpath->evaluate('//atom:entry') as $entry) {
echo "<item>".PHP_EOL;
echo "<title>". $xpath->evaluate('string(atom:title)', $entry)."</title>".PHP_EOL;
echo "<link>". $xpath->evaluate('string(atom:link/@href)', $entry)."</link>".PHP_EOL;
echo "<image>". $xpath->evaluate('string(media:thumbnail/@url)', $entry)."</image>".PHP_EOL;
echo "<description>". $xpath->evaluate('string(media:description)', $entry)."</description>".PHP_EOL;
echo "<guid>". $xpath->evaluate('string(media:guid)', $entry)."</guid>".PHP_EOL;
echo "<views>".$xpath->evaluate('string(media:statistics/@views)', $entry)."</views>".PHP_EOL;
echo "<pubDate>". $xpath->evaluate('string(atom:published)', $entry)."</pubDate>".PHP_EOL;
echo "</item>".PHP_EOL;
}
Timing
echo "XPATH: " . (microtime(true) - $time_start) ."\n";
# XPATH: 0.0012829303741455
XSLT
$time_start = microtime(true);
$xml = file_get_contents('YoutubeFeed.xml');
$document = new DOMDocument();
$document->loadXml($xml);
$xslstr = '<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0"
xmlns:atom="http://www.w3.org/2005/Atom" xmlns:media="http://search.yahoo.com/mrss/"
exclude-result-prefixes="atom media">
<xsl:output version="1.0" encoding="UTF-8" indent="yes" />
<xsl:strip-space elements="*"/>
<xsl:template match="feed">
<xsl:apply-templates select="atom:entry"/>
</xsl:template>
<xsl:template match="atom:entry">
<item>
<title><xsl:value-of select="atom:title"/></title>
<link><xsl:value-of select="atom:link/@href"/></link>
<image><xsl:value-of select="media:thumbnail/@url"/></image>
<description><xsl:value-of select="media:description"/></description>
<guid><xsl:value-of select="media:guid"/></guid>
<views><xsl:value-of select="media:statistics/@views"/></views>
<pubDate><xsl:value-of select="atom:published"/></pubDate>
</item>
</xsl:template>
</xsl:stylesheet>';
$xsl = new DOMDocument;
$xsl->loadXML($xslstr);
// Configure the transformer
$proc = new XSLTProcessor;
$proc->importStyleSheet($xsl);
// Transform XML source
$newXML = $proc->transformToXML($document);
// Echo string output
echo $newXML;
Timing
echo "XSLT: " . (microtime(true) - $time_start) ."\n";
# XSLT: 0.00098896026611328
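A single microtime() difference on a sub-millisecond run can be noisy, so when comparing the three approaches it may help to average over many runs. The following is a minimal sketch, assuming PHP 7.3+ for hrtime(); averageSeconds() is a hypothetical helper and was not used to produce the figures in this answer:

```php
<?php
// Hypothetical helper (an assumption, not part of the original benchmark):
// runs $fn $runs times and returns the average wall-clock seconds per run.
function averageSeconds(callable $fn, int $runs = 100): float
{
    if ($runs < 1) {
        throw new InvalidArgumentException('$runs must be at least 1.');
    }

    $total = 0;
    for ($i = 0; $i < $runs; $i++) {
        ob_start();                      // buffer echoed output so the terminal is not a factor
        $start = hrtime(true);           // monotonic clock, in nanoseconds
        $fn();
        $total += hrtime(true) - $start;
        ob_end_clean();                  // discard the buffered output
    }

    return $total / $runs / 1e9;         // nanoseconds -> seconds
}

// Usage with a stand-in workload; swap in any of the three approaches.
$seconds = averageSeconds(function () {
    echo str_repeat('x', 1000);
});
printf("average: %.9f s\n", $seconds);
```

Wrapping each approach in a closure and passing it to the same helper keeps the file loading, parsing and output costs inside the measured region for all three, as in the timings above.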
Even with more <entry> nodes (copying the tag and its children until the file reaches 500 lines), XSLT scales much better. The timings below are in seconds:
# SimpleXML: 0.62154388427734
# XPATH: 0.68382000923157
# XSLT: 0.011976957321167
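One way to build a larger test file for this kind of comparison is to duplicate the existing entries with DOM. The sketch below is an assumption about how that could be done, not the procedure behind the figures above: growFeed() is a hypothetical helper, it grows the feed to a target number of entries rather than a line count, and it assumes the entries are in the Atom namespace.

```php
<?php
// Hypothetical helper (an assumption, not from the original answer):
// clones the existing Atom <entry> elements until the feed holds $target entries.
function growFeed(string $xml, int $target): string
{
    $doc = new DOMDocument();
    $doc->loadXML($xml);

    $xpath = new DOMXPath($doc);
    $xpath->registerNamespace('atom', 'http://www.w3.org/2005/Atom');

    // Snapshot the original entries first so that clones are never re-cloned.
    $originals = iterator_to_array($xpath->evaluate('//atom:entry'));
    if ($originals === []) {
        throw new InvalidArgumentException('No atom:entry elements found.');
    }

    $count = count($originals);
    for ($i = 0; $count < $target; $i++, $count++) {
        $source = $originals[$i % count($originals)];
        // cloneNode(true) copies the element together with all of its children.
        $source->parentNode->appendChild($source->cloneNode(true));
    }

    return $doc->saveXML();
}

// Usage: write a 500-entry copy next to the original file.
// file_put_contents('YoutubeFeedLarge.xml', growFeed(file_get_contents('YoutubeFeed.xml'), 500));
```

The output file name in the usage comment is illustrative only; any of the three scripts can then be pointed at the enlarged file.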