0

I am trying to retrieve some meta data included into a SimpleXMLElement. I am using XPATH and I struggle to get the value that interests me.

Here is an extract of the webpage header (from : http://www.wayfair.de/CleverFurn-Couchtisch-Abby-69318X2-MFE2223.html)

Do you know how I could retrieve all xmlns data in an array containing :

1) og:type 2) og:url 3) og:image .... x) og:upc


<meta xmlns:og="http://opengraphprotocol.org/schema/" property="og:title" content="CleverFurn Couchtisch &quot;Abby&quot;" />


And here's my php code

<?php
$html = file_get_contents("http://www.wayfair.de/CleverFurn-Couchtisch-Abby-69318X2-MFE2223.html");
$doc = new DOMDocument();
$doc->strictErrorChecking = false;
$doc->recover=true;
@$doc->loadHTML("<html><body>".$html."</body></html>");

$xpath = new DOMXpath($doc);
$elements = $xpath->query("//*/meta[@property='og:url']");

if (!is_null($elements)) {
foreach ($elements as $element) {
echo "<br/>[". $element->nodeName. "]";
var_dump($element);
  $nodes = $element->childNodes;
  foreach ($nodes as $node) {
     echo $node->nodeValue. "\n";
     }
   }
 }
?>
justberare
  • 1,003
  • 1
  • 9
  • 29

1 Answers1

1

Just found the answer :

How to get Open Graph Protocol of a webpage by php?

<?php
$html = file_get_contents("http://www.wayfair.de/CleverFurn-Couchtisch-Abby-69318X2-MFE2223.html");
libxml_use_internal_errors(true); // Yeah if you are so worried about using @ with warnings
$doc = new DomDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);
$query = '//*/meta[starts-with(@property, \'og:\')]';
$metas = $xpath->query($query);
foreach ($metas as $meta) {
    $property = $meta->getAttribute('property');
    $content = $meta->getAttribute('content');
    $rmetas[$property] = $content;
}
var_dump($rmetas);
?>
Community
  • 1
  • 1
justberare
  • 1,003
  • 1
  • 9
  • 29