How can I use PHP to retrieve a particular element and attribute from a remote HTML page?

For instance, suppose the element and attribute to retrieve look like this:

<a href="/dir/someid/" class="ccc">

Any help would be greatly appreciated.

This is the approach I am currently using, which only extracts the page title:


<?php
   // Reading a URL with fopen() requires allow_url_fopen to be enabled.
   $file = fopen("http://www.example.com/", "r");
   if (!$file) {
       echo "<p>Unable to open remote file.</p>\n";
       exit;
   }
   $title = null;
   while (!feof($file)) {
       $line = fgets($file, 1024);
       /* This only works if the title and its tags are on one line */
       if (preg_match("@<title>(.*)</title>@i", $line, $out)) {
           $title = $out[1];
           break;
       }
   }
   fclose($file);
   ?>


1 Answer

Solution:

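        // Fetch the remote page (requires allow_url_fopen).
        $homepage = file_get_contents("https://www.somedomain.com");
        $doc = new DOMDocument;
        $doc->preserveWhiteSpace = false;
        // Suppress the warnings that malformed real-world HTML triggers.
        @$doc->loadHTML($homepage);
        $xpath = new DOMXPath($doc);
        // Select every <div class="some-class"> in the document.
        $results = $xpath->query("//div[@class='some-class']");

        foreach ($results as $contextNode) {
            // Text and href of the first <a> child of each matched div.
            $text = $xpath->evaluate("string(./a[1])", $contextNode);
            $href = $xpath->evaluate("string(./a[1]/@href)", $contextNode);
        }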
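
The snippet above targets a div, while the question asks about an anchor of the form <a href="/dir/someid/" class="ccc">. A minimal sketch of the same DOMDocument/DOMXPath technique applied directly to that anchor might look like the following. The function name extractLinksByClass and the inline sample markup are hypothetical and stand in for the remote page so the example runs without network access. It also assumes the class name contains no quote characters, since it is interpolated into the XPath expression.

```php
<?php
// Hypothetical sample markup standing in for the fetched remote page.
$html = '<html><body>'
      . '<a href="/dir/someid/" class="ccc">Some link</a>'
      . '<a href="/other/" class="zzz">Other link</a>'
      . '</body></html>';

/**
 * Return the href and text of every <a> carrying the given class.
 * Illustrative helper, not part of any library.
 */
function extractLinksByClass(string $html, string $class): array
{
    $doc = new DOMDocument();

    // Collect libxml parse warnings internally instead of using "@".
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);

    // Match the class as a whole word, so class="ccc other" also matches
    // but class="cccx" does not.
    $query = sprintf(
        "//a[contains(concat(' ', normalize-space(@class), ' '), ' %s ')]",
        $class
    );

    $links = [];
    foreach ($xpath->query($query) as $anchor) {
        $links[] = [
            'href' => $anchor->getAttribute('href'),
            'text' => trim($anchor->textContent),
        ];
    }
    return $links;
}

$links = extractLinksByClass($html, 'ccc');
print_r($links);
```

To run this against a live page, $html could be replaced with the result of file_get_contents() on the target URL, after checking that the call did not return false.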