I'm using the example from the post "How to get content from another page", but I need to get just "SUPERMAN" from a website whose markup looks like this:

<td headers="superHero">SUPERMAN</td>
<td headers="country">USA</td>

The code:

$url = "http://www.otherweb.com";
$curl = curl_init($url);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, TRUE);
$output = curl_exec($curl);
curl_close($curl);


$DOM = new DOMDocument;
$DOM->loadHTML( $output);

//get all td
//$items = $DOM->getElementsByTagName('td'); 
$items = $DOM->getElementsByID('superHero');

//display all text
 for ($i = 0; $i < $items->length; $i++)
 echo $items->item($i)->nodeValue . "<br/>";

Thanks!!!

  • `getElementsByID` is wrong. Uncomment the previous line. – Rahil Wazir Jun 23 '14 at 14:50
  • Start from [here](http://www.php.net/manual/en/class.domdocument.php) – hindmost Jun 23 '14 at 14:52
  • In addition to the comment above, `getElementById()` matches DOM elements' `id` values, not these `headers` attributes. – esqew Jun 23 '14 at 14:53
  • A DOM element's ID has to be **UNIQUE** in the entire document. `getElement`**`s`**`ById()` is therefore redundant: there can never be more than one element with a particular ID, so there is no 's' version of the function. It is just `getElementById()`, singular. – Marc B Jun 23 '14 at 14:54

1 Answer

First, you can skip the curl part. DOMDocument has the method loadHTMLFile(), which can load even remote HTML files (this relies on PHP's URL stream wrappers, so allow_url_fopen must be enabled). Just use:

$DOM = new DOMDocument();
$DOM->loadHTMLFile($url);
// If the remote page might not be valid HTML, the parser emits warnings.
// In that case, use this call instead; the "silence operator" (@) suppresses them:
@$DOM->loadHTMLFile($url);
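
As an alternative to the silence operator, parser warnings can be collected rather than discarded. The following is only a sketch: it uses an inline string (the hypothetical `$html` below, standing in for the fetched page) with loadHTML() so that it runs without network access, and it assumes you want the messages sent to the error log.

```php
<?php
// Hypothetical markup standing in for the remote page (assumption, not the real site).
$html = '<table><tr><td headers="superHero">SUPERMAN</td>'
      . '<td headers="country">USA</td></tr></table>';

// Ask libxml to store parse errors internally instead of raising PHP warnings.
$previous = libxml_use_internal_errors(true);

$DOM = new DOMDocument();
$loaded = $DOM->loadHTML($html);

// Fetch whatever the parser complained about, then reset libxml's state.
$errors = libxml_get_errors();
libxml_clear_errors();
libxml_use_internal_errors($previous);

foreach ($errors as $error) {
    // Each entry is a LibXMLError with message, line and level properties.
    error_log(sprintf('line %d: %s', $error->line, trim($error->message)));
}
```

Unlike `@`, this keeps the diagnostics available, which may help when a selector unexpectedly matches nothing.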

If you want to select an element by its attribute value, you can use XPath:

$selector = new DOMXPath($DOM);
$element = $selector->query('//td[@headers="superHero"]')->item(0);
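
Put together, the approach might look like the sketch below. It is self-contained for illustration: the `$html` string is an assumed stand-in built from the markup in the question, loaded with loadHTML() instead of fetched from the remote URL, and the `$name` variable is introduced here only to hold the result.

```php
<?php
// Assumed sample markup, mirroring the cells shown in the question.
$html = <<<HTML
<table>
  <tr>
    <td headers="superHero">SUPERMAN</td>
    <td headers="country">USA</td>
  </tr>
</table>
HTML;

$DOM = new DOMDocument();
$DOM->loadHTML($html);

// Select every <td> whose headers attribute equals "superHero".
$selector = new DOMXPath($DOM);
$nodes = $selector->query('//td[@headers="superHero"]');

// item(0) returns null when nothing matches, so check before reading the text.
$element = $nodes->item(0);
$name = $element !== null ? trim($element->nodeValue) : null;

echo $name, PHP_EOL; // prints SUPERMAN
```

The null check matters on real pages, where the cell may be missing or the attribute spelled differently.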
hek2mgl