$table = $PageXpath->query('//table[10]'); 
$data['tr'] = $table->item(0)->nodeValue;
print_r($data);

The code above works: it returns the text of everything in that table.

However, it does not give me the tr and td HTML markup. I need all of the TR elements with their markup intact, so that I can dissect them and assign each piece to a variable.

I need to target that specific table and then, for each TR, do X with the first <td>, Y with the second <td>, and Z with the third <td>.

Jens Erat
JML1179

1 Answer


The nodeValue of a node does not contain any tags - it is the result of the concatenation of all DOMText nodes within it.
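To see the difference concretely, here is a minimal sketch that compares nodeValue with a serialized node. The markup is made up for illustration, and it uses DOMDocument::saveHTML() with a node argument, which assumes PHP 5.3.6 or later.

```php
<?php
// Hypothetical markup, used only to illustrate the point.
$doc = new DOMDocument();
$doc->loadHTML('<table><tr><td>a</td><td><b>b</b></td></tr></table>');

$tr = $doc->getElementsByTagName('tr')->item(0);

// nodeValue: the text nodes inside the row, joined together, with no tags.
$text = $tr->nodeValue;

// saveHTML($node): the row serialized as markup, tags included.
$markup = $doc->saveHTML($tr);

echo $text, "\n";
echo $markup, "\n";
```

The first echo prints only the text; the second prints the row with its tr, td, and b tags.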

I think you're looking for something like this:

// $Dom is the original DOMDocument object used when creating $PageXpath

$table = $PageXpath->query('//table[10]')->item(0);
$xml = $Dom->saveXML($table);
print_r($xml);

Be aware that most HTML isn't valid XML - using DOMDocument and XPath to parse HTML is very unreliable.
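If you stay with DOMDocument, the per-row processing described in the question could be sketched roughly as follows. This is only an illustration under a few assumptions: the sample HTML, the array keys first, second, and third, and the use of //table[1] in place of //table[10] are all made up for the example, and saveHTML() with a node argument needs PHP 5.3.6 or later. Note also that //table[10] selects a table that is the tenth table child of its parent; if you mean the tenth table in the whole document, the XPath would be (//table)[10].

```php
<?php
// Hypothetical stand-in for the scraped page.
$html = '<table>
  <tr><td>Alice</td><td>30</td><td>London</td></tr>
  <tr><td>Bob</td><td>25</td><td>Paris</td></tr>
</table>';

libxml_use_internal_errors(true); // collect parser warnings instead of printing them
$Dom = new DOMDocument();
$Dom->loadHTML($html);
libxml_clear_errors();

$PageXpath = new DOMXPath($Dom);
$table = $PageXpath->query('//table[1]')->item(0);

$rows = array();
// The leading dot makes the query relative to $table, so only its rows match.
foreach ($PageXpath->query('.//tr', $table) as $tr) {
    $cells = $PageXpath->query('./td', $tr);
    if ($cells->length < 3) {
        continue; // skip header rows or rows with fewer than three cells
    }
    $rows[] = array(
        'first'  => trim($cells->item(0)->nodeValue), // do X here
        'second' => trim($cells->item(1)->nodeValue), // do Y here
        'third'  => trim($cells->item(2)->nodeValue), // do Z here
        'markup' => $Dom->saveHTML($tr),              // the row with its tags
    );
}

print_r($rows);
```

Passing the context node as the second argument to query() is what keeps each lookup scoped to the current table or row, so you never need to parse the serialized markup a second time.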

George Brighton
  • What are my other options? I am building a scraper to scrape a site. I thought it was about the only option I had. – JML1179 Sep 15 '13 at 16:07
  • Have a look at this: http://stackoverflow.com/questions/3577641/how-do-you-parse-and-process-html-xml-in-php – George Brighton Sep 15 '13 at 16:08
  • George, I am new to Stack Overflow. I tried to help people in several categories, but I can't post more than two links and can only post once every three minutes. How do I build up reputation? – JML1179 Sep 15 '13 at 16:12
  • As you contribute more to the SO community, you'll be rewarded with more privileges. Did that link help? – George Brighton Sep 15 '13 at 16:27