
I'm trying to parse some html that is inside an external .dat file.

I would normally use the following code:

$html = new DOMDocument();
$html->loadHTMLFile('http://www.bvl.com.pe/includes/cotizaciones_todas.dat');
$xpath = new DOMXPath($html);
$path = '/somepath';
$nodelist = $xpath->query($path);
echo $nodelist->item(0)->nodeValue;

But I'm getting this error:

DOMDocument::loadHTMLFile(): htmlParseEntityRef: expecting ';' in http://www.bvl.com.pe/includes/cotizaciones_todas.dat, line: 15

I know the problem is in loadHTMLFile. I tried using load and loadXML instead, but neither works. Any help would be appreciated.


UPDATE

I solved the first problem by telling libxml to collect parse errors internally with libxml_use_internal_errors(TRUE). Now I have a new problem: I want to count how many <tr> tags are inside the table. I'm using the following code:

$html = new DOMDocument();
libxml_use_internal_errors(TRUE);
$html->loadHTMLFile('http://www.bvl.com.pe/includes/cotizaciones_todas.dat');
libxml_clear_errors();
$xpath = new DOMXPath($html);
$tbody = $html->getElementsByTagName('tbody')->item(0);
$path = 'count(tr)';
$trCount = $xpath->evaluate($path,$tbody);

But I'm getting this error message:

PHP Catchable fatal error: Argument 2 passed to DOMXPath::evaluate() must be an instance of DOMNode, null given

I have used the same code with other files and everything worked fine, but in this case it's not working. Is it because the HTML is broken?
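One possible explanation, offered as an assumption rather than something confirmed against the actual file: the "null given" message means getElementsByTagName('tbody')->item(0) returned null, which happens when the markup contains no explicit <tbody>. Browsers insert that element automatically, but libxml's HTML parser generally does not. The sketch below uses a hypothetical inline $sample string in place of the remote .dat file, checks for null, and falls back to counting rows under the first <table>.

```php
<?php
// Hypothetical stand-in for the remote .dat file: a table without <tbody>.
$sample = '<table><tr><td>A</td></tr><tr><td>B</td></tr></table>';

$html = new DOMDocument();
libxml_use_internal_errors(true);   // collect parse warnings instead of emitting them
$html->loadHTML($sample);
libxml_clear_errors();

$xpath = new DOMXPath($html);

// item(0) returns null when no <tbody> element exists in the parsed tree.
$tbody = $html->getElementsByTagName('tbody')->item(0);

if ($tbody !== null) {
    // Direct <tr> children of the <tbody>.
    $trCount = (int) $xpath->evaluate('count(tr)', $tbody);
} else {
    // Fallback: every <tr> below the first <table>, if there is one.
    $table = $html->getElementsByTagName('table')->item(0);
    $trCount = $table !== null
        ? (int) $xpath->evaluate('count(.//tr)', $table)
        : 0;
}

echo $trCount, PHP_EOL;
```

Note that count(.//tr) also counts rows of any nested tables, so a narrower expression may be needed if the real file nests tables.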

mat
  • Look at your server's error log. It'll have more details about the 500. – Marc B Dec 09 '12 at 00:23
  • I updated my post with the log file. – mat Dec 09 '12 at 00:43
  • The HTML output is not valid (927 errors, 1167 warnings according to http://validator.w3.org/check). Check the top-voted answer: http://stackoverflow.com/questions/3893375/how-can-i-scrape-a-website-with-invalid-html – AaronSantos Dec 09 '12 at 00:59
  • @AaronSantos you were right, the HTML was broken. However, I have a new problem now (see update). – mat Dec 09 '12 at 01:36
