
I'm trying to read this XML RSS feed (http://noticias.perfil.com/feed/), but without success.

I get these warnings:

    Warning: simplexml_load_file(): http://noticias.perfil.com/feed/:232: parser error : CData section not finished <p>La sola lectura de los datos estadísticos desp in D:\xampp\FerreWoo\scrap-rvnot.php on line 43

    Warning: simplexml_load_file(): Isis, con lo que habría logrado un nuevo respaldo a sus proyectos terroristas. in D:\xampp\FerreWoo\scrap-rvnot.php on line 43

    Warning: simplexml_load_file(): ^ in D:\xampp\FerreWoo\scrap-rvnot.php on line 43

I'm using this code:

    $feed = simplexml_load_file($urls, null, LIBXML_NOCDATA);

I tried fetching the feed with cURL too, but the same errors keep coming up.

I know the XML file itself may be malformed, but there must be a way to read it, right?

Targaryen
  • Very probably strongly related: https://stackoverflow.com/questions/2798108/cdata-section-not-finished-problem . When I retrieve the file, there's a backspace character right after `a sus proyectos terroristas.` – fvu May 29 '17 at 21:39
  • Yes, I see that. I still can't fix it with the link you provided. – Targaryen May 29 '17 at 22:21

1 Answer


That feed contains characters that are not valid in XML (such as the backspace mentioned in the comments). Strip them before parsing:

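    $url  = 'http://noticias.perfil.com/feed/';
    $html = file_get_contents($url);
    if ($html === false) {
        die('Could not download the feed');
    }

    // Remove every character outside the ranges XML 1.0 allows.
    // The /u modifier and \x{...} syntax are needed for code points above 0xFF;
    // preg_replace() returns null if $html is not valid UTF-8.
    $invalid_characters = '/[^\x{9}\x{A}\x{D}\x{20}-\x{D7FF}\x{E000}-\x{FFFD}\x{10000}-\x{10FFFF}]/u';
    $html = preg_replace($invalid_characters, '', $html);

    $xml = simplexml_load_string($html);

    // For testing only: dump the parsed feed as an array
    $encode = json_encode($xml);
    $decode = json_decode($encode, true);
    print_r($decode);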
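
If the parse still fails after cleaning, it may help to see exactly what libxml is complaining about instead of getting PHP warnings. Below is a minimal sketch of that idea, not part of the original approach: the helper name `load_dirty_xml` and the sample string are made up for illustration, and it assumes the feed is UTF-8 encoded. It uses an inline sample with a backspace inside a CDATA section so it runs without network access.

```php
<?php
// Hypothetical helper: strips characters XML 1.0 does not allow, then parses
// the result while collecting libxml errors instead of emitting warnings.
function load_dirty_xml(string $raw): array
{
    // Keep only tab, LF, CR and the code point ranges XML 1.0 permits.
    $invalid = '/[^\x{9}\x{A}\x{D}\x{20}-\x{D7FF}\x{E000}-\x{FFFD}\x{10000}-\x{10FFFF}]/u';
    $clean = preg_replace($invalid, '', $raw);
    if ($clean === null) {
        // With the /u modifier, preg_replace() returns null on invalid UTF-8.
        return [false, ['input is not valid UTF-8']];
    }

    // Switch libxml to internal error collection and remember the old setting.
    $previous = libxml_use_internal_errors(true);
    $xml = simplexml_load_string($clean, 'SimpleXMLElement', LIBXML_NOCDATA);

    $messages = [];
    foreach (libxml_get_errors() as $error) {
        $messages[] = sprintf('line %d: %s', $error->line, trim($error->message));
    }
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return [$xml, $messages];
}

// Made-up sample: a backspace (0x08) inside CDATA, similar to the broken feed.
$sample = "<rss><channel><item><description><![CDATA[terroristas.\x08 More text]]>"
        . "</description></item></channel></rss>";

[$xml, $errors] = load_dirty_xml($sample);
if ($xml === false) {
    echo implode("\n", $errors), "\n";
} else {
    echo (string) $xml->channel->item->description, "\n";
}
```

For the real feed, you would pass the result of `file_get_contents($url)` to the helper instead of `$sample`. If the list of messages is not empty, the line numbers in it should point at whatever is still wrong with the document.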

Hope it helps

Prashant Kanse
rheeantz