I want to get a <ul>'s innerHTML from another domain's HTML with PHP.

$mhraWebUygulamasi = file_get_contents('http://www.mhra.gov.uk/Safetyinformation/Safetywarningsalertsandrecalls/index.htm');
$doc = new DOMDocument();
$doc->loadHTML($mhraWebUygulamasi);
$doc->preserveWhiteSpace = false;

But before coding further, I got this warning message:

Warning: DOMDocument::loadHTML(): Unexpected end tag : fragmentinstance in Entity, line: 123 in C:\xampp\htdocs\YeBeSis\mhra.php on line 4

Line 4 is $doc->loadHTML($mhraWebUygulamasi); the other line number (123) probably refers to the target URL's HTML code. How can I handle the target URL's HTML gracefully and load it into a DOM container? Where did I go wrong?

caglaror
  • Does this help? [http://stackoverflow.com/questions/6090667/php-domdocument-errors-warnings-on-html5-tags](http://stackoverflow.com/questions/6090667/php-domdocument-errors-warnings-on-html5-tags) – gmartellino Mar 15 '13 at 23:07
  • Thank you. This link was very explanatory. However, I need to handle the target URL's HTML and also suppress (disable) the warning message. I will give @Sheikh Heera's solution a try. – caglaror Mar 16 '13 at 18:29

2 Answers

Using PHP Simple HTML DOM Parser, you can do this easily. Just download the simple_html_dom.php file and use it as follows:

include('simple_html_dom.php');
$html = file_get_html('http://www.mhra.gov.uk/Safetyinformation/Safetywarningsalertsandrecalls/index.htm');

Then, for example, to get all ul tags and their content, you can use the following loop:

foreach($html->find('ul') as $ul){
    // innertext holds the HTML between the element's opening and closing tags
    echo $ul->innertext.'<br />';
}

Or use this to get only the ul with the class name subnav2:

foreach($html->find('ul.subnav2') as $ul){
    echo $ul->innertext.'<br />';
}

Output of the above code (5 li tags):

  • Medical Device Alerts
  • Field Safety Notices (FSNs)
  • Drug Alerts
  • Safety warnings and messages for medicines
  • UK Public Assessment Reports on drug safety

It's very easy to use and the selector syntax is just like jQuery's; read the documentation to learn more.
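
For comparison, a similar selection can probably be done with PHP's built-in DOM extension and no extra library. The following is only a minimal sketch: the `$html` string is hypothetical sample markup standing in for the fetched page, and the XPath class-matching expression is one common idiom, not something taken from the library above.

```php
<?php
// Hypothetical sample markup; in practice this would be the fetched page.
$html = '<html><body><ul class="subnav2">'
      . '<li>Medical Device Alerts</li><li>Drug Alerts</li>'
      . '</ul></body></html>';

// Keep libxml parser complaints out of the output for this sketch.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
// Match <ul> elements whose class attribute contains the token "subnav2".
$query = '//ul[contains(concat(" ", normalize-space(@class), " "), " subnav2 ")]';

$innerHtml = '';
foreach ($xpath->query($query) as $ul) {
    // DOMElement exposes no innerHTML property, so serialize each child node.
    foreach ($ul->childNodes as $child) {
        $innerHtml .= $doc->saveHTML($child);
    }
}

echo $innerHtml;
```

Serializing the child nodes one by one is what stands in for innerHTML here; calling saveHTML on the ul itself would include the enclosing ul tags as well.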

The Alpha

The message you're getting is just a warning, not an error; the DOM is still being populated.

However, it's warning you that the incoming HTML is incorrect, and thus it cannot guarantee that the DOM it generates will be entirely as intended by the author.

But in many cases, it really doesn't matter, so if you're okay with that, feel free to ignore the warning and carry on regardless.

In that case, all you'll need to do is suppress the warnings so that they aren't shown.

This is discussed in more detail here: Disable warnings when loading non-well-formed HTML by DomDocument (PHP)
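
One way to suppress the warnings, sketched here under assumptions, is libxml's internal error handling. The `$html` string below is hypothetical malformed markup standing in for the downloaded page, and collecting the messages into `$messages` is just an illustration of what could be done with them.

```php
<?php
// Hypothetical malformed markup standing in for the fetched page;
// the stray </fragmentinstance> mimics the tag named in the warning.
$html = '<html><body><ul class="subnav2"><li>Drug Alerts</li></ul>'
      . '</fragmentinstance></body></html>';

// Collect libxml errors internally instead of emitting PHP warnings.
$previous = libxml_use_internal_errors(true);

$doc = new DOMDocument();
$loaded = $doc->loadHTML($html);

// Read whatever the parser complained about, then clear the buffer
// and restore the previous error-handling setting.
$errors = libxml_get_errors();
libxml_clear_errors();
libxml_use_internal_errors($previous);

$messages = [];
foreach ($errors as $error) {
    // Each entry is a LibXMLError with message, line, and level properties.
    $messages[] = trim($error->message) . ' (line ' . $error->line . ')';
}

// The DOM is still usable even if the parser reported problems.
$items = $doc->getElementsByTagName('li');
```

Prefixing the call with `@` would also hide the warnings, but this approach keeps the parser messages available in case they need to be logged or inspected.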

Hope that helps.

Spudley