
While including the simple HTML DOM library, I get the warnings:

Warning: file_get_contents() [function.file-get-contents]: php_network_getaddresses: getaddrinfo failed: No such host is known. in C:\xampp\htdocs\simple_html_dom.php on line 70

Warning: file_get_contents(http://www.google.com/) [function.file-get-contents]: failed to open stream: php_network_getaddresses: getaddrinfo failed: No such host is known. in C:\xampp\htdocs\simple_html_dom.php on line 70

Line 70 of simple_html_dom.php (downloaded from http://sourceforge.net/projects/simplehtmldom/files/latest/download) is

  $contents = file_get_contents($url, $use_include_path, $context, $offset);

I also get one fatal error:

Fatal error: Call to a member function find() on a non-object in C:\xampp\htdocs\domdoc2.php on line 15

where line 15 of my code (below) is

foreach($html->find('img') as $element) 

The web page I am fetching in my code is google.com. The code follows:

    <?php

    include('simple_html_dom.php');
    $html = new simple_html_dom();
    $html = file_get_html('http://www.google.com/');
    // Find all images
    foreach($html->find('img') as $element)
        echo $element->src . '<br>';

    // Find all links
    foreach($html->find('a') as $element)
        echo $element->href . '<br>';
    ?>

What am I doing wrong?

saur

2 Answers


This is because your host was unable to resolve the DNS name. You see it here because simplehtmldom fetches the page with file_get_contents instead of cURL. PHP Simple HTML DOM Parser is a great HTML parsing class, but it is slow because it relies on file_get_contents (whose remote URL access is disabled on many hosting configurations) instead of cURL, which in my experience is 4-5 times faster, has many more options, and is available on almost every server.

In the modified version linked below, only the file_get_contents call is replaced, so you can safely overwrite your previous copy and everything will work as before, only faster.

Link to source code: http://webarto.com/static/download/simple_html_dom.rar

The output of your script should be:

/intl/en_ALL/images/srpr/logo1w.png
http://www.google.com/webhp?hl=en&tab=ww
http://www.google.com/imghp?hl=en&tab=wi
http://maps.google.com/maps?hl=en&tab=wl
https://play.google.com/?hl=en&tab=w8
http://www.youtube.com/?tab=w1
http://news.google.com/nwshp?hl=en&tab=wn
https://mail.google.com/mail/?tab=wm
https://docs.google.com/?tab=wo
http://www.google.com/intl/en/options/
https://www.google.com/calendar?tab=wc
http://translate.google.com/?hl=en&tab=wT
http://www.google.com/mobile/?tab=wD
http://books.google.com/bkshp?hl=en&tab=wp
https://www.google.com/offers/home?utm_source=xsell&utm_medium=products&utm_campaign=sandbar&tab=wG#!details
https://wallet.google.com/manage/?tab=wa
http://www.google.com/shopping?hl=en&tab=wf
http://www.blogger.com/?tab=wj
http://www.google.com/reader/?hl=en&tab=wy
http://www.google.com/finance?tab=we
http://picasaweb.google.com/home?hl=en&tab=wq
http://video.google.com/?hl=en&tab=wv
http://www.google.com/intl/en/options/
https://accounts.google.com/ServiceLogin?hl=en&continue=http://www.google.com/
http://www.google.com/preferences?hl=en
/preferences?hl=en
/url?sa=p&pref=ig&pval=3&q=http://www.google.com/ig%3Fhl%3Den%26source%3Diglk&usg=AFQjCNFA18XPfgb7dKnXfKz7x7g1GDH1tg
http://www.google.com/history/optout?hl=en
/advanced_search?hl=en
/language_tools?hl=en
/intl/en/ads/
/services/
https://plus.google.com/116899029375914044550
/intl/en/about.html
/intl/en/policies/
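
As a rough illustration of the cURL approach described above, here is a minimal sketch. The function name `fetch_html_with_curl` and its timeout parameter are hypothetical and are not part of simple_html_dom; it also does not reproduce the modified library linked above. It only shows one way to fetch the markup yourself. Note that cURL cannot work around a host name that does not resolve; in that case this helper simply returns null.

```php
<?php
// Hypothetical helper (not part of simple_html_dom): fetch a page with cURL.
// Returns the response body, or null if cURL is unavailable or the request fails.
function fetch_html_with_curl(string $url, int $timeoutSeconds = 10): ?string
{
    if (!extension_loaded('curl')) {
        return null; // the cURL extension is not enabled in php.ini
    }

    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,            // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,            // follow HTTP redirects
        CURLOPT_CONNECTTIMEOUT => $timeoutSeconds, // give up if no connection is made in time
        CURLOPT_TIMEOUT        => $timeoutSeconds, // overall limit for the whole request
    ]);

    // curl_exec() returns false on failure (for example, a DNS error).
    $body = curl_exec($ch);

    return $body === false ? null : $body;
}
```

If the call returns a string, it can be handed to `str_get_html()`, which simple_html_dom provides for parsing markup that is already in memory. Checking for null first avoids calling `find()` on a non-object.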

However, if you are completely new to HTML parsing in PHP, please consider reading: How do you parse and process HTML/XML in PHP?

Eswar Rajesh Pinapala
  • Thanks for that. It may sound amateurish, but could you please tell me what the output should be? I am getting a blank screen. – saur Jun 22 '12 at 07:33
  • It should print all the links (a->href) and images (img->src), as your script requests. I tried the exact same script and updated the answer with the output. – Eswar Rajesh Pinapala Jun 22 '12 at 07:38
  • You might be getting a blank screen because cURL is not enabled on your machine. Enable cURL. If you don't know how, let me know your operating system. – Eswar Rajesh Pinapala Jun 22 '12 at 07:41
  • Open your php.ini file, search for curlxx.dll, remove the ; in front of the file name, save php.ini, and restart Apache. Are you using XAMPP or WAMP? – Eswar Rajesh Pinapala Jun 22 '12 at 07:50
  • Follow this: http://www.tildemark.com/programming/php/enable-curl-with-xampp-on-windows-xp.html then restart Apache and rerun the script. – Eswar Rajesh Pinapala Jun 22 '12 at 08:37
  • @EswarRajeshPinapala I was trying to get this script running: http://stackoverflow.com/questions/10035954/php-get-all-the-images-from-url-which-width-and-height-200-more-quicker but hit the same error as in the post on this page. I tried your solution, but it still fails. Could you offer some insight? – aVC Oct 28 '12 at 04:13

That's not in any way related to simple_html_dom. Your server has no internet access and it fails to resolve google.com. Check the DNS settings and maybe the firewall.
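
One quick way to confirm this from PHP itself is to check whether the host name resolves at all. The sketch below is only an illustration; the helper name `can_resolve_host` is made up for this example. It relies on the documented behaviour of `gethostbyname()`, which returns the host name unchanged when the lookup fails.

```php
<?php
// Hypothetical helper: report whether PHP can resolve a host name via DNS.
// Note: a literal IP address is returned unchanged by gethostbyname(),
// so this check is only meaningful for host names.
function can_resolve_host(string $host): bool
{
    $ip = gethostbyname($host);

    // On failure gethostbyname() hands back the unmodified host name.
    return $ip !== $host;
}

$host = 'www.google.com';
echo can_resolve_host($host)
    ? "$host resolves\n"
    : "$host does not resolve; check DNS and firewall settings\n";
```

If this reports that the name does not resolve, the problem lies in the machine's network configuration, and neither file_get_contents nor any other HTTP client will be able to fetch the page.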

Tom van der Woerdt