2

Using PHP, I'm trying to crawl a web page and then grab an image from it automatically.

I've tried the following:

<?php
$url = "http://www.domain.co.uk/news/local-news";

$str = file_get_contents($url);
?>

and

<?php
    $opts = array('http'=>array('header' => "User-Agent:Mozilla/5.0 (Windows NT 6.2) AppleWebKit/537.1 (KHTML, like Gecko) Chrome/21.0.1180.75 Safari/537.1\r\n"));
    $context = stream_context_create($opts);
    $header = file_get_contents('http://www.domain.co.uk/news/local-news',false,$context);
?>

and also

<?php
include('simple_html_dom.php');

$html = file_get_html('http://www.domain.co.uk/news/local-news');

$result = $html->find('section article img', 0)->outertext;
?>

but all of these return an Internal Server Error. I can view the site perfectly in the browser, but when I try to grab the page in PHP it fails.

Is there anything I can try?

pnuts
ngplayground
  • [enable error reporting](http://blog.flowl.info/2013/enable-display-php-errors/) – Daniel W. Apr 23 '14 at 10:18
  • possible duplicate of [PHP file\_get\_contents 500 Internal Server error](http://stackoverflow.com/questions/10524748/php-file-get-contents-500-internal-server-error) – majidarif Apr 23 '14 at 10:22

2 Answers

2

Try the code below. It saves the page content to a local file.

<?php
$ch = curl_init("http://www.domain.co.uk/news/local-news");
$fp = fopen("localfile.html", "w");
curl_setopt($ch, CURLOPT_FILE, $fp);
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_exec($ch);
curl_close($ch);
fclose($fp);
?>

Now you can read localfile.html.
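As a rough sketch of that reading step, the saved file can be parsed from its filesystem path rather than requested again over HTTP. This example uses PHP's built-in DOMDocument instead of simple_html_dom, which is a substitution and not part of the original answer. The helper name `firstImageSrc` is hypothetical, and the path assumes localfile.html sits next to the script.

```php
<?php
// Hypothetical helper: return the src of the first <img> in an HTML string, or null.
function firstImageSrc(string $html): ?string
{
    // loadHTML() rejects empty input, so bail out early.
    if (trim($html) === '') {
        return null;
    }

    $doc = new DOMDocument();
    // Collect parser warnings internally; real-world markup is rarely valid.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $images = $doc->getElementsByTagName('img');
    if ($images->length === 0) {
        return null;
    }

    $src = $images->item(0)->getAttribute('src');
    return $src !== '' ? $src : null;
}

// Assumption: the cURL snippet wrote localfile.html into this script's directory.
// The file is read by local path, not by URL.
$path = __DIR__ . '/localfile.html';
if (is_readable($path)) {
    echo firstImageSrc(file_get_contents($path)) ?? 'no image found', PHP_EOL;
}
```

Reading by local path avoids a second HTTP request to your own server, so a failure at this stage points at the parsing code rather than at the download.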

Rahul Kaushik
  • This creates a file successfully, but when I try to access it by adding the following under your code it overwrites the localfile.html and returns 500 Error `include('simple_html_dom.php'); $html = file_get_html('http://domain.com/build/wp-content/plugins/news-plugin/localfile.html'); $result = $html->find('.lead-story', 0)->outertext; echo $result;` – ngplayground Apr 23 '14 at 10:32
  • Why am I receiving error 500 on the server I'm using? – Fernando Torres Oct 21 '19 at 14:24
1

Sometimes you might get an error opening an HTTP URL with file_get_contents, even though you have set allow_url_fopen = On in php.ini.

For me, the solution was to also set "user_agent" to a non-empty value.
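A minimal sketch of that fix, assuming the remote server rejects requests that carry no User-Agent header. The agent string `MyCrawler/1.0` and the helper name `buildContext` are hypothetical. The `ignore_errors` option is an extra that is not part of the original answer: it makes file_get_contents return the response body even on a 4xx/5xx status, which can help show what the server actually sent back.

```php
<?php
// Hypothetical helper: build a stream context that sends a User-Agent header.
function buildContext(string $userAgent)
{
    return stream_context_create([
        'http' => [
            // Sent as the User-Agent header; when omitted, the user_agent ini setting is used.
            'user_agent'    => $userAgent,
            // Return the body even when the server answers with an error status.
            'ignore_errors' => true,
            // Give up after 10 seconds instead of hanging.
            'timeout'       => 10,
        ],
    ]);
}

// Alternative: set the agent globally for every HTTP stream request in this script.
ini_set('user_agent', 'MyCrawler/1.0');

$context = buildContext('MyCrawler/1.0');

// Usage (performs a network request, so it is left commented out here):
// $body = file_get_contents('http://www.domain.co.uk/news/local-news', false, $context);
```

Setting the value per context keeps the change local to one request, while `ini_set` or a php.ini entry applies it to every HTTP stream call.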

René Höhle
Xatenev