
I'm using file_get_contents() to access a URL.

file_get_contents('http://somenotrealurl.com/notrealpage');

If the URL is not real, it returns the error message below. How can I get it to fail gracefully, so that I know the page doesn't exist and can act accordingly without displaying this error message?

file_get_contents('http://somenotrealurl.com/notrealpage') 
[function.file-get-contents]: 
failed to open stream: HTTP request failed! HTTP/1.0 404 Not Found 
in myphppage.php on line 3

For example, in Zend you can write:

$client = new Zend_Http_Client();
$client->setUri('http://someurl.com/somepage');

$request = $client->request();

if ($request->isSuccessful()) {
 //do stuff with the result
}
Clay
sami
  • try using stream context: http://stackoverflow.com/questions/21800276/is-it-possible-to-get-404-page-content-using-fopen-in-php , file_get_contents uses fopen under the hood. – rsk82 Feb 15 '14 at 19:24

8 Answers


You need to check the HTTP response code:

function get_http_response_code($url) {
    $headers = get_headers($url);
    return substr($headers[0], 9, 3);
}
if(get_http_response_code('http://somenotrealurl.com/notrealpage') != "200"){
    echo "error";
}else{
    file_get_contents('http://somenotrealurl.com/notrealpage');
}
ynh
    This technique is preferable to mine if you need to know why the request failed, ie. checking for status code (404 may need to be handled differently to 503 for instance). If not, it potentially introduces two requests and the ignore is then preferable. – Orbling Dec 05 '10 at 09:24
    While this is a good solution, it doesn't consider other http error codes like 500. So, a simple tweak could be like: `$headers = get_headers($uri);` `if (stripos($headers[0], '40') !== false || stripos($headers[0], '50') !== false) {` `...handle errors...` `}` – YOMorales Nov 30 '12 at 16:05
    I think this code is wrong. You should call `get_headers` only if `file_get_contents` returns `false`. It does not make much sense to call every URL twice. Except you expect that most of your URLs will fail. Its really sad that `$http_response_header` is empty if the status 4xx or 5xx happens. By that we wouldn't need `get_headers` at all. – mgutt Apr 14 '15 at 15:10
    This code is kind of wasteful as it will make the same request twice. You'd be better off checking `$http_response_header` - https://www.php.net/manual/en/reserved.variables.httpresponseheader.php – donatJ Aug 30 '21 at 19:56
  • I couldn't agree more with mgutt - this is a serious flaw in the design of file_get_contents. It is cutting efficiancy in half. I suspect curl is a better way to go, just in terms of efficiancy. – Daniel Bengtsson Jan 14 '22 at 12:38

In PHP, you can prefix such calls with an @ to suppress these warnings.

@file_get_contents('http://somenotrealurl.com/notrealpage');

file_get_contents() returns FALSE on failure, so if you check the returned result against that, you can handle the failure:

$pageDocument = @file_get_contents('http://somenotrealurl.com/notrealpage');

if ($pageDocument === false) {
    // Handle error
}
Orbling
    I don't want to just suppress the errors. I want to know if the url is valid. – sami Dec 05 '10 at 09:22
  • Note that if the server is down the function could block for a while. – Alex Jasmin Dec 05 '10 at 09:22
  • @sami When you say 'valid', do you mean a valid URL, or "works"? – Orbling Dec 05 '10 at 09:25
  • @Alexandre See this post for the asynchronous question: http://stackoverflow.com/questions/962915/how-do-i-make-an-asynchronous-get-request-in-php - Note that it will still block on DNS failures due to the DNS lookup blocking. – Orbling Dec 05 '10 at 09:28
  • Error suppression still gives the error that the original question reported. Weird, I know, but it happens. – YOMorales Nov 30 '12 at 15:32
    PErfect solution for me. Thanks¡ – Jam Jun 09 '18 at 07:47
    You literally saved my day. I wasted my time trying to implement other solutions, until I tried yours. Thanks a million – Vickar Sep 08 '18 at 16:16

Each time you call file_get_contents() with an HTTP wrapper, the variable $http_response_header is created in the local scope.

This variable contains all the HTTP response headers. This approach is better than the get_headers() function, since only one request is executed.

Note: two separate requests can end differently. For example, get_headers() might return 503 while file_get_contents() returns 200, so you would have received proper output but been unable to use it because of the 503 from the get_headers() call.

function getUrl($url) {
    $content = file_get_contents($url);
    // you can add some code to extract/parse response number from first header. 
    // For example from "HTTP/1.1 200 OK" string.
    return array(
            'headers' => $http_response_header,
            'content' => $content
        );
}

// Handle 40x and 50x errors
$response = getUrl("http://example.com/secret-message");
if ($response['content'] === FALSE)
    echo $response['headers'][0];   // HTTP/1.1 401 Unauthorized
else
    echo $response['content'];

This approach also allows you to keep the headers of several requests in separate variables, since $http_response_header is overwritten in the local scope on each file_get_contents() call.
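As a sketch of the parsing mentioned in the code comment above, a small helper (the name parseStatusCode is my own, not from the answer) can pull the status code out of the first header line. A regex is safer than a fixed substr() offset, because an HTTP/2 status line reads "HTTP/2 200" rather than "HTTP/1.1 200 OK":

```php
// Extract the numeric HTTP status code from a status line such as
// "HTTP/1.1 200 OK" or "HTTP/2 404". Returns null when no code is found.
function parseStatusCode(string $statusLine): ?int
{
    if (preg_match('{^HTTP/\S+\s+(\d{3})}', $statusLine, $matches)) {
        return (int) $matches[1];
    }
    return null;
}
```

You would call it as parseStatusCode($http_response_header[0]) after the file_get_contents() call.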

tronman
Grzegorz
    This is perfect, the fact that it saves the additional request gets my +1.. I'm dealing with generating a cache of tens of thousands of URL's.. so to have to double up on requests would just be ridiculous. – jenovachild Mar 24 '15 at 03:13

While file_get_contents is very terse and convenient, I tend to favour the Curl library for better control. Here's an example.

function fetchUrl($uri) {
    $handle = curl_init();

    curl_setopt($handle, CURLOPT_URL, $uri);
    curl_setopt($handle, CURLOPT_POST, false);
    curl_setopt($handle, CURLOPT_BINARYTRANSFER, false);
    curl_setopt($handle, CURLOPT_HEADER, true);
    curl_setopt($handle, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($handle, CURLOPT_CONNECTTIMEOUT, 10);

    $response = curl_exec($handle);
    $hlength  = curl_getinfo($handle, CURLINFO_HEADER_SIZE);
    $httpCode = curl_getinfo($handle, CURLINFO_HTTP_CODE);
    $body     = substr($response, $hlength);

    // If HTTP response is not 200, throw exception
    if ($httpCode != 200) {
        throw new Exception($httpCode);
    }

    return $body;
}

$url = 'http://some.host.com/path/to/doc';

try {
    $response = fetchUrl($url);
} catch (Exception $e) {
    error_log('Fetch URL failed: ' . $e->getMessage() . ' for ' . $url);
}
nikc.org
  • Aye, the curl library is a lot better - I never fetch URLs with `file_get_contents()` personally, I do not like using stream wrappers like that, feels a bit flaky. – Orbling Dec 05 '10 at 11:55

You may add 'ignore_errors' => true to the context options:

$options = [
    'http' => [
        'ignore_errors' => true,
        'header' => "Content-Type: application/json\r\n",
    ],
];
$context = stream_context_create($options);
$result = file_get_contents('http://example.com', false, $context);

In that case you will still be able to read the response body from the server, even when it answers with an error status.
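A minimal sketch of putting that together (the URL is a placeholder; the substr() offset assumes an HTTP/1.x status line, as in the other answers):

```php
$context = stream_context_create([
    'http' => ['ignore_errors' => true],
]);

// With ignore_errors the body is returned even for 4xx/5xx responses,
// and $http_response_header still holds the response headers.
$body = file_get_contents('http://example.com/might-not-exist', false, $context);

// $http_response_header[0] is a status line such as "HTTP/1.1 404 Not Found"
$statusCode = (int) substr($http_response_header[0], 9, 3);

if ($statusCode >= 400) {
    // Handle the error; the server's error page is still available in $body
}
```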

William Desportes
alniks

Simple and functional (easy to use anywhere):

function file_contents_exist($url, $response_code = 200)
{
    $headers = get_headers($url);

    if (substr($headers[0], 9, 3) == $response_code)
    {
        return TRUE;
    }
    else
    {
        return FALSE;
    }
}

Example:

$file_path = 'http://www.google.com';

if(file_contents_exist($file_path))
{
    $file = file_get_contents($file_path);
}
tfont

To avoid the double request pointed out by Orbling in the comments on ynh's answer, you can combine their answers: if you get a valid response in the first place, use it; if not, find out what the problem was (if needed).

$urlToGet = 'http://somenotrealurl.com/notrealpage';
$pageDocument = @file_get_contents($urlToGet);
if ($pageDocument === false) {
     $headers = get_headers($urlToGet);
     $responseCode = substr($headers[0], 9, 3);
     // Handle errors based on response code
     if ($responseCode == '404') {
         //do something, page is missing
     }
     // Etc.
} else {
     // Use $pageDocument, echo or whatever you are doing
}
Kuijkens
$url = 'https://www.yourdomain.com';

Normal

function checkOnline($url) {
    $headers = get_headers($url);
    $code = substr($headers[0], 9, 3);
    if ($code == 200) {
        return true;
    }
    return false;
}

if (checkOnline($url)) {
    // URL is online, do something..
    $getURL = file_get_contents($url);     
} else {
    // URL is offline, throw an error..
}

Pro

if (substr(get_headers($url)[0], 9, 3) == 200) {
    // URL is online, do something..
}

Wtf level

echo (substr(get_headers($url)[0], 9, 3) == 200) ? 'Online' : 'Offline';
SixSense