On my site I have a couple of links for downloading a file, and I want to write a PHP script that checks whether each download link is still online. This is the code I'm using:

$cl = curl_init($url);  
curl_setopt($cl,CURLOPT_CONNECTTIMEOUT,10);
curl_setopt($cl,CURLOPT_HEADER,true);
curl_setopt($cl,CURLOPT_NOBODY,true);
curl_setopt($cl,CURLOPT_RETURNTRANSFER,true);

if(!curl_exec($cl)){
    echo 'The download link is offline';
    die();
}

$code = curl_getinfo($cl, CURLINFO_HTTP_CODE);
if($code != 200){
    echo 'The download link is offline';
}else{
    echo 'The download link is online!';
}

The problem is that it downloads the whole file, which makes it really slow, and I only need to check the headers. I saw that cURL has an option CURLOPT_CONNECT_ONLY, but my web host runs PHP 5.4, which doesn't have that option. Is there any other way I can do this?

  • Is this helpful? http://stackoverflow.com/questions/2280394/how-can-i-check-if-a-url-exists-via-php – Dan Jun 14 '14 at 15:45
  • Thank you, the function get_headers() did it :) –  Jun 14 '14 at 15:47

2 Answers

CURLOPT_CONNECT_ONLY would be good, but it’s only available in PHP 5.5 and above. So instead, try using get_headers, or an alternative that uses fopen, stream_context_create and stream_get_meta_data. First, the get_headers method:

// Set a test URL.
$url = "https://www.google.com/";

// Get the headers.
$headers = get_headers($url);

// Check if the headers are empty (get_headers returns false on failure, which empty() also catches).
if(empty($headers)){
  echo 'The download link is offline';
  die();
}

// Use a regex to see if the response code is 200.
preg_match('/\b200\b/', $headers[0], $matches);

// Act on whether the matches are empty or not.
if(empty($matches)){
  echo 'The download link is offline';
}
else{
  echo 'The download link is online!';
}

// Dump the array of headers for debugging.
echo '<pre>';
print_r($headers);
echo '</pre>';

// Dump the array of matches for debugging.
echo '<pre>';
print_r($matches);
echo '</pre>';

And the output of this—including the dumps used for debugging—would be:

The download link is online!

Array
(
    [0] => HTTP/1.0 200 OK
    [1] => Date: Sat, 14 Jun 2014 15:56:28 GMT
    [2] => Expires: -1
    [3] => Cache-Control: private, max-age=0
    [4] => Content-Type: text/html; charset=ISO-8859-1
    [5] => Set-Cookie: PREF=ID=6e3e1a0d528b0941:FF=0:TM=1402761388:LM=1402761388:S=4YKP2U9qC6aMgxpo; expires=Mon, 13-Jun-2016 15:56:28 GMT; path=/; domain=.google.com
    [6] => Set-Cookie: NID=67=Wun72OJYmuA_TQO95WXtbFOK5g-xU53PQZ7dAIBtzCaBWxhXzduHQZfBVPf4LpaK3MVH8ZKbrBIc3-vTKuMlEnMdpWH0mcft5pA_0kCoe4qolDmednpPJqezZF_HyfXD; expires=Sun, 14-Dec-2014 15:56:28 GMT; path=/; domain=.google.com; HttpOnly
    [7] => P3P: CP="This is not a P3P policy! See http://www.google.com/support/accounts/bin/answer.py?hl=en&answer=151657 for more info."
    [8] => Server: gws
    [9] => X-XSS-Protection: 1; mode=block
    [10] => X-Frame-Options: SAMEORIGIN
    [11] => Alternate-Protocol: 443:quic
)

Array
(
    [0] => 200
)
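
One caveat with checking only `$headers[0]`: get_headers follows redirects by default and returns the headers of every response in one array, so for a redirected download link the first entry would be a 3xx status line rather than the final 200. A minimal sketch of reading the last status line instead is below. The function name `final_status_code` and the sample data are hypothetical and only there for illustration.

```php
<?php
// Hypothetical helper (not part of the original answer): return the HTTP
// status code of the last response found in a get_headers() style array,
// or null if the array contains no status line at all.
function final_status_code(array $headers)
{
    $code = null;
    foreach ($headers as $line) {
        // A status line looks like "HTTP/1.1 200 OK".
        if (preg_match('#^HTTP/\d+(?:\.\d+)?\s+(\d{3})\b#', $line, $m)) {
            // Later status lines (after a redirect) overwrite earlier ones.
            $code = (int) $m[1];
        }
    }
    return $code;
}

// Made-up sample data shaped like get_headers() output after one redirect.
$sample = array(
    'HTTP/1.1 301 Moved Permanently',
    'Location: https://example.com/file.zip',
    'HTTP/1.1 200 OK',
    'Content-Type: application/zip',
);

echo final_status_code($sample) === 200
    ? 'The download link is online!'
    : 'The download link is offline';
```

With a real URL you would pass the result of `get_headers($url)` to the helper, after checking that it is not false.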

And here is another method using fopen, stream_context_create and stream_get_meta_data. The benefit of this method is that, in addition to the headers, it gives you a bit more information about how the URL was fetched:

// Set a test URL.
$url = "https://www.google.com/";

// Set the stream_context_create options.
$opts = array(
  'http' => array(
    'method' => 'HEAD'
   )
);

// Create context stream with stream_context_create.
$context  = stream_context_create($opts);

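// Use fopen with rb (read binary) set and the context set above.
// The @ hides the warning fopen raises when the URL cannot be opened.
$handle = @fopen($url, 'rb', false, $context);

// fopen returns false on failure, so bail out before using the handle.
if($handle === false){
  echo 'The download link is offline';
  die();
}

// Get the headers with stream_get_meta_data.
$headers = stream_get_meta_data($handle);

// Close the fopen handle.
fclose($handle);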

// Use a regex to see if the response code is 200.
preg_match('/\b200\b/', $headers['wrapper_data'][0], $matches);

// Act on whether the matches are empty or not.
if(empty($matches)){
  echo 'The download link is offline';
}
else{
  echo 'The download link is online!';
}

// Dump the array of headers for debugging.
echo '<pre>';
print_r($headers);
echo '</pre>';

And here is the output of that:

The download link is online!

Array
(
    [wrapper_data] => Array
        (
            [0] => HTTP/1.0 200 OK
            [1] => Date: Sat, 14 Jun 2014 16:14:58 GMT
            [2] => Expires: -1
            [3] => Cache-Control: private, max-age=0
            [4] => Content-Type: text/html; charset=ISO-8859-1
            [5] => Set-Cookie: PREF=ID=32f21aea66dcfd5c:FF=0:TM=1402762498:LM=1402762498:S=NVP-y-kW9DktZPAG; expires=Mon, 13-Jun-2016 16:14:58 GMT; path=/; domain=.google.com
            [6] => Set-Cookie: NID=67=mO_Ihg4TgCTizpySHRPnxuTp514Hou5STn2UBdjvkzMn4GPZ4e9GHhqyIbwap8XuB8SuhjpaY9ZkVinO4vVOmnk_esKKTDBreIZ1sTCsz2yusNLKA9ht56gRO4uq3B9I; expires=Sun, 14-Dec-2014 16:14:58 GMT; path=/; domain=.google.com; HttpOnly
            [7] => P3P: CP="This is not a P3P policy! See http://www.google.com/support/accounts/bin/answer.py?hl=en&answer=151657 for more info."
            [8] => Server: gws
            [9] => X-XSS-Protection: 1; mode=block
            [10] => X-Frame-Options: SAMEORIGIN
            [11] => Alternate-Protocol: 443:quic
        )

    [wrapper_type] => http
    [stream_type] => tcp_socket/ssl
    [mode] => rb
    [unread_bytes] => 0
    [seekable] => 
    [uri] => https://www.google.com/
    [timed_out] => 
    [blocked] => 1
    [eof] => 
)
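
The two approaches can also be combined. By default get_headers sends a GET request, and on PHP 5.4 it does not accept a context argument, but it does pick up the default stream context. The following is a rough sketch under those assumptions. The function names `head_request_options` and `fetch_headers_with_head` are made up for this example, and the 10 second timeout simply mirrors the cURL code in the question.

```php
<?php
// Hypothetical helper: build HTTP stream context options for a HEAD request.
function head_request_options($timeout = 10)
{
    return array(
        'http' => array(
            'method'  => 'HEAD',    // Ask the server for headers only.
            'timeout' => $timeout,  // Read timeout in seconds.
        ),
    );
}

// Hypothetical wrapper: fetch the headers of $url with a HEAD request.
// Returns the header array, or false on failure.
// Note: stream_context_set_default() changes the default context for the
// rest of the script, so later file/URL operations are affected too.
function fetch_headers_with_head($url)
{
    stream_context_set_default(head_request_options());
    return @get_headers($url);
}
```

Calling `fetch_headers_with_head($url)` then gives the same kind of array as the first example, so the status check can stay the same.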
Giacomo1968
  • @Wies You’re welcome! I also added another approach which uses `fopen`, `stream_context_create` & `stream_get_meta_data` to retrieve data. Might be worth exploring to see if one method is better or faster than the other. – Giacomo1968 Jun 14 '14 at 16:18

Try adding curl_setopt($cl, CURLOPT_CUSTOMREQUEST, 'HEAD'); to send a HEAD request.
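
For reference, here is a sketch of how the question's cURL code might look as a function. As far as I know, CURLOPT_NOBODY already makes cURL issue a HEAD request, so the CURLOPT_CUSTOMREQUEST line is probably redundant and is only kept to match the suggestion above. The function names `is_online_status` and `link_is_online`, the 15 second total timeout, and treating every 2xx code as "online" are assumptions of this example, not something from the question.

```php
<?php
// Hypothetical helper: treat any 2xx status code as "online".
function is_online_status($code)
{
    return $code >= 200 && $code < 300;
}

// Hypothetical wrapper around the cURL setup from the question.
function link_is_online($url)
{
    $cl = curl_init($url);
    curl_setopt($cl, CURLOPT_NOBODY, true);            // Do not download the body.
    curl_setopt($cl, CURLOPT_CUSTOMREQUEST, 'HEAD');   // Explicit HEAD, as suggested above.
    curl_setopt($cl, CURLOPT_CONNECTTIMEOUT, 10);      // Seconds to wait for the connection.
    curl_setopt($cl, CURLOPT_TIMEOUT, 15);             // Seconds allowed for the whole request.
    curl_setopt($cl, CURLOPT_RETURNTRANSFER, true);    // Return the result instead of printing it.

    // With no body and no headers in the output, curl_exec() can return an
    // empty string on success, so compare against false explicitly.
    $ok   = curl_exec($cl) !== false;
    $code = curl_getinfo($cl, CURLINFO_HTTP_CODE);
    curl_close($cl);

    return $ok && is_online_status($code);
}
```

Usage would be something like `echo link_is_online($url) ? 'The download link is online!' : 'The download link is offline';`.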

CnapoB