18

Is there a good PHP script or library for checking whether links are broken? I have links to documents stored in a MySQL table, and I could check whether each link leads to the document or whether I am redirected to another URL. Any ideas? I would prefer to do it in PHP.

Might be related to: Check link works and if not visually identify it as broken

StenW
  • The related topic seems pretty relevant. – Kermit Apr 02 '13 at 17:55
  • check for response headers using curl and post your code with specific problem – Ejaz Apr 02 '13 at 17:56
  • Is this link for some content on your website or content on another website? – Touch Apr 02 '13 at 17:59
  • It is for content on other people's websites. They are just standard medical forms supplied by the different municipalities in Sweden. – StenW Apr 02 '13 at 18:01
  • The linked question is relevant, but that person is asking about dynamic links, if I understand it correctly. My question is about static documents. I am no expert, but there is a difference, right? – StenW Apr 02 '13 at 18:04

5 Answers

29

You can check for a broken link using this function:

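function check_url($url) {

    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_HEADER, 1);          // include response headers in the output
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);  // return the response instead of printing it
    curl_exec($ch);
    $info = curl_getinfo($ch);                    // request details, including the status code
    curl_close($ch);

    // HTTP status code as an integer; 0 if no response was received
    return $info['http_code'];
}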

You need the PHP cURL extension installed for this to work. You can then check for broken links using:

$check_url_status = check_url($url);
if ($check_url_status == 200)
   echo "Link Works";
else
   echo "Broken Link";

Also check the list of HTTP status codes.

I think you can also check for the 301 and 302 status codes, which indicate redirects.
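To make the 200 versus 301/302 distinction concrete, here is a minimal sketch that maps a status code to a coarse link state. The function name `classify_status` and the three labels are assumptions made for illustration, not part of any library; which codes you treat as acceptable is a policy choice.

```php
<?php
// Hypothetical helper: maps an HTTP status code to a coarse link state.
function classify_status(int $code): string
{
    if ($code >= 200 && $code < 300) {
        return 'ok';
    }
    if (in_array($code, [301, 302, 303, 307, 308], true)) {
        return 'redirect';
    }
    // Everything else, including 0 (no response), 4xx and 5xx
    return 'broken';
}

foreach ([200, 301, 404, 0] as $code) {
    echo $code, ' => ', classify_status($code), PHP_EOL;
}
```

The integer returned by the cURL-based check_url above could be passed straight into such a helper.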

Another method is to use the get_headers function, which requires PHP 5 or later:

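function check_url($url) {
   $headers = @get_headers($url);
   if (!is_array($headers)) {
      return false; // request failed, so there are no headers to inspect
   }
   $headers = implode("\n ", $headers);

   // Match the status code on the first status line
   return (bool)preg_match('#^HTTP/\S+\s+(200|301|302)\b#i', $headers);
}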

In this case, just check the return value:

if (check_url($url))
   echo "Link Works";
else
   echo "Broken Link";

Hope this helps you :).

Sabari
  • Is it possible to do this from the database? The list of linked documents will have to be updated as new documents are added and removed. I think it would be a little too hard for our staff to manually update the script every time there is a change. BTW, thank you for your answer; it gives me a place to start. – StenW Apr 02 '13 at 18:38
  • If you want to update the database, write a query that fetches the links, check each one using PHP, and write the results back – Sabari Apr 02 '13 at 18:43
  • Be careful that a 301 is not redirecting to a 402, or that the site is not really a 402 while returning a 301. For example, right now I know this site is down, but it is returning a 301; at one point it returned a 402. The code above treats the site as valid and up when it is not. – Shawn Rebelo Jan 26 '15 at 21:29
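The database-driven approach suggested in the comments above could be sketched roughly as follows. The table name `documents` and its columns `id`, `url` and `http_status` are hypothetical, as are both function names; `$check` is any callable that takes a URL and returns a status code.

```php
<?php
// Pure step: run a checker over the rows and collect the new status per id.
// $check is any callable that takes a URL and returns an HTTP status code.
function collect_status_updates(iterable $rows, callable $check): array
{
    $updates = [];
    foreach ($rows as $row) {
        $updates[(int) $row['id']] = (int) $check($row['url']);
    }
    return $updates;
}

// Database step. The `documents` table and its columns are assumptions.
function recheck_links(PDO $pdo, callable $check): int
{
    $rows = $pdo->query('SELECT id, url FROM documents')->fetchAll(PDO::FETCH_ASSOC);
    $stmt = $pdo->prepare('UPDATE documents SET http_status = :status WHERE id = :id');

    $updates = collect_status_updates($rows, $check);
    foreach ($updates as $id => $status) {
        $stmt->execute([':status' => $status, ':id' => $id]);
    }
    return count($updates); // number of links checked
}
```

Calling `recheck_links($pdo, 'check_url')` with the cURL-based check_url from this answer, for example from a scheduled job, would mean staff only maintain the table and never touch the script.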
5

You can do this in a few ways:

First way - curl

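function url_exists($url) {
    $ch = @curl_init($url);
    @curl_setopt($ch, CURLOPT_HEADER, TRUE);
    @curl_setopt($ch, CURLOPT_NOBODY, TRUE);           // HEAD request: skip the body
    @curl_setopt($ch, CURLOPT_FOLLOWLOCATION, FALSE);  // do not follow redirects
    @curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);
    $response = @curl_exec($ch);
    @curl_close($ch);
    $status = array();
    // No match (for example, when the request failed) counts as "does not exist"
    if (!preg_match('/HTTP\/.* ([0-9]+) .*/', (string)$response, $status)) {
        return false;
    }
    return ($status[1] == 200);
}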

Second way - if you don't have cURL installed - get_headers

function url_exists($url) {
    $h = @get_headers($url);
    if ($h === false) {
        return false; // request failed
    }
    $status = array();
    preg_match('/HTTP\/.* ([0-9]+) .*/', $h[0], $status);
    return (isset($status[1]) && $status[1] == 200);
}

Third way - fopen

function url_exists($url){
    $handle = @fopen($url,'r');
    if($handle !== false){
       fclose($handle); // release the connection once we know it opened
       return true;
    }else{
       return false;
    }
}

First & second solutions

Orel Biton
2

As a quick workaround, you can use the $http_response_header variable, which PHP creates in the local scope after a call to file_get_contents() over HTTP.

For example (taken from the PHP documentation):

<?php
function get_contents() {
  file_get_contents("http://example.com");
  var_dump($http_response_header);
}
get_contents();
var_dump($http_response_header);

Then check the first line for the status, for example "HTTP/1.1 200 OK", or for another HTTP status code.
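Reading the status code out of those header lines could look like the sketch below. The helper name `status_from_headers` is made up for illustration. It assumes that when redirects are followed the header list contains one status line per response, so it keeps the last one it sees.

```php
<?php
// Hypothetical helper: returns the status code of the last status line in a
// header list shaped like $http_response_header, or null if none is found.
function status_from_headers(array $headers): ?int
{
    $status = null;
    foreach ($headers as $line) {
        if (preg_match('#^HTTP/\S+\s+(\d{3})#', $line, $m)) {
            $status = (int) $m[1]; // later status lines overwrite earlier ones
        }
    }
    return $status;
}

// Sample data standing in for a real $http_response_header after a redirect
$sample = [
    'HTTP/1.1 301 Moved Permanently',
    'Location: http://example.com/new',
    'HTTP/1.1 200 OK',
    'Content-Type: text/html',
];
var_dump(status_from_headers($sample)); // int(200)
```

In real use you would call it as `status_from_headers($http_response_header ?? [])` right after `file_get_contents()`, in the same scope.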

shakaran
  • That isn't a good idea. Some PHP installations show warnings if the server isn't found or isn't responding .... – idmean Apr 02 '13 at 18:06
  • You should not use display_errors or error_reporting on production servers. You can also use the @ silence operator, or register_shutdown_function http://php.net/manual/es/function.register-shutdown-function.php to catch errors – shakaran Apr 02 '13 at 18:38
1

Try this:

$url = '[your_url]';
$curl = curl_init();
curl_setopt($curl, CURLOPT_URL, $url);
curl_setopt($curl, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);

$result = curl_exec($curl);

if ($result === false) {
    echo 'broken url';
} else {
    $newUrl = curl_getinfo($curl, CURLINFO_EFFECTIVE_URL);

    if ($newUrl !== $url) {
        echo 'redirect to: ' . $newUrl;
    }
}
curl_close($curl);
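Note that curl_exec() returns false only when the transfer itself fails, so a page that answers with a 404 would not be reported as broken by the snippet above. A sketch that also looks at the final status code after following redirects is shown here; the function names, the 10-second timeout and the limit of 5 redirects are all assumptions chosen for illustration.

```php
<?php
// Follows redirects and reports where the link ended up and with what status.
function inspect_link(string $url, int $timeout = 10): array
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_NOBODY         => true,  // HEAD request: headers only
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_MAXREDIRS      => 5,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_TIMEOUT        => $timeout,
    ]);
    $ok     = curl_exec($ch) !== false;  // false means a transport-level failure
    $status = (int) curl_getinfo($ch, CURLINFO_HTTP_CODE);
    $final  = (string) curl_getinfo($ch, CURLINFO_EFFECTIVE_URL);

    return interpret_result($url, $ok, $status, $final);
}

// Pure decision logic, kept separate so it can be exercised without a network.
function interpret_result(string $url, bool $ok, int $status, string $final): array
{
    return [
        'broken'     => !$ok || $status < 200 || $status >= 400,
        'status'     => $status,
        'redirected' => $ok && $final !== $url,
        'final_url'  => $final,
    ];
}
```

Some servers reject HEAD requests, so if a link you know is fine gets flagged, retrying it with a normal GET (dropping CURLOPT_NOBODY) is a reasonable fallback.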
mkjasinski
0

If you are looking for a solution in PHP Laravel, check this link. The HTTP client's response object offers these methods:

use Illuminate\Support\Facades\Http;

$response = Http::get('http://example.com');

// Methods available on the response, with their return types:
$response->body();                 // string
$response->json($key = null);      // array|mixed
$response->object();               // object
$response->collect($key = null);   // Illuminate\Support\Collection
$response->status();               // int
$response->ok();                   // bool
$response->successful();           // bool
$response->redirect();             // bool
$response->failed();               // bool
$response->serverError();          // bool
$response->clientError();          // bool
$response->header('Content-Type'); // string
$response->headers();              // array
Hassan Fayyaz