I have a function that I use to test whether a URL is valid before I store it in my database.

function url_exists($url)
{
    ini_set("default_socket_timeout","5");
    set_time_limit(5);
    $f = fopen($url, "r");
    $r = fread($f, 1000);
    fclose($f);

    return strlen($r) > 1;
}

if( !url_exists($test['urlRedirect']) ) { ... }

It works great; however, one of my users reported an issue today, and when I tested it, the following URL was indeed flagged as invalid:

http://www.artleaguehouston.org/charge-grant-survey

So I tried to remove the page name and use only the domain and still got the error. What is it about this domain that my script chokes on?

santa
  • Possible duplicate of [How can I check if a URL exists via PHP?](http://stackoverflow.com/questions/2280394/how-can-i-check-if-a-url-exists-via-php). I think there are better solutions there; you may want to check those out. – Rizier123 Jan 06 '15 at 18:58
  • Many servers are set up to disallow attempts to access from scripts, such as PHP scripts. That URL returns 403 Forbidden when attempting to access from PHP. You need to use something a little more sophisticated, like curl. – kainaw Jan 06 '15 at 19:00

1 Answer

You're trying to eat soup with a Swiss Army knife there!

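PHP supports some URL wrappers in file_exists, but only wrappers that implement stat(). The http:// wrapper does not, so this check returns false for HTTP URLs even when the page is up:

if (file_exists("http://www.artleaguehouston.org/charge-grant-survey")) {
    // Not reached for http:// URLs, because the wrapper does not support stat()
}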

CURL:

$ch = curl_init('http://www.artleaguehouston.org/charge-grant-survey');
curl_setopt($ch, CURLOPT_NOBODY, true);
curl_setopt($ch, CURLOPT_USERAGENT,
    'Mozilla/5.0 (Windows NT 6.2; WOW64; rv:17.0) Gecko/20100101 Firefox/17.0'
);
curl_exec($ch);
$statusCode = curl_getinfo($ch, CURLINFO_HTTP_CODE);
curl_close($ch);

if ($statusCode == 200) {
    // Site up and good status code
}

(Mostly taken from "How can one check to see if a remote file exists using PHP?", just to give correct credit.)
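For completeness, here is a minimal sketch of how the cURL check above could be folded back into the question's url_exists() function. The $timeout parameter, the up-front filter_var() validation, the redirect limit, and the choice to accept any 2xx or 3xx status are assumptions added for illustration; they are not part of the original snippet.

```php
// Sketch only: a cURL-based replacement for the question's url_exists().
// The $timeout default and the redirect limit are assumed values.
function url_exists(string $url, int $timeout = 5): bool
{
    // Reject anything that is not a syntactically valid URL
    // before touching the network.
    if (filter_var($url, FILTER_VALIDATE_URL) === false) {
        return false;
    }

    // Only http and https URLs are checked here.
    $scheme = strtolower((string) parse_url($url, PHP_URL_SCHEME));
    if ($scheme !== 'http' && $scheme !== 'https') {
        return false;
    }

    $ch = curl_init($url);
    if ($ch === false) {
        return false;
    }

    curl_setopt_array($ch, [
        CURLOPT_NOBODY         => true,     // send a HEAD request, skip the body
        CURLOPT_RETURNTRANSFER => true,     // do not echo anything to output
        CURLOPT_FOLLOWLOCATION => true,     // follow 301/302 redirects
        CURLOPT_MAXREDIRS      => 5,
        CURLOPT_CONNECTTIMEOUT => $timeout,
        CURLOPT_TIMEOUT        => $timeout,
        // Same browser-like user agent as in the snippet above.
        CURLOPT_USERAGENT      => 'Mozilla/5.0 (Windows NT 6.2; WOW64; rv:17.0) Gecko/20100101 Firefox/17.0',
    ]);

    // curl_exec() returns false on transport errors (DNS, timeout, ...).
    $ok = curl_exec($ch) !== false;
    $statusCode = (int) curl_getinfo($ch, CURLINFO_HTTP_CODE);

    return $ok && $statusCode >= 200 && $statusCode < 400;
}
```

The call site from the question, `if( !url_exists($test['urlRedirect']) ) { ... }`, would stay the same. Some servers reject HEAD requests, so dropping CURLOPT_NOBODY may be worth trying if a known-good URL still fails.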

Daniel W.
  • Lemme give it a whirl. Just one thing, where did $curl come from? Shouldn't it be $ch? – santa Jan 06 '15 at 19:08
  • @santa you should have a look at the different status codes as well. Some sites redirect, so the code is `302` or `301`, but that doesn't mean the URL is unreachable. Just google `http status codes`. Basically, the site is up whenever any status code is returned, because a host that is down produces an error instead. Also look at cURL's options here: http://php.net/manual/en/function.curl-setopt.php – Daniel W. Jan 06 '15 at 19:25
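
The point about status codes in the last comment can be sketched as a small helper. The function name describe_status() and its labels are hypothetical, made up for this illustration; the only cURL behavior relied on is that CURLINFO_HTTP_CODE is 0 when no HTTP response was received.

```php
// Hypothetical helper: map the code from curl_getinfo($ch, CURLINFO_HTTP_CODE)
// to a coarse label, so callers can decide what counts as "reachable".
function describe_status(int $statusCode): string
{
    if ($statusCode === 0) {
        // No HTTP response at all: DNS failure, refused connection, timeout.
        return 'no response';
    }
    if ($statusCode >= 200 && $statusCode < 300) {
        return 'ok';
    }
    if ($statusCode >= 300 && $statusCode < 400) {
        // 301/302 and friends: the host is up and points somewhere else.
        return 'redirect';
    }
    if ($statusCode >= 400 && $statusCode < 500) {
        // Includes the 403 Forbidden mentioned in the question's comments.
        return 'client error';
    }
    if ($statusCode >= 500 && $statusCode < 600) {
        return 'server error';
    }
    return 'other';
}
```

Any label other than 'no response' means the host answered, which matches the "site is up" reading above. Whether 'client error' or 'server error' should still count as a storable URL is a policy choice for the calling code.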