
I am checking whether a URL exists using HttpWebRequest. The problem is that it's not working for some URLs.
Example:
http://www.gkrs.no/
https://www.politi.no/kripos/statistikk/narkotika/
These URLs exist, but the code reports them as not existing. It works for most of the URLs I check; it's just these two that are causing issues. Does anyone have examples of what I still need to check for? Maybe there is something different in their headers.

I have tried both GET and HEAD request methods.
I am still new to programming and might need a simpler explanation. Sorry for my bad English; it's not my first language. Any help would be appreciated.

internal static bool IsValidLenke(string url)
{
    if (String.IsNullOrEmpty(url))
        return false;
    try
    {
        HttpWebRequest request = WebRequest.Create(url) as HttpWebRequest;
        request.Method = "HEAD";
        using (HttpWebResponse response = request.GetResponse() as HttpWebResponse)
        {
            return response.StatusCode == HttpStatusCode.OK;
        }
    }
    catch
    {
        // Any exception returns false.
        return false;
    }
}
Daniel A. White
Kong
  • What do you mean, "not working"? And why are you swallowing any and all exceptions without examining them? – Kirk Woll Nov 19 '15 at 15:57
  • 1
    The first url does not return 404, the second url has an SSL issue that you will never recognise as you ignore exceptions. – Alex K. Nov 19 '15 at 16:01
  • log your exceptions: http://stackoverflow.com/questions/3491213/logging-exception-in-c-sharp – user1666620 Nov 19 '15 at 16:03
  • this isn't a fool proof solution. – Daniel A. White Nov 19 '15 at 16:25
  • Thanks guys, I added exception logging. It seems these two return 403 Forbidden. Is it possible to check whether a site that returns 403 Forbidden exists? Or do all sites returning 403 exist? Could a 403 site not exist? – Kong Nov 19 '15 at 16:43
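Following the comments' advice to log rather than swallow exceptions, here is a minimal sketch of how you might inspect the failure. The helper name `TryGetStatus` is illustrative, not from the original post. The key point for the 403 question: a 403 response still means the server exists and replied; `WebException.Response` is null only for connection-level failures such as DNS errors.

```csharp
using System;
using System.Net;

internal static class LinkChecker
{
    // Hypothetical helper: returns the HTTP status code if the server
    // responded at all (even with 403/404), or null if the request
    // never reached a server (DNS failure, timeout, SSL error, ...).
    internal static HttpStatusCode? TryGetStatus(string url)
    {
        try
        {
            HttpWebRequest request = WebRequest.Create(url) as HttpWebRequest;
            request.Method = "HEAD";
            using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
            {
                return response.StatusCode;
            }
        }
        catch (WebException ex)
        {
            // Log the failure category so you can tell 403 apart
            // from "site does not exist".
            Console.WriteLine("Request failed: " + ex.Status);
            var response = ex.Response as HttpWebResponse;
            return response?.StatusCode;
        }
    }
}
```

So instead of mapping every exception to `false`, you can treat any non-null status code (including 403) as "the site exists, but refused this particular request".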

1 Answer


http://www.gkrs.no/ blocks you because you don't supply a valid user agent. https://www.politi.no/kripos/statistikk/narkotika/ does not accept "HEAD", and then sends you on a wild ride of redirects, so you need a cookie container to avoid being caught in an infinite loop.

Do something like this:

HttpWebRequest request = WebRequest.Create(url) as HttpWebRequest;
request.MaximumAutomaticRedirections = 100;
request.AllowAutoRedirect = true;
request.CookieContainer = new CookieContainer();
request.Method = "GET";
request.UserAgent = "Mozilla/5.0 (Windows NT 10.0; Win64; x64)";
using (HttpWebResponse response = request.GetResponse() as HttpWebResponse)
{
    // Inspect response.StatusCode here if needed.
}
Mikael Nitell