  • I know there are multiple questions about URL checks. I am very new to Python, so I am trying to understand from multiple posts and also searching for a new library to help. I am trying to check the points below for internal as well as external websites:

       Status Code
       Status Description
       Response Length
       Time Taken 
       Websites are like www.xyz.com, www.abc.log, www.abc.com/xxx/login.html and more combinations. Below is the initial code:

    import socket
    from urllib2 import urlopen, URLError, HTTPError

    socket.setdefaulttimeout(23)  # timeout in seconds

    url = 'https://www.google.com'

    try:
        response = urlopen(url)
    except HTTPError, e:
        # The status code is on the exception object itself
        print 'The server couldn\'t fulfill the request. Reason:', str(e.code)
    except URLError, e:
        # URLError has a .reason attribute (not .reasonse)
        print 'We failed to reach a server. Reason:', str(e.reason)
    else:
        # The response from urlopen() already carries the status code;
        # no need for a second request via urllib.urlopen()
        code = response.getcode()
        print url, "-------->", code
    
  • I want to check if the website exists first, then go for the rest of the checks mentioned above. How do I organise this to work for all of the above points for 500+ URLs? Do I need to import them from a txt file? Also, one more point: I have seen that if www.xyx.com is working and www.xyz.com/lmn.html does not exist, it still shows 200.
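For 500+ URLs, reading one URL per line from a text file is the usual approach. A minimal sketch of all four checks (status code, description, response length, time taken) in Python 3 — the code in the question is Python 2, where urllib2 was later split into urllib.request and urllib.error — with hypothetical function and file names:

```python
import time
import urllib.request
import urllib.error

def check_url(url, timeout=23):
    """Return (status code, description, response length, seconds taken).

    Status code is None when the server could not be reached at all.
    """
    start = time.time()
    try:
        resp = urllib.request.urlopen(url, timeout=timeout)
        body = resp.read()
        return resp.getcode(), resp.reason, len(body), time.time() - start
    except urllib.error.HTTPError as e:
        # Server answered but the page is bad (e.g. 404 for a missing
        # /lmn.html) -- code and reason are on the exception itself
        return e.code, e.reason, 0, time.time() - start
    except urllib.error.URLError as e:
        # Could not reach a server at all (DNS failure, refused, timeout)
        return None, str(e.reason), 0, time.time() - start

def check_all(path):
    # One URL per line in a plain text file; blank lines are skipped
    with open(path) as f:
        for line in f:
            url = line.strip()
            if url:
                print(url, '-------->', check_url(url))
```

Because HTTPError is a subclass of URLError, it must be caught first; a missing page like www.xyz.com/lmn.html then reports its real code (404) instead of the 200 of the site's front page.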

RIshu

1 Answer


I think you can check the page presence with this code:

import httplib
from urlparse import urlparse

def chkUrl(url):
    p = urlparse(url)
    conn = httplib.HTTPConnection(p.netloc)
    conn.request('HEAD', p.path)
    resp = conn.getresponse()
    return resp.status < 400

if __name__ == '__main__':
    print chkUrl('http://www.stackoverflow.com') # True
    print chkUrl('http://stackoverflow.com/notarealpage.html') # False
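On Python 3 the same HEAD-request check would use http.client and urllib.parse, since httplib and urlparse were renamed; a sketch with an added timeout and a fallback to '/' when the URL has no path:

```python
from http.client import HTTPConnection
from urllib.parse import urlparse

def chk_url(url, timeout=10):
    """HEAD the URL and treat any status below 400 as 'page exists'."""
    p = urlparse(url)
    conn = HTTPConnection(p.netloc, timeout=timeout)
    try:
        # A bare hostname parses to an empty path, so fall back to '/'
        conn.request('HEAD', p.path or '/')
        resp = conn.getresponse()
        return resp.status < 400
    finally:
        conn.close()
```

A HEAD request only fetches the headers, so this stays cheap even for large pages; note it only covers plain HTTP (HTTPSConnection would be needed for https URLs).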
Venkata Vamsy
  • Ok. But how do I combine it with my code and also the points I mentioned? Your code is good for checking whether a website exists or not, but I am really looking for more points :). If the web is down I want to get the code and status, and likewise for all of the below: Status Code, Status Description, Response Length, Time Taken – RIshu Mar 22 '16 at 05:25
  • Do you want to know if the server is serving? I think for this you need cURL, and if you get a response then it is. Here is the URL of pycurl: http://pycurl.io/ – Venkata Vamsy Mar 22 '16 at 05:40
  • I want to check if the URL is up or not, with its status code and description, its response length, and the time taken. – RIshu Mar 22 '16 at 05:50
  • I want to check and print those things. – RIshu Mar 22 '16 at 05:52
  • Check out http://stackoverflow.com/questions/15968031/python-http-status-code – this link might help you out. – Venkata Vamsy Mar 22 '16 at 05:52
  • OK, I'll go through it. Also, for all the websites, I will be trying to import them from some text file. Need to check for response time as well. Let's see if someone helps us with a better solution; till then I will read as much as I can. – RIshu Mar 22 '16 at 06:02
  • Ok. If you need the response time you can use this code: import requests; r = requests.get('http://xxxxxxxx.org/'); r.headers; you can see the time taken with the attribute "x-runtime" – Venkata Vamsy Mar 22 '16 at 06:08
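The x-runtime header mentioned in the last comment is set by some server stacks (Rails apps, for example) and is not guaranteed to be present, so it is safer to also time the request on the client side. A Python 3 sketch using only the standard library (the requests library gives the same client-side figure via r.elapsed):

```python
import time
import urllib.request

def timed_get(url, timeout=23):
    """Fetch a URL and return (status, body length, client-side seconds,
    server-reported X-Runtime header or None if the server does not send it)."""
    start = time.perf_counter()
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        body = resp.read()
        elapsed = time.perf_counter() - start
        return resp.getcode(), len(body), elapsed, resp.headers.get('X-Runtime')
```

The client-side figure includes network latency, while X-Runtime (when present) only measures time spent inside the server, so the two numbers will normally differ.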