web2py url validator

Question

In a URL shortener built with web2py, I want to validate URLs first; if a URL is not valid, the user should go back to the first page with an error message. This is my code in the controller (MVC architecture), but I don't get what's wrong.

import urllib

def index():
    return dict()

def random_maker():
    url = request.vars.url
    try:
        urllib.urlopen(url)
        return dict(rand_url = ''.join(random.choice(string.ascii_uppercase +
                    string.digits + string.ascii_lowercase) for x in range(6)),
                    input_url=url)
    except IOError:
        return index()

Is this your complete code? I don't see any way to enter a URL (i.e., a form). Also, you need to import random and string. It would help if you could describe what you expect, what you're getting, and show any other relevant code (e.g., a view). – Anthony, Aug 15 '12 at 14:55
Wouldn't it be better to just check the URL with a regex? Take a look at this: http://stackoverflow.com/a/11967815/1248554 – BrtH, Aug 15 '12 at 15:31
No need for the regex, as web2py already includes an IS_URL validator. I assume, though, the OP wants to confirm that the URL points to a live site, not simply that it is a well-formed URL. – Anthony, Aug 15 '12 at 17:23
Thanks, all. When I use urllib.urlopen it opens a connection to the website, just like httplib.HTTPConnection, but the problem is what happens when a user enters something like http://www.jfvbhsjdfvbs.com – Hojat Taheri, Aug 15 '12 at 17:31
The question is what I should do with non-responsive web pages: httplib, urllib, or something else? – Hojat Taheri, Aug 15 '12 at 17:33

score 1 · Accepted Answer · edited Jun 20 '20 at 09:12


Couldn't you check the HTTP response code using httplib? If it is 200, the page is valid; if it is anything else (like 404) or an error is raised, it is invalid.

See this question: What’s the best way to get an HTTP response code from a URL?
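As a rough illustration of the status-code check suggested above, here is a minimal sketch in Python 3, where httplib is named http.client. The helper names head_status and is_valid_status and the 5-second timeout are assumptions for illustration; they are not part of web2py or of the original code. Note that HTTPConnection takes a host name rather than a full URL, so the URL is split first.

```python
import http.client
from urllib.parse import urlsplit


def head_status(url, timeout=5.0):
    """Return the status code of a HEAD request, or None on any failure."""
    try:
        parts = urlsplit(url)
        port = parts.port  # raises ValueError for a non-numeric port
    except ValueError:
        return None
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return None  # not an absolute http(s) URL, so nothing to connect to

    # HTTPConnection expects a host (and optional port), not a whole URL.
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.hostname, port, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.hostname, port, timeout=timeout)

    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    try:
        conn.request("HEAD", target)
        return conn.getresponse().status
    except (OSError, http.client.HTTPException):
        # DNS failures, refused connections and timeouts are all OSError.
        return None
    finally:
        conn.close()


def is_valid_status(status):
    """Apply the rule from the answer: only 200 counts as valid."""
    return status == 200
```

In a controller, something like is_valid_status(head_status(request.vars.url)) could decide whether to return the shortened URL or fall back to index(). http.client does not follow redirects, so a 301 or 302 is reported as-is and this sketch treats it as invalid, following the 200-only rule described above.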

Update:

Based on your comment, it looks like the issue is how you are handling the error: you are only catching IOError. You can either catch every exception at once by switching to:

except:
    return index()

You could also build your own error handler by overriding http_error_default. See How to catch 404 error in urllib.urlretrieve for more information.

Or you can switch to urllib2, which raises more specific exceptions that you can handle like this:

from urllib2 import Request, urlopen, URLError
req = Request('http://jfvbhsjdfvbs.com')
try:
    response = urlopen(req)
except URLError, e:
    if hasattr(e, 'reason'):
        print 'We failed to reach a server.'
        print 'Reason: ', e.reason
    elif hasattr(e, 'code'):
        print 'The server couldn\'t fulfill the request.'
        print 'Error code: ', e.code
else:
    print 'URL is good!'

Running the code above with that URL prints:

We failed to reach a server.
Reason:  [Errno 61] Connection refused

The specifics of each exception class are described in the urllib.error API documentation.
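For readers on Python 3, where urllib2 was split into urllib.request and urllib.error, a roughly equivalent sketch might look like the following. The function names check_url and describe_failure and the timeout value are hypothetical. One difference worth noting: in Python 3 an HTTPError also carries a reason attribute, so the hasattr ordering used above would always take the first branch; checking for HTTPError first avoids that.

```python
from urllib.error import HTTPError, URLError
from urllib.request import Request, urlopen


def describe_failure(exc):
    """Turn an exception raised while opening a URL into a short message."""
    # HTTPError is a subclass of URLError, so it has to be tested first.
    if isinstance(exc, HTTPError):
        return "The server couldn't fulfill the request (HTTP %d)." % exc.code
    if isinstance(exc, URLError):
        return "We failed to reach a server: %s" % exc.reason
    if isinstance(exc, OSError):
        # For example a timeout that was not wrapped in URLError.
        return "We failed to reach a server: %s" % exc
    return "Not a usable URL: %s" % exc


def check_url(url, timeout=5.0):
    """Return (ok, message) describing whether the URL could be opened."""
    try:
        # Request() raises ValueError when the URL has no scheme.
        response = urlopen(Request(url), timeout=timeout)
    except (OSError, ValueError) as exc:
        return False, describe_failure(exc)
    response.close()
    return True, "URL is good!"
```

A caller could use the boolean to choose between returning the shortened URL and going back to the first page, and show the message as the error text. The list of caught exceptions is a reasonable starting point rather than an exhaustive one.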

I am not exactly sure how to slot this into your code, because I am not sure exactly what you are trying to do, but catching only IOError will not cover every exception your code can raise.

answered Aug 15 '12 at 18:27

BigHandsome


def md5():
    url = request.vars.url
    conn = httplib.HTTPConnection(url)
    try:
        conn.request("HEAD", "/")
        return dict(rand_url = ''.join(random.choice(string.ascii_uppercase + string.digits + string.ascii_lowercase) for x in range(6)), input_url=url)
    except StandardError:
        print "invalid URL"

. . . . gaierror: (11004, 'getaddrinfo failed') – Hojat Taheri Aug 15 '12 at 19:05
Thanks, but the problem is that web2py is fully compatible with Python 2.5, and I heard that Python 2.5 only has urllib. Is it possible to use urllib2 in Python 2.5? – Hojat Taheri Aug 15 '12 at 19:58
Yes, urllib2 should be in Python 2.5. Here is the [API reference for Python 2.5](http://docs.python.org/release/2.5/lib/module-urllib2.html) from the Python doc site. – BigHandsome Aug 15 '12 at 20:02
def random():
    url = request.vars.url
    req = Request(url)
    try:
        response = urlopen(req)
    except URLError, e:
        if hasattr(e, 'reason'):
            print 'We failed to reach a server.'
            print 'Reason: ', e.reason
        elif hasattr(e, 'code'):
            print 'The server couldn\'t fulfill the request.'
            print 'Error code: ', e.code
    else:
        return dict(rand_url = ''.join(random.choice(string.ascii_uppercase + string.digits + string.ascii_lowercase) for x in range(6)), input_url=url)

– Hojat Taheri Aug 15 '12 at 20:17
I'm really sorry to take your time; I'm disappointed. I just wanted to check whether the page whose URL I want to shorten is available and valid. Sometimes writing code slows us down and dulls the creative mind. – Hojat Taheri Aug 15 '12 at 20:20
The code that you created based on what I gave you works fine: – BigHandsome Aug 15 '12 at 21:15
The code that you created based on what I gave you works fine. This line is where your issue is: dict(rand_url = ''.join(random.choice(string.ascii_uppercase + string.digits + string.ascii_lowercase) for x in range(6)), input_url=url) I am not sure what you are trying to do here. That code does not work at all. It is also not how you would create a dict. A dict literal would be {'jake': 4098}. You should not be using the assignment operator inside the creation of your dict. At this point I am tapping out. I helped with the URL issue, but this is beyond the scope of the question. – BigHandsome Aug 15 '12 at 21:21
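For reference, calling dict() with keyword arguments is valid Python and builds the same mapping as a literal, which is also how a web2py action typically hands values to its view. A minimal sketch, assuming Python 3; make_code and build_result are hypothetical names. One thing that may be worth checking in the snippet posted earlier in this thread: an action named random would rebind the name random and hide the module of the same name, if that module was imported at the top of the controller.

```python
import random
import string

# Same character set as the one-liner in the question.
ALPHABET = string.ascii_uppercase + string.digits + string.ascii_lowercase


def make_code(length=6):
    """Return a random alphanumeric code of the given length."""
    # Deliberately not named `random`, which would shadow the module.
    return "".join(random.choice(ALPHABET) for _ in range(length))


def build_result(url):
    """Build the values an action would return to its view."""
    # Keyword form: equivalent to {"rand_url": ..., "input_url": url}.
    return dict(rand_url=make_code(), input_url=url)
```

Pulling the code generation into its own function also keeps the return statement short enough to read at a glance.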
