2
  1. I want to read a string from the user and check whether it is a URL or plain text. What condition should I use on the string?
  2. If the string is not a URL, it should be googled instead.

I've written the part that opens the URL in the browser, but I am unsure about the condition. I could use

if "http://" in input

But what if the user doesn't want to specify the protocol?

4 Answers

1

I would use urlparse.urlparse() and test for the presence of a scheme, like so:

import urlparse  # Python 2 module; Python 3 moved it to urllib.parse
if urlparse.urlparse(user_data).scheme:
    print "That's a URL"
Robᵩ
  • urlparse.urlparse("www.facebook.com") would give an empty scheme ... but afaik that meets OP's definition of "url" – Joran Beasley Oct 21 '16 at 16:21
  • I note that OP actually didn't define "url", and he isn't taking the hint (in the comments) that his question isn't well formed. – Robᵩ Oct 21 '16 at 16:23
  • I concur ... but this is one corner of his problem that is well formed (on the last line of the question) – Joran Beasley Oct 21 '16 at 16:23
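The point raised in these comments, that an input such as "www.facebook.com" parses with an empty scheme, can be illustrated with a small heuristic. This is only a sketch, assuming Python 3 (where the module is `urllib.parse`); the function name `looks_like_url` and the rule "a bare host must contain a dot" are assumptions made for illustration, not something the thread settles on.

```python
from urllib.parse import urlparse


def looks_like_url(text):
    """Heuristic: accept scheme://host inputs and bare dotted hostnames."""
    parts = text.split()
    if len(parts) != 1:
        # Empty input or anything containing whitespace is treated as search text.
        return False
    text = parts[0]
    try:
        parsed = urlparse(text)
        if parsed.scheme and parsed.netloc:
            return True
        # No scheme given: prefix "//" so the first segment is parsed as the host.
        host = urlparse("//" + text).hostname or ""
    except ValueError:
        # urlparse rejects some malformed inputs (for example a stray "[").
        return False
    return "." in host and not host.startswith(".") and not host.endswith(".")
```

A heuristic like this will still misclassify some inputs (a search for "python3.5" would count as a host), so it only narrows the question of what "URL" should mean here.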
0

You can use the regex mentioned at: Python - How to validate a url in python ? (Malformed or not):

import re

regex = re.compile(
        r'^(?:http|ftp)s?://' # http:// or https://
        r'(?:(?:[A-Z0-9](?:[A-Z0-9-]{0,61}[A-Z0-9])?\.)+(?:[A-Z]{2,6}\.?|[A-Z0-9-]{2,}\.?)|' #domain...
        r'localhost|' #localhost...
        r'\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})' # ...or ip
        r'(?::\d+)?' # optional port
        r'(?:/?|[/?]\S+)$', re.IGNORECASE)
Moinuddin Quadri
  • this requires the user to define the scheme, which OP specifies he does not want to do... – Joran Beasley Oct 21 '16 at 16:21
  • I think that with *But what if the user doesn't want to specify the protocol* he means it should handle the "ftp", "http" and "https" protocols (and that it is not about leaving out the scheme) – Moinuddin Quadri Oct 21 '16 at 16:38
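Because the regex in this answer only accepts strings that start with a scheme, everything it rejects could be handed to a search engine, which is what the second point of the question asks for. Below is a minimal sketch, assuming Python 3. The name `target_for` is hypothetical, the scheme check reuses only the first line of the pattern, and the Google search URL format is an assumption rather than something stated in the thread.

```python
import re
from urllib.parse import quote_plus

# Only the scheme prefix of the longer pattern: http, https, ftp or ftps.
SCHEME_RE = re.compile(r"^(?:http|ftp)s?://", re.IGNORECASE)


def target_for(text):
    """Return the text itself if it starts with a scheme, else a search URL."""
    text = text.strip()
    if SCHEME_RE.match(text):
        return text
    # Assumed search endpoint; quote_plus encodes spaces as "+".
    return "https://www.google.com/search?q=" + quote_plus(text)
```

The result could then be passed to the standard library's `webbrowser.open(target_for(user_input))`, so both cases go through the same browser call.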
0
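# URLValidator and ValidationError come from Django
from django.core.exceptions import ValidationError
from django.core.validators import URLValidator

validate = URLValidator()

try:
    validate("http://www.avalidurl.com/")
    print("String is a valid URL")
except ValidationError as exception:
    print("String is not a valid URL")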

Source

Darwin
-1

Just test it:

import requests
possible_url = user_input
if not possible_url.lower().startswith("http"): possible_url = "HTTP://%s" % possible_url
if 199 < requests.get(possible_url).status_code < 400: print "OK, that's a URL!"
Joran Beasley
  • A URL of a non-existent document is still a URL in my book. Still the same issue: OP needs to define more clearly what he considers a URL or not. – spectras Oct 21 '16 at 16:19
  • that's a reasonable statement, I think ... (and by the definition of what a URL is) – Joran Beasley Oct 21 '16 at 16:22
  • @spectras user has now clarified that he wants to open the page in a browser or google it if it is not a URL ... I think that means it must be an existing doc – Joran Beasley Oct 21 '16 at 16:28
  • How about when the document exists, but the server has an invalid SSL certificate? (In that case, `requests` will raise an exception.) Or the server does not answer (in that case, the method will hang until timeout, then raise). Or the server answers with 403 (which means the doc does exist, only access is forbidden), or 401 (do you go back and ask the user for a password?)… I'll stop there, I voted to close too. – spectras Oct 21 '16 at 16:35
  • I took it to mean exists and is reachable (as in, if you enter the right search terms, Google will show you the link). Afaik any endpoint reachable via https should also be reachable via http (not sure about this though) ... ftp I'm not sure what to do about (let alone sftp or ssh) – Joran Beasley Oct 21 '16 at 16:57
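The failure modes listed in these comments (certificate errors, servers that never answer, 4xx responses) suggest that a "just test it" check needs an explicit timeout and exception handling. The following is a hedged sketch, assuming Python 3 and using the standard library's `urllib.request` in place of `requests`. The name `is_reachable`, the 5-second default and the injectable `opener` parameter are assumptions added for illustration.

```python
import http.client
import urllib.request


def is_reachable(text, timeout=5.0, opener=urllib.request.urlopen):
    """Return True if an HTTP(S) request for text ends with a 2xx/3xx status."""
    url = text.strip()
    if not url.lower().startswith(("http://", "https://")):
        url = "http://" + url  # assume plain HTTP when no scheme is given
    try:
        with opener(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (OSError, ValueError, http.client.HTTPException):
        # Covers URLError/HTTPError (including 4xx and 5xx), timeouts,
        # TLS failures and malformed URLs.
        return False
```

Note that this treats a 401 or 403 as "not reachable", even though, as pointed out above, the document may well exist; whether that is acceptable depends on how "URL" is defined for this use case.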