-1

I am looking for a regex that matches simple URLs such as:

  • http://www.google.com
  • http://www.yahoo.in
  • http://www.example.eu
  • http://www.example.net
  • etc.

No paths or subdirectories are allowed. For example, it must not validate http://www.google.com/ or http://www.yahoo.in/mail.

Does anyone know any regex to do this?

Felix Kling
Rohit Malish
  • Is http:// required, and should www be assumed if no third-level domain is present? – Wug Jul 30 '12 at 15:36
  • Which language are you using? If it has tools to parse URLs, use it and verify that the path, query and fragment identifier are empty. – Felix Kling Jul 30 '12 at 15:38
  • Well, I have searched all over the internet and tried countless regexes, but none of them worked. http:// is not required; it can be as simple as www.google.com – Rohit Malish Jul 30 '12 at 15:42
  • Are IP addresses allowed? `128.134.25.5` What about IPv6? – Michael Petrotta Jul 30 '12 at 15:47
  • What about INTs? They're pretty "simple". http://1249763890 is Google. – ghoti Jul 30 '12 at 15:50
  • @RohitMalish - then [update your question](http://stackoverflow.com/posts/11724663/edit) so that it better reflects your requirements, include code you've tried to use but failed to get working, and give us the larger context so that if you're barking up the wrong tree, someone can point you in the right direction. – ghoti Jul 30 '12 at 16:02
  • Hopefully all this discussion is highlighting the futility of what you're trying to do, Rohit. You can make an attempt to exclude "bad" URLs, but you won't get them all. In your [earlier question](http://stackoverflow.com/questions/11723184/checking-if-string-is-web-adress-or-ip-on-android), you say that you *"need to be able to tell if it is adress or not, because if I pass that string to my other method my program will crash"*. You should learn more about what that other method thinks a valid URL is. Then lean on system-level APIs to parse those URLs for you. – Michael Petrotta Jul 30 '12 at 16:06
  • By the way, `www.google.com` isn't a URL - you've got to have a "scheme" qualifier, like `http://` or `ftp://`. And don't forget about ports: `http://foo.com:8080`. – Michael Petrotta Jul 30 '12 at 16:15
  • Just came across this regex tutorial and thought I'd share the link - https://www.youtube.com/watch?v=TiqXWDyywog – Krishnraj Rana Jul 06 '20 at 12:43
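
The suggestion in the comments to parse the URL and verify that the path, query and fragment are empty could look roughly like the sketch below. It uses Python's standard `urllib.parse.urlsplit`; the function name `is_bare_host_url` and the restriction to `http`/`https` are assumptions made for illustration, not something the question specifies.

```python
from urllib.parse import urlsplit


def is_bare_host_url(url: str) -> bool:
    """Return True if url has an http(s) scheme and a host, but no path, query or fragment.

    Hypothetical helper: the allowed schemes are an assumption.
    """
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https"):
        return False  # e.g. "www.google.com" has no scheme, so it is not a URL
    if not parts.hostname:
        return False
    # A trailing "/" already counts as a path, which the question wants rejected.
    return parts.path == "" and parts.query == "" and parts.fragment == ""
```

Because the parser separates the port from the host, an input such as `http://foo.com:8080` is accepted without any extra pattern work, while `http://www.google.com/` is rejected for its non-empty path.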

3 Answers

1

I'm still a noob, but try this:

^http:\/\/[a-zA-Z0-9_\-]+\.[a-zA-Z0-9_\-]+\.[a-zA-Z0-9_\-]+$
Oday Mansour
  • That works, thanks. Could you also give me an example for a URL without http://? For example, if it were only www.google.com. Thanks in advance – Rohit Malish Jul 30 '12 at 15:46
  • There are two characters missing: underscore (`_`) and dash (`-`). – Sufian Latif Jul 30 '12 at 15:49
  • just make the "http://" part optional, you'll get this: `^(http:\/\/)?[a-zA-Z0-9_\-]+\.[a-zA-Z0-9_\-]+\.[a-zA-Z0-9_\-]+$`. Also, don't forget to mark answers that answer your question :) – Oday Mansour Jul 30 '12 at 15:49
  • Thanks :) Is it possible to make one regex that works like the one you gave but also validates IP addresses? – Rohit Malish Jul 30 '12 at 15:58
  • This misses the possibility of user-information within the authority. See the [Wikipedia article on URI scheme](http://en.wikipedia.org/wiki/URI_scheme#Generic_syntax) for details. – ghoti Jul 30 '12 at 16:01
  • That is a bit trickier because you'd have to verify that each part of the IP address does not exceed 255. You'd also need a large alternation (an OR condition) to separate the case with 3 parts in the URL from the case with 4 integer parts. Bottom line, I'm a noob so I can't help any more :/ – Oday Mansour Jul 30 '12 at 16:02
  • Yes it does, and these kinds of URL exist. Look, I'll just repeat a comment Felix Kling wrote above: "Which language are you using? If it has tools to parse URLs, use it and verify that the path, query and fragment identifier are empty". For thorough cases, let's not reinvent the wheel. – Oday Mansour Jul 30 '12 at 16:08
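
One way to avoid the large alternation for IP addresses is to keep the regex for host names and hand the "each part must not exceed 255" check to a library. The sketch below assumes Python and its standard `ipaddress` module; the helper names `is_ipv4` and `is_simple_host_or_ip` are hypothetical, and the pattern is the optional-`http://` regex from the comment above, written without escaping the slashes.

```python
import ipaddress
import re

# Regex from the comment above: optional "http://", then exactly three labels.
HOST_RE = re.compile(r"^(http://)?[a-zA-Z0-9_\-]+\.[a-zA-Z0-9_\-]+\.[a-zA-Z0-9_\-]+$")
SCHEME_RE = re.compile(r"^http://")


def is_ipv4(text: str) -> bool:
    """Return True if text is a dotted-quad IPv4 address with every octet in 0..255."""
    try:
        ipaddress.IPv4Address(text)
        return True
    except ValueError:  # AddressValueError is a subclass of ValueError
        return False


def is_simple_host_or_ip(text: str) -> bool:
    """Accept either an IPv4 address or a three-label host, each with optional http://."""
    host = SCHEME_RE.sub("", text, count=1)  # drop the scheme before the IP check
    return is_ipv4(host) or HOST_RE.match(text) is not None
```

Note that the three-label regex on its own still accepts purely numeric labels, so this only adds IP support; it does not make the host-name check any stricter.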
1

This one should do:

^(https?:\/\/)?[0-9a-zA-Z]+\.[-_0-9a-zA-Z]+\.[0-9a-zA-Z]+$

This should work for URLs starting with http:// or https:// or without the protocol name.

The regex should also be applied case-insensitively. In that case, it can be shortened a bit:

^(https?:\/\/)?[0-9a-z]+\.[-_0-9a-z]+\.[0-9a-z]+$
Sufian Latif
  • This misses the possibility of user-information within the authority. See the [Wikipedia article on URI scheme](http://en.wikipedia.org/wiki/URI_scheme#Generic_syntax) for details. – ghoti Jul 30 '12 at 16:00
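
As a quick check of the shortened pattern, here is a minimal Python sketch that compiles it with the case-insensitive flag and tries it on the URLs from the question. The name `matches_simple_url` is made up for this example, and the slashes are left unescaped because Python's `re` does not need `\/`.

```python
import re

# The shortened pattern from the answer, compiled case-insensitively.
SIMPLE_URL = re.compile(
    r"^(https?://)?[0-9a-z]+\.[-_0-9a-z]+\.[0-9a-z]+$",
    re.IGNORECASE,
)


def matches_simple_url(text: str) -> bool:
    """Return True if text is an optional http(s):// prefix plus exactly three labels."""
    return SIMPLE_URL.match(text) is not None


if __name__ == "__main__":
    for sample in (
        "http://www.google.com",
        "HTTPS://WWW.EXAMPLE.NET",
        "www.yahoo.in",
        "http://www.google.com/",
        "http://www.yahoo.in/mail",
    ):
        print(sample, matches_simple_url(sample))
```

The pattern requires exactly three dot-separated labels, so a two-label host such as `google.com` is not matched.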
-1

If you don't care whether it is a valid URL, you can use:

\S*www\.\S+

All the examples contain www. followed by a non-space character, a sequence that is unlikely to occur in a normal word.

Roko Mijic
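
To see what the loose `\S*www\.\S+` pattern does in practice, the following Python sketch pulls matching tokens out of free text. The helper name `find_www_tokens` and the sample sentence are assumptions for illustration only.

```python
import re

# Loose pattern: any run of non-space characters that contains "www." plus at least one more character.
LOOSE_WWW = re.compile(r"\S*www\.\S+")


def find_www_tokens(text: str) -> list:
    """Return every whitespace-delimited token in text that the loose pattern matches."""
    return LOOSE_WWW.findall(text)


if __name__ == "__main__":
    print(find_www_tokens("Visit http://www.google.com or www.yahoo.in/mail today"))
```

Since `\S+` consumes anything up to the next space, URLs with a path such as `www.yahoo.in/mail` are matched as well, so this pattern detects URL-like tokens but does not enforce the "no subdirectories" requirement.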