
I am currently using the following regex to validate a website:

^((https?):\/\/)?([w|W]{3}\.)[a-zA-Z0-9\-\.]{3,}\.[a-zA-Z]{2,}(\.[a-zA-Z]{2,})?$

This is currently working for:

http://www.google.com
www.google.com

but not for

google.com
http://google.com

Please help.

Wiktor Stribiżew
Simone
  • try ([w|W]{3}\.|) - note the | is all I added – Jaromanda X Jul 03 '15 at 09:45
  • Throw it out. That regex is awful. It bans a lot of perfectly valid hostnames. Learn how URLs work and start again. – Quentin Jul 03 '15 at 09:46
  • The pipe `|` is treated as a literal character inside a character class and doesn't mean "OR". A character class is only a set of characters, character ranges, and shorthand character classes like `\s` or `\d`. If you want a case-insensitive pattern, use the `i` modifier. – Casimir et Hippolyte Jul 03 '15 at 09:50
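
To illustrate the point made in the comment above, here is a small sketch in JavaScript. The variable names `withPipe` and `caseInsensitive` are made up for this example and do not come from the thread; the second pattern is just one way the `i` modifier could be applied.

```javascript
// Inside a character class, "|" is a literal character, not alternation.
// [w|W] therefore matches "w", "W", or "|".
const withPipe = /^[w|W]{3}\./;

console.log(withPipe.test("www.google.com")); // true
console.log(withPipe.test("|||.google.com")); // true, probably not intended

// Hypothetical alternative: drop the class and use the "i" flag instead.
const caseInsensitive = /^w{3}\./i;

console.log(caseInsensitive.test("WWW.google.com")); // true
console.log(caseInsensitive.test("|||.google.com")); // false
```

Note that the `i` flag applies to the whole pattern, so it would also let you shorten classes like `[a-zA-Z]` to `[a-z]` elsewhere in the regex.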

1 Answer


You need to make the group matching www. optional:

^((https?):\/\/)?([wW]{3}\.)?[a-zA-Z0-9\-.]{3,}\.[a-zA-Z]{2,}(\.[a-zA-Z]{2,})?$
                            ^

See demo
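
For a quick local check, the adjusted pattern can be tried against the four strings from the question. This is only a minimal sketch: the helper name `isValidWebsite` and the `samples` array are assumptions made for illustration, not part of the original post.

```javascript
// Pattern from the answer: the "www." group is followed by "?" and so is optional.
const urlPattern =
  /^((https?):\/\/)?([wW]{3}\.)?[a-zA-Z0-9\-.]{3,}\.[a-zA-Z]{2,}(\.[a-zA-Z]{2,})?$/;

// Hypothetical helper that wraps the pattern.
function isValidWebsite(input) {
  return urlPattern.test(input);
}

// The four inputs listed in the question.
const samples = [
  "http://www.google.com",
  "www.google.com",
  "google.com",
  "http://google.com",
];

for (const sample of samples) {
  console.log(sample, isValidWebsite(sample)); // each line should print true
}
```

The pattern has no `g` flag, so repeated calls to `test` do not carry state between inputs.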

I am just pointing out how to adjust your regex to match the strings you specified. The topic has already been covered thoroughly on SO. For example, see the post Trying to Validate URL Using JavaScript for how a URL can be validated in JavaScript.

Also, a bit of searching on the web turns up other solutions, such as the one in URL Validation using Regular Expression in Javascript:

^(?:(?:https?|ftp):\/\/)?(?:\S+(?::\S*)?@)?(?:(?!10(?:\.\d{1,3}){3})(?!127(?:\.\d{1,3}){3})(?!169\.254(?:\.\d{1,3}){2})(?!192\.168(?:\.\d{1,3}){2})(?!172\.(?:1[6-9]|2\d|3[0-1])(?:\.\d{1,3}){2})(?:[1-9]\d?|1\d\d|2[01]\d|22[0-3])(?:\.(?:1?\d{1,2}|2[0-4]\d|25[0-5])){2}(?:\.(?:[1-9]\d?|1\d\d|2[0-4]\d|25[0-4]))|(?:(?:[a-z\u00a1-\uffff0-9]+-?)*[a-z\u00a1-\uffff0-9]+)(?:\.(?:[a-z\u00a1-\uffff0-9]+-?)*[a-z\u00a1-\uffff0-9]+)*(?:\.(?:[a-z\u00a1-\uffff]{2,})))(?::\d{2,5})?(?:\/[^\s]*)?$

It is slightly adjusted; see how it works here.

Wiktor Stribiżew
  • I have used the first regex you specified, i.e. `^((https?):\/\/)?([wW]{3}\.)?[a-zA-Z0-9\-.]{3,}\.[a-zA-Z]{2,}(\.[a-zA-Z]{2,})?$`. It works for most cases. One case where it has not worked is: www...google.com :( – Simone Jul 03 '15 at 10:30
  • You mean it should not match, right? Then try adding a negative lookahead `(?![^.]*\.\.)`: `^(?![^.]*\.\.)((https?):\/\/)?([wW]{3}\.)?[a-zA-Z0-9\-.]{3,}\.[a-zA-Z]{2,}(\.[a-zA-Z]{2,})?$`. – Wiktor Stribiżew Jul 03 '15 at 10:35
  • Yes, this works, but again not for wwww.google.com. – Simone Jul 03 '15 at 11:09
  • @Simone: Surely it won't, we specified `w` to repeat only 3 times: `[wW]{3}`. If you plan to allow more, just add `,`: `[wW]{3,}`. – Wiktor Stribiżew Jul 03 '15 at 11:20
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/82286/discussion-between-simone-and-stribizhev). – Simone Jul 03 '15 at 11:22
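
The lookahead suggested in the comments can be checked with a short JavaScript sketch. The names `firstDotOnly` and `anyDoubleDot` are made up here, and the second pattern is my own assumption about a stricter variant, not something proposed in the thread.

```javascript
// Pattern from the comment: (?![^.]*\.\.) rejects input whose FIRST dot
// is immediately followed by another dot.
const firstDotOnly =
  /^(?![^.]*\.\.)((https?):\/\/)?([wW]{3}\.)?[a-zA-Z0-9\-.]{3,}\.[a-zA-Z]{2,}(\.[a-zA-Z]{2,})?$/;

console.log(firstDotOnly.test("www...google.com")); // false
console.log(firstDotOnly.test("google.com"));       // true
// A double dot after the first dot is not caught by this lookahead:
console.log(firstDotOnly.test("www.google..com"));  // true

// Hypothetical variant: (?!.*\.\.) rejects ".." anywhere in the input.
const anyDoubleDot =
  /^(?!.*\.\.)((https?):\/\/)?([wW]{3}\.)?[a-zA-Z0-9\-.]{3,}\.[a-zA-Z]{2,}(\.[a-zA-Z]{2,})?$/;

console.log(anyDoubleDot.test("www.google..com"));       // false
console.log(anyDoubleDot.test("http://www.google.com")); // true

// "wwww" is also a valid run for the hostname class [a-zA-Z0-9\-.],
// so this input is accepted even though [wW]{3}\. does not match it.
console.log(firstDotOnly.test("wwww.google.com")); // true
```

As far as I can trace it, wwww.google.com is accepted whether or not `{3}` is changed to `{3,}`, because the optional www. group is simply skipped and the hostname part consumes the four w characters.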