I have a text field where the user enters a URL, and I need to validate that its format is valid. I need a regular expression that rejects invalid URLs such as the ones below.

http://www.google.com//test/index.html //Because of double slash after host name

http:/www.google.com/test/index.html //Missing double slash for protocol

I tried the code below. It works for the second case but not for the first:

function test(url) {
    var exp=/^(https?:\/\/)?((([a-z\d]([a-z\d-]*[a-z\d])*)\.)+[a-z]{2,}|((\d{1,3}\.){3}\d{1,3}))(\:\d+)?(\/[-a-z\d%_.~+]*)*(\?[;&a-z\d%_.~+=-]*)?(\#[-a-z\d_]*)?$/;
    return exp.test(url);
}

var firstCase="http://www.google.com//test/index.html";

alert(test(firstCase)); // true, but should be false

var secondCase = "http:/www.google.com/test/index.html";

alert(test(secondCase)); // false, as expected

var thirdCase="http://www.google.com/test//index.html";

alert(test(thirdCase)); // true, a double slash inside the path is accepted too
  • [It's working fine for me](http://jsfiddle.net/Nsisodia91/gqwvLze2/). Can you please elaborate on what's not working here? – Narendrasingh Sisodia Sep 02 '15 at 11:09
  • possible duplicate of [What is a good regular expression to match a URL?](http://stackoverflow.com/questions/3809401/what-is-a-good-regular-expression-to-match-a-url) – piotrwest Sep 02 '15 at 11:10
  • Try this tool: https://regex101.com/ It will allow you to see which sections are matching. – Derek Sep 02 '15 at 11:11
  • Two slashes are usually treated as a single one, also, you should account for other protocols, like mailto, ftp... – Ruan Mendes Sep 02 '15 at 11:12
  • possible duplicate of [What is the best regular expression to check if a string is a valid URL?](http://stackoverflow.com/questions/161738/what-is-the-best-regular-expression-to-check-if-a-string-is-a-valid-url) – Dropout Sep 02 '15 at 11:20
  • `//` is a valid path. See https://en.wikipedia.org/wiki/// – Salman A Sep 02 '15 at 11:36

1 Answer

This regex rejects both of your invalid cases. The "/" needs a question mark after it to indicate zero or one occurrence. Previously it was inside a group quantified with *, meaning multiple slashes were allowed.

function test(url) {
    var exp=/^(https?:\/\/)?((([a-z\d]([a-z\d-]*[a-z\d])*)\.)+[a-z]{2,}|((\d{1,3}\.){3}\d{1,3}))(\:\d+)?(\/?)([-a-z\d%_.~+]*)*(\?[;&a-z\d%_.~+=-]*)?(\#[-a-z\d_]*)?$/;
    return exp.test(url);
}

alert(test("http://www.google.com//bla")); //false
alert(test("http:/www.google.com/test/index.html")); //false
alert(test("http://www.google.com/bla")); //true

To be more specific: the slash now sits in its own optional group, (\/?), ahead of ([-a-z\d%_.~+]*). Previously it was inside the repeated group, (\/[-a-z\d%_.~+]*)*, and was therefore allowed multiple times.
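
One caveat: because the pattern above permits at most one slash after the host, it also rejects multi-segment paths such as http://www.google.com/test/index.html. A possible variation, sketched here as an assumption rather than a definitive fix, keeps the slash inside the repeated group but requires every path segment to be non-empty (+ instead of *), with one optional trailing slash. The function name isValidUrl is made up for this example. The rest of the pattern is the one from the question, apart from dropping the redundant escapes on : and #.

```javascript
// Hypothetical helper: same pattern as in the question, except for the path part.
function isValidUrl(url) {
    // (\/[-a-z\d%_.~+]+)*  -> zero or more segments, each a slash plus at least one character
    // \/?                  -> one optional trailing slash
    var exp = /^(https?:\/\/)?((([a-z\d]([a-z\d-]*[a-z\d])*)\.)+[a-z]{2,}|((\d{1,3}\.){3}\d{1,3}))(:\d+)?(\/[-a-z\d%_.~+]+)*\/?(\?[;&a-z\d%_.~+=-]*)?(#[-a-z\d_]*)?$/;
    return exp.test(url);
}

console.log(isValidUrl("http://www.google.com//test/index.html")); // false
console.log(isValidUrl("http:/www.google.com/test/index.html"));   // false
console.log(isValidUrl("http://www.google.com/test//index.html")); // false
console.log(isValidUrl("http://www.google.com/test/index.html"));  // true
console.log(isValidUrl("http://www.google.com/"));                 // true
```

Since an empty segment can no longer match, two consecutive slashes fail anywhere in the path, while ordinary nested paths still pass. As noted in the comments, // is technically a valid path, so this is only appropriate if you deliberately want to reject it.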

BobbyTables