I've got a JavaScript question.

I want to create a regular expression that detects a URL in a given string. I've pasted my current regular expression below. It doesn't seem to cover all cases, for example google.com/index.html?2012 or www.google.com/dir/file.aspx?isc=2012.

Any ideas on what I need to do to make it work, or perhaps a better regular expression (from somewhere else) that I can use?

("(^|\\s)(((http|https)(:\/\/))?(([a-zA-Z0-9]+[.]{1})+[a-zA-z0-9]+(\/{1}[a-zA-Z0-9\-]+)*\/?))", "i")
Chango
LewisLin

1 Answer

I use this regex, and it works for most cases. The original version is at http://daringfireball.net/2010/07/improved_regex_for_matching_urls; I had to modify it to avoid matching multiple consecutive '.'s in the URL.

/\b((?:[a-z][\w-]+:(?:\/{1,3}|[a-z0-9%])|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}\/)(?:(?:[^\s().]+[.]?)+|\((?:[^\s()]+|(?:\([^\s()]+\)))*\))+(?:\((?:[^\s()]+|(?:\([^\s()]+\)))*\)|[^\s`!()\[\]{};:'".,?«»“”‘’]))/gi
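
As a quick check, the pattern can be run against the two URLs from the question. This is only a sketch: the names `urlPattern`, `sample`, `found` and `bare` and the sample sentences are made up for illustration, and the regex is written on a single line because a JavaScript regex literal cannot span lines.

```javascript
// The pattern from this answer, on a single line.
// `urlPattern` and `sample` are illustrative names, not part of the original post.
const urlPattern = /\b((?:[a-z][\w-]+:(?:\/{1,3}|[a-z0-9%])|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}\/)(?:(?:[^\s().]+[.]?)+|\((?:[^\s()]+|(?:\([^\s()]+\)))*\))+(?:\((?:[^\s()]+|(?:\([^\s()]+\)))*\)|[^\s`!()\[\]{};:'".,?«»“”‘’]))/gi;

const sample = "See google.com/index.html?2012 or www.google.com/dir/file.aspx?isc=2012.";

// With the g flag, String.prototype.match returns every full match, or null if there is none.
const found = sample.match(urlPattern) || [];
console.log(found);
// → ["google.com/index.html?2012", "www.google.com/dir/file.aspx?isc=2012"]

// A bare domain has no scheme, no "www." prefix and no slash after the TLD,
// so this version finds nothing.
const bare = "visit google.com today".match(urlPattern);
console.log(bare); // → null
```

Note that the trailing full stop of the sentence is not included in the second match, because the last character of a match may not be closing punctuation.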

If you want the protocol at the beginning to be optional, use this instead:

/\b((?:[a-z][\w-]+:(?:\/{1,3}|[a-z0-9%])|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}\/)?(?:(?:[^\s().]+[.]?)+|\((?:[^\s()]+|(?:\([^\s()]+\)))*\))+(?:\((?:[^\s()]+|(?:\([^\s()]+\)))*\)|[^\s`!()\[\]{};:'".,?«»“”‘’]))/gi
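
To see how liberal the optional-prefix version is, here is a small sketch. Tracing the pattern by hand, once the prefix is optional it seems to accept ordinary words such as hello as well as abc.def, so some post-filtering is probably needed. The names `loosePattern`, `KNOWN_TLDS` and `looksLikeUrl`, and the tiny TLD allowlist, are my own assumptions for illustration and not part of the original answer.

```javascript
// The optional-prefix pattern from this answer, on a single line.
const loosePattern = /\b((?:[a-z][\w-]+:(?:\/{1,3}|[a-z0-9%])|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}\/)?(?:(?:[^\s().]+[.]?)+|\((?:[^\s()]+|(?:\([^\s()]+\)))*\))+(?:\((?:[^\s()]+|(?:\([^\s()]+\)))*\)|[^\s`!()\[\]{};:'".,?«»“”‘’]))/gi;

const candidates = "google.com abc.def hello".match(loosePattern) || [];
console.log(candidates); // → ["google.com", "abc.def", "hello"]

// Hypothetical post-filter: keep a candidate only if its host ends in a TLD
// from a small allowlist. A real allowlist would need to be much longer.
const KNOWN_TLDS = new Set(["com", "edu", "org", "net"]);

function looksLikeUrl(candidate) {
  // Optional scheme, one or more dot-terminated labels, then the TLD,
  // which must be followed by "/", "?", "#", ":" or the end of the string.
  const host = /^(?:[a-z][\w-]+:\/{1,3})?(?:[a-z0-9-]+\.)+([a-z]{2,})(?=[\/?#:]|$)/i.exec(candidate);
  return host !== null && KNOWN_TLDS.has(host[1].toLowerCase());
}

console.log(candidates.filter(looksLikeUrl)); // → ["google.com"]
```

The allowlist trades one kind of error for another: abc.def is rejected, but so is any real domain whose TLD is missing from the list.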
Narendra Yadala
  • This regex works for the test cases provided as well as basic ones. I checked it using http://regexpal.com/. – Gibron Sep 26 '11 at 18:00
  • Thanks! This detects a lot of cases that my original regular expression didn't catch. However, it doesn't detect google.com or stanford.edu. – LewisLin Sep 26 '11 at 18:25
  • @LewisLin Yes, the regex needs a valid protocol (or a www. prefix, or a slash after the domain) at the start. The problem with recognizing google.com is that you end up being too liberal and also recognize patterns such as abc.def. – Narendra Yadala Sep 26 '11 at 18:50
  • Thanks for the reply @Narenda. Case closed! – LewisLin Sep 26 '11 at 19:00
  • @LewisLin Check out the edited answer, which handles an optional protocol but also matches abc.def, so it is basically a tradeoff you have to make. – Narendra Yadala Sep 26 '11 at 19:01