I have a regular expression for matching URLs. It should not match any of the following strings:

  1. www.yammercom
  2. http://www.yammer
  3. www.yammer..com

Regular Expression

/^((((ht|f)tps?:\/\/)*(www\.)?|((ht|f)tps?:\/\/)(www\.)?)*)[a-z0-9-\.]+\.[a-z]{2,4}\/?([^\s<>\#%"\,\{\}\\|\\\^\[\]`]+)?$/

What mistake am I making in my regular expression?

Try it

Rubular demo

Sam

3 Answers

If you need a regex to match a URL, take a look here:

http://mathiasbynens.be/demo/url-regex

There you'll find some insanely long regexes, most of which pass only about half of the tests. That should tell you why trying to validate a URL with a single line of regex is not a good idea.
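As one possible alternative to a single regex, here is a minimal sketch that parses the string with Ruby's standard uri library first and then checks the host label by label. The helper name plausible_url? is made up for illustration, and the 2-4 letter limit on the last label is an assumption that simply mirrors the patterns in this thread.

```ruby
require "uri"

# Hypothetical helper, not part of any library.
# Assumption: the last host label must be 2-4 letters, as in the thread's regexes.
def plausible_url?(text)
  # Add a scheme when none is given, so that URI.parse fills in the host.
  candidate = text.match?(%r{\A[a-z][a-z0-9+.-]*://}i) ? text : "http://#{text}"
  uri = URI.parse(candidate)
  return false unless %w[http https ftp].include?(uri.scheme)

  host = uri.host
  return false if host.nil? || host.empty?

  # The -1 limit keeps empty labels, so "a..b" is split into ["a", "", "b"].
  labels = host.split(".", -1)
  return false if labels.length < 2
  return false unless labels.all? { |label| label.match?(/\A[a-z0-9]([a-z0-9-]*[a-z0-9])?\z/i) }

  labels.last.match?(/\A[a-z]{2,4}\z/i)
rescue URI::InvalidURIError
  false
end

["www.yammer.com", "www.yammercom", "http://www.yammer", "www.yammer..com"].each do |s|
  puts "#{s}: #{plausible_url?(s)}"
end
```

Splitting the work this way keeps each rule small enough to read and test on its own.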

TheOnly92
  • I've stated the changes on the first line; here is the correct one: http://rubular.com/r/frCdtvj3Vp – TheOnly92 Oct 16 '13 at 08:12
  • I can't understand what you are trying to say. The regex in the link you provided, http://rubular.com/r/frCdtvj3Vp, accepts www.yammer, which is incorrect as per my requirement. – Sam Oct 16 '13 at 08:54
  • ^((((ht|f)tps?:\/\/)*(www\.)?|((ht|f)tps?:\/\/)(www\.)?)*)[a-z0-9\-\.]+\.[a-z]{2,4}\/??$ seems to work, but the 3rd condition is still not satisfied. – Sam Oct 16 '13 at 09:06
  • Okay, I had only looked at the title and found that the regex was incorrect. I've edited the answer to give an expression that matches URLs. – TheOnly92 Oct 16 '13 at 09:29

After rewriting your RE in free-spacing mode, it turns out that there are some problems.

You do not need * to match the protocol (it appears at most once, so ? is enough), and the last [] does not make much sense to me (maybe you can update your question).
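To make those problems concrete, here is a small sketch using reduced fragments of the question's pattern (not the full pattern). The constant names are made up for illustration, and \A and \z are used instead of ^ and $ only because Ruby's ^ and $ match at line boundaries.

```ruby
# Fragment 1: the host class itself contains ".", so consecutive dots slip through.
HOST_WITH_DOT = /\A[a-z0-9.-]+\.[a-z]{2,4}\z/
# One possible fix: match dot-separated labels, with "." only between them.
HOST_LABELS = /\A[a-z0-9-]+(\.[a-z0-9-]+)*\.[a-z]{2,4}\z/

# Fragment 2: the slash is optional, so the free-form tail can absorb
# whatever the 2-4 letter TLD did not consume.
LOOSE_TAIL = /\A[a-z0-9-]+\.[a-z]{2,4}\/?([^\s]+)?\z/
# One possible fix: a path may only start with a slash.
STRICT_TAIL = /\A[a-z0-9-]+\.[a-z]{2,4}(\/[^\s]*)?\z/

puts HOST_WITH_DOT.match?("www.yammer..com")  # true
puts HOST_LABELS.match?("www.yammer..com")    # false
puts LOOSE_TAIL.match("www.yammercom")[1]     # ercom
puts STRICT_TAIL.match?("www.yammercom")      # false
```

In the loose version, "yamm" is taken as the TLD and "ercom" as the path, which is how www.yammercom and http://www.yammer get through.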

Here is my tweak; see the online demo: http://rubular.com/r/12J7ZRo4Qx

/^(
    (https?|ftp)
    :\/\/                      #protocol
  )?                           #is optional
  (www\.)?                     #optional www
  [-a-z0-9]+                   #place - first so it means literally
  \.
  [a-z]{2,4}                   #trailing hostname
  (                            #match pathname
    \/                         #a slash is required
    [-a-z0-9]+                 #same as hostname
  )*
  $                            #anchor the end so trailing characters are rejected
/x

The x flag enables free-spacing mode, as in Perl.
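A quick way to try the tweak outside Rubular is sketched below. It writes the pattern with an explicit end anchor $, so trailing characters cannot be ignored; the constant name URL_PATTERN and the sample list are made up for illustration.

```ruby
URL_PATTERN = /^(
    (https?|ftp)
    :\/\/            # protocol
  )?                 # is optional
  (www\.)?           # optional www
  [-a-z0-9]+         # hostname label, hyphen first so it is literal
  \.
  [a-z]{2,4}         # trailing hostname
  (                  # pathname
    \/               # a slash is required
    [-a-z0-9]+       # same characters as the hostname
  )*
  $/x

samples = [
  "www.yammer.com",              # expected to match
  "http://www.yammer.com/about", # expected to match
  "www.yammercom",               # should be rejected
  "http://www.yammer",           # should be rejected
  "www.yammer..com"              # should be rejected
]

samples.each { |s| puts "#{s}: #{URL_PATTERN.match?(s)}" }
```

Note that this demonstration pattern allows only one label before the TLD, so a host such as api.yammer.com would not match it.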

Anyway, matching URLs is a common task. My version is neither good nor bulletproof; it's just for demonstration. If you need a more solid RE, check http://net.tutsplus.com/tutorials/other/8-regular-expressions-you-should-know/.

The book Mastering Regular Expressions is another definitive guide.

Herrington Darkholme

I think this will work:

/^((((ht|f)tps?:\/\/)*(www\.)?|((ht|f)tps?:\/\/)(www\.)?)*)[a-z0-9\-\.]+\.[a-z]{2,4}\/?([^\s<>\#%"\,\{\}\\|\\\^\[\]`]+)?$/

You have to escape the -.
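For what it's worth, here is a small check of what escaping the hyphen changes, using only the host part of the pattern. The constant names are made up for illustration. As far as I can tell, Ruby already reads an unescaped hyphen right after a completed range as a literal (with a warning), so the escaped class should accept the same characters, including the dot.

```ruby
# Escaped hyphen, as in the pattern above.
ESCAPED_CLASS = /\A[a-z0-9\-\.]+\z/
# Hyphen placed first, another common way to make it literal.
LEADING_CLASS = /\A[-a-z0-9.]+\z/

# Both spellings accept and reject the same sample strings.
%w[abc a-b a.b 0-9 a_b].each do |s|
  puts "#{s}: #{ESCAPED_CLASS.match?(s)} #{LEADING_CLASS.match?(s)}"
end

# The class still contains ".", so the host part still accepts a double dot.
ESCAPED_HOST = /\A[a-z0-9\-\.]+\.[a-z]{2,4}\z/
puts ESCAPED_HOST.match?("www.yammer..com")  # true
```

So escaping makes the class less ambiguous to read, but the 3rd case from the question would need the dot taken out of the class as well.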

Manolo