How to treat an exclamation mark as part of a URL in a regex

Example: the original text is: bla1 bla2 http://www.peckale.com/#!contact/c11m6 bla3

I need to find the URL http://www.peckale.com/#!contact/c11m6 with a regex.

I am using the expression:

((www\.|(http|https|ftp|news|file)+\:\/\/)?[_.a-zA-Z0-9-]+\.[a-zA-Z0-9\/_:@=.+?,##%&~-]*[^.|\'|\# |!|\(|?|,| |>|<|;|\)]*)

But the match is cut off right after the #, so I get http://www.peckale.com/# instead of the full URL.

Best regards, Shahar

anubhava
Shahar Zer
  • http://stackoverflow.com/questions/161738/what-is-the-best-regular-expression-to-check-if-a-string-is-a-valid-url – Igor Sep 02 '14 at 08:56

2 Answers

If you want to match a URL embedded in running text, use this:

(?:www\.|(?:https?|ftp|news|file):\/\/)[-A-Z0-9+&@#\/%?=~_|!:,.;]*[-A-Z0-9+&@#\/%=~_|]

Be aware that the final character class ensures that, when a URL is part of some text, trailing punctuation such as a comma or full stop is not treated as part of the URL. Symbols such as the exclamation mark (!) are matched only in the middle of the URL, not at its end. Note also that the character classes list only uppercase letters, so the pattern must be applied with the case-insensitive flag.

See the demo: http://regex101.com/r/uG0mD2/3
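
For a quick local check, here is a minimal Python sketch of the pattern above. The choice of Python and the second sample sentence (including the example.com URL) are my own assumptions for illustration. The pattern is compiled with `re.IGNORECASE` because its classes list only `A-Z`.

```python
import re

# Pattern from this answer, split across lines for readability.
# re.IGNORECASE is required because the classes only list A-Z.
URL_RE = re.compile(
    r"(?:www\.|(?:https?|ftp|news|file):\/\/)"
    r"[-A-Z0-9+&@#\/%?=~_|!:,.;]*"
    r"[-A-Z0-9+&@#\/%=~_|]",
    re.IGNORECASE,
)

# Sample text from the question: the "!" sits in the middle of the URL.
text = "bla1 bla2 http://www.peckale.com/#!contact/c11m6 bla3"
print(URL_RE.findall(text))
# ['http://www.peckale.com/#!contact/c11m6']

# Hypothetical sentence: trailing "," and "!" are left out of the matches.
trailing = "Visit http://www.peckale.com/#!contact/c11m6, or http://example.com/wow!"
print(URL_RE.findall(trailing))
# ['http://www.peckale.com/#!contact/c11m6', 'http://example.com/wow']
```

Because the pattern contains only non-capturing groups, `findall` returns the full matches.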

Oscar Hermosilla
Just remove the ! from the last negated character class [^.|\'|\# |!|\(|?|,| |>|<|;|\)].

((www\.|(http|https|ftp|news|file)+\:\/\/)?[_.a-zA-Z0-9-]+\.[a-zA-Z0-9\/_:@=.+?,##%&~-]*[^.|\'|\#|\(|?|,| |>|<|;|\)]*)
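
As a rough way to see the difference, the sketch below runs the question's expression and the modified one against the sample text. Python and the names `ORIGINAL`, `FIXED`, and `first_match` are assumptions made for illustration only.

```python
import re

# The asker's expression: "!" is excluded by the final negated class.
ORIGINAL = (
    r"((www\.|(http|https|ftp|news|file)+\:\/\/)?[_.a-zA-Z0-9-]+\."
    r"[a-zA-Z0-9\/_:@=.+?,##%&~-]*[^.|\'|\# |!|\(|?|,| |>|<|;|\)]*)"
)

# Same expression with "!" removed from the final negated class.
FIXED = (
    r"((www\.|(http|https|ftp|news|file)+\:\/\/)?[_.a-zA-Z0-9-]+\."
    r"[a-zA-Z0-9\/_:@=.+?,##%&~-]*[^.|\'|\#|\(|?|,| |>|<|;|\)]*)"
)

text = "bla1 bla2 http://www.peckale.com/#!contact/c11m6 bla3"


def first_match(pattern: str, s: str) -> str:
    """Return the first full match of pattern in s, or '' if none."""
    m = re.search(pattern, s)
    return m.group(0) if m else ""


print(first_match(ORIGINAL, text))  # http://www.peckale.com/#
print(first_match(FIXED, text))     # http://www.peckale.com/#!contact/c11m6
```

The match stops at the space after the URL because the space is still listed in the negated class.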

Also, you don't need to include the | symbol inside a character class. There it does not act as alternation; it simply matches a literal |.

So [^.|\'|\#|\(|?|,| |>|<|;|\)] can be written as [^.'#\(?, ><;\)]
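
A small Python sketch (the language and the variable names are assumptions for illustration) shows that the two classes differ only in how they treat a literal | character:

```python
import re

# Negated class as written in the question (minus "!"), with "|" separators.
with_pipes = re.compile(r"[^.|\'|\#|\(|?|,| |>|<|;|\)]")
# Simplified class without the "|" separators.
without_pipes = re.compile(r"[^.'#\(?, ><;\)]")

# Both classes reject every character that was meant to be excluded.
for ch in ".'#(?, ><;)":
    assert not with_pipes.fullmatch(ch)
    assert not without_pipes.fullmatch(ch)

# Inside [...] the "|" is a literal, so only the first class also rejects it.
print(bool(with_pipes.fullmatch("|")))     # False
print(bool(without_pipes.fullmatch("|")))  # True

# Both classes accept "!", which is what the URL in the question needs.
print(bool(with_pipes.fullmatch("!")), bool(without_pipes.fullmatch("!")))  # True True
```

If a literal | should stay excluded from URLs, keep a single | in the class; repeating it has no extra effect.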

Avinash Raj