OK. I got this regexp from here (web links only, second version). Everything works fine except for one thing: it also swallows part of any surrounding BBCode.

Regexp

(?i)\b((?:https?://|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}/)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:'".,<>?«»“”‘’]))

What fails

[img]http://example.foo/something.png[/img]

When I apply the regexp I get http://example.foo/something.png[/img, which is wrong: the match should stop at something.png. :P Any regexp gurus here?
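The over-match can be reproduced with a short sketch. Python's `re` module is used here purely for illustration (the question does not say which language is in use), and the names `URL_RE` and `sample` are made up for this example. The pattern itself is copied unchanged from the question.

```python
import re

# The pattern from the question, unchanged.
URL_RE = re.compile(
    r"""(?i)\b((?:https?://|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}/)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:'".,<>?«»“”‘’]))"""
)

sample = "[img]http://example.foo/something.png[/img]"

# '[' and '/' are allowed in the middle of a URL by this pattern, so the
# match runs into the closing tag and only the final ']' is rejected.
found = URL_RE.search(sample).group(0)
print(found)  # http://example.foo/something.png[/img
```

The pattern only restricts the last character of the match, so `[/img` is accepted as part of the URL and just the trailing `]` is dropped.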

ThomK
  • Do you mean that you want it to also accept a URL surrounded by '[img]' tags? i.e. "http://example.foo/something.png" works but "[img]http://example.foo/something.png[/img]" fails? – Abe Schneider Jul 18 '11 at 15:22
  • Exactly. I'm parsing user input, and sometimes users write BBCode. I want to get all links, both "naked" (without BBCode) and surrounded by BBCode. Edit: and that regexp matches too much, http://example.foo/something.png[/img instead of http://example.foo/something.png. – ThomK Jul 18 '11 at 15:34
  • Well, there is a problem in that `[` and `]` are reserved characters and are permitted in URIs. I would use a BBCode library to parse out the BBCode first and then look for URLs with the regex; otherwise, if you reject square brackets, you could potentially reject valid URIs. – Brendan Jul 18 '11 at 15:42
  • Having said that, it seems that square brackets are only valid in certain places in a URI, so it may be possible to rewrite the regex accordingly (see [this question and its answers](http://stackoverflow.com/questions/1547899/)) ... – Brendan Jul 18 '11 at 15:52
  • I did as you suggested. BB code strip -> match all links, iterate them and change in original value. Works great. :) – ThomK Jul 18 '11 at 17:37
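The approach settled on in the comments (strip the BBCode first, then match links) might look roughly like the sketch below. It is only an illustration, again in Python: `BBCODE_TAG_RE`, `strip_bbcode` and `extract_links` are hypothetical names, and the tag pattern is a deliberately simplified assumption that a real BBCode library would handle more robustly. Tags of the form `[tag=value]` keep their value, so a link written as `[url=...]` is not lost.

```python
import re

# The URL pattern from the question, unchanged.
URL_RE = re.compile(
    r"""(?i)\b((?:https?://|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}/)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:'".,<>?«»“”‘’]))"""
)

# Assumption: a simplified BBCode tag looks like [tag], [/tag] or [tag=value].
BBCODE_TAG_RE = re.compile(r"\[/?[a-z*]+(?:=([^\]\s]*))?\]", re.IGNORECASE)


def strip_bbcode(text):
    """Replace each tag with spaces, keeping the value of [tag=value]."""
    return BBCODE_TAG_RE.sub(lambda m: " " + (m.group(1) or "") + " ", text)


def extract_links(text):
    """Return every URL found once the BBCode tags are out of the way."""
    return [m.group(0) for m in URL_RE.finditer(strip_bbcode(text))]


links = extract_links(
    "[img]http://example.foo/something.png[/img] and www.example.com/page"
)
print(links)
```

Replacing each tag with a space rather than an empty string keeps a URL from being glued to whatever text followed the tag. The extracted links can then be located and rewritten in the original input, as described in the last comment.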

1 Answer

This is a little rough, but try this:

// Single-quoted so that PHP does not read \1 as an octal escape; the dot after www is escaped.
$preg = '%(?:https?://|www\d{0,3}\.)(?:[\/A-Za-z0-9-_.]+(?!(?:<|\[/([A-Za-z0-9])+?\1)))%';

I've tested it and it should work as expected if I understood your question correctly.

pb149