
I'm using the following regex to find IMG bbcodes and their contents in forum posts:

~\[img(?:=[\'"]?([^,]*?)(?:,[^]\'"]+)?[\'"]?)?]([^\[]+)?\[/img]~i

This works so far, but I need to define exceptions: I must find all IMG bbcodes that are NOT surrounded by a TT or CODE bbcode. I'm not trying to parse BBCodes (the forum software already does that).

So I want the IMG bbcode from here (which already works with the regex above):

Hello, this is an example: [img]xxx[/img] - Yay!

but not from here:

[tt]this is a test [img]xxx[/img] yolo![/tt]

and not from here either:

[code=php]<?php
echo '[img=xxx][/img]';[/code]

Any idea how to achieve this? I'm using PHP, in case a regex-only solution is not possible.

SGL
  • possible duplicate of [Best way to parse bbcode](http://stackoverflow.com/q/488963) (unless you add a compelling reason to reinvent the wheel, or show prior attempts to write a recursive regex) – mario Oct 20 '14 at 20:35
  • I don't want to include or rely on a complete library for this simple (I guess it's simple) task. I only know one way to achieve this, but it would require a second regex and additional processing. – SGL Oct 20 '14 at 20:43
  • If `tt`/`code` blocks are not nested, you can [skip](http://perldoc.perl.org/perlre.html#Special-Backtracking-Control-Verbs) the unwanted ones: `\[(tt|code)\b[^]]*\].*?\[/\1\](*SKIP)(*F)|`... See [example at regex101.com](http://regex101.com/r/iS3gO1/1) – Jonny 5 Oct 21 '14 at 09:49
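
The skip approach from the last comment can be sketched in PHP as follows. This is a minimal sketch, not tested against real forum data: it assumes `tt`/`code` blocks are never nested and are always closed, and `findImgTags()` is a hypothetical helper name. The first alternative consumes a whole `tt`/`code` block and then discards it with `(*SKIP)(*F)`; the second alternative is the original IMG regex from the question.

```php
<?php
// Hypothetical helper: returns every [img] bbcode that is not inside [tt] or [code].
// Assumption: tt/code blocks are not nested and always have a closing tag.
function findImgTags(string $post): array
{
    // Branch 1: match a complete tt/code block, then force a failure and
    // tell the engine to resume searching after the block.
    $skip = '\[(tt|code)\b[^]]*\].*?\[/\1\](*SKIP)(*F)';

    // Branch 2: the IMG regex from the question, unchanged.
    $img = '\[img(?:=[\'"]?([^,]*?)(?:,[^]\'"]+)?[\'"]?)?]([^\[]+)?\[/img]';

    // "i" keeps the original case-insensitivity; "s" lets .*? cross line
    // breaks, which is needed for multi-line code blocks.
    preg_match_all('~' . $skip . '|' . $img . '~is', $post, $matches);

    return $matches[0]; // full text of each surviving [img] bbcode
}
```

Because the block pattern adds a capture group in front, the IMG attribute and content move from groups 1 and 2 to groups 2 and 3 of the combined pattern.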

2 Answers

1

You could also use the T-Regx library:

pattern('\[((?:(?!img).)*?)\](?:.*?)\[\/\1\]|\[img.*\](.*?)\[\/img\]')->test($input)
Danon
0

You could use this pattern and check only the second sub-pattern (capture group 2) for your match:

\[((?:(?!img).)*?)\](?:.*?)\[\/\1\]|\[img.*\](.*?)\[\/img\]  

http://regex101.com/r/tF1tX3/2

  • That's nearly what I'm looking for. However, it's imperfect, as you can see here: http://regex101.com/r/tF1tX3/3 - It shows 3 matches, but the regex should only return the last one. – SGL Oct 20 '14 at 21:53
  • If you check against sub-pattern #2 only, it is what you're looking for. – user4160188 Oct 20 '14 at 23:43
  • As you can see on the link, it's not. Furthermore, the functionality of the main regex is broken, as you can see here: http://regex101.com/r/tF1tX3/4 - The original regex matches xyz in the 4th match, while your regex matches xxx (which is wrong). – SGL Oct 21 '14 at 00:22
  • If this is what you're looking for: [http://regex101.com/r/tF1tX3/5](http://regex101.com/r/tF1tX3/5), I will update my answer. Look at sub-patterns 2 & 3. – user4160188 Oct 21 '14 at 03:35
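
The "check only one sub-pattern" idea discussed in these comments can also be sketched without backtracking verbs. The code below is an illustration under assumptions, not the pattern from this answer: it reuses the block pattern from Jonny 5's comment and the original IMG regex from the question, wraps the IMG branch in a named group, and throws away every match in which that group did not participate. `extractImgTags()` and the group names `img`, `attr` and `content` are hypothetical, and `tt`/`code` blocks are assumed to be closed and not nested.

```php
<?php
// Hypothetical helper: match tt/code blocks and [img] tags in one pass,
// then keep only the matches that came from the [img] branch.
function extractImgTags(string $post): array
{
    // Branch 1: a whole tt/code block (matched so its contents get consumed).
    $block = '\[(tt|code)\b[^]]*\].*?\[/\1\]';

    // Branch 2: the question's IMG regex, with named groups added.
    $img = '(?<img>\[img(?:=[\'"]?(?<attr>[^,]*?)(?:,[^]\'"]+)?[\'"]?)?]'
         . '(?<content>[^\[]+)?\[/img])';

    preg_match_all('~' . $block . '|' . $img . '~is', $post, $matches, PREG_SET_ORDER);

    $result = [];
    foreach ($matches as $m) {
        if (empty($m['img'])) {
            continue; // this match was a tt/code block, so ignore it
        }
        $result[] = [
            'tag'     => $m['img'],
            'attr'    => $m['attr'] ?? '',    // value after "=", if any
            'content' => $m['content'] ?? '', // text between the tags, if any
        ];
    }

    return $result;
}
```

The trade-off compared with `(*SKIP)(*F)` is one extra filtering step in PHP, in exchange for a pattern that relies only on plain alternation.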