I have the following string I need to parse:

[QUOTE=Mark]
  [QUOTE=Jack]
    How are you doing Mark?
  [/QUOTE]
 Good to hear from you Jack, Im doing fine!
[/QUOTE]

I am basically trying to convert this BBCode into HTML by turning the [quote] blocks into styled DIVs, using the following regex:

text = text.replace(/\[QUOTE=(.*?)]([\s\S]*?)\[\/QUOTE\]/gi, '<div class="quotes"><i>Quote by $1</i><br />$2</div>');

This code properly parses the first set of quotes, but not the nested ones. Any ideas how I could improve the expression?

genesis
Mark
  • Can I ask why you're doing it with javascript? – yoda Sep 11 '11 at 00:02
  • You'd probably need a recursive approach for this. How about loading your quotes as objects (Quote $author $text $subquotes ...)? You could then output to any format you wish. – James P. Sep 11 '11 at 00:03

2 Answers

If that's all you're doing, the solution is much simpler:

text = text.replace(/\[QUOTE=(.*?)\]/gi,
                    '<div class="quotes"><i>Quote by $1</i><br />');
text = text.replace(/\[\/QUOTE\]/gi, '</div>');

Your code works too, but you have to apply it multiple times: twice in this case, three times if there are triply nested quotes, and so on.
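The multi-pass idea can be sketched as a loop that keeps applying the question's original regex until the text stops changing. This is only a minimal sketch: `convertQuotesByPasses` is a hypothetical helper name, and it assumes the quote tags are balanced.

```javascript
// Hypothetical helper (not from the question): repeat the single-regex
// replacement until a pass makes no further change.
function convertQuotesByPasses(text) {
  // Same pattern and replacement as in the question.
  const pattern = /\[QUOTE=(.*?)]([\s\S]*?)\[\/QUOTE\]/gi;
  const replacement = '<div class="quotes"><i>Quote by $1</i><br />$2</div>';
  let previous;
  do {
    previous = text;
    // Each pass pairs every remaining opening tag with the next closing tag.
    text = text.replace(pattern, replacement);
  } while (text !== previous);
  return text;
}

// Usage: one call handles any nesting depth.
const input = '[QUOTE=Mark][QUOTE=Jack]Hi[/QUOTE]Fine[/QUOTE]';
console.log(convertQuotesByPasses(input));
```

For balanced input this ends up with the same markup as the two plain replacements, since each opening tag still becomes one opening `<div>` and each closing tag one `</div>`.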

Alan Moore

When you get into nested levels you lose the "regular" nature of the input. It becomes more "context free", like HTML, which is always a hard spot for regexes.

I suggest you tokenize the string and parse it with something like a recursive descent parser.
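A rough sketch of that approach, assuming only the `[QUOTE=name]` and `[/QUOTE]` tags from the question matter. The names `tokenize`, `parseNodes`, `render` and `bbcodeToHtml` are made up for illustration, and like the original replacement it does no HTML escaping.

```javascript
// Split the input into 'open', 'close' and 'text' tokens.
function tokenize(text) {
  const tokens = [];
  const tagPattern = /\[QUOTE=([^\]]*)\]|\[\/QUOTE\]/gi;
  let lastIndex = 0;
  let match;
  while ((match = tagPattern.exec(text)) !== null) {
    if (match.index > lastIndex) {
      tokens.push({ type: 'text', value: text.slice(lastIndex, match.index) });
    }
    if (match[1] !== undefined) {
      tokens.push({ type: 'open', author: match[1] });
    } else {
      tokens.push({ type: 'close', raw: match[0] });
    }
    lastIndex = tagPattern.lastIndex;
  }
  if (lastIndex < text.length) {
    tokens.push({ type: 'text', value: text.slice(lastIndex) });
  }
  return tokens;
}

// Recursive descent: each opening tag recurses to collect its children.
function parseNodes(tokens, state, depth) {
  const nodes = [];
  while (state.pos < tokens.length) {
    const token = tokens[state.pos];
    if (token.type === 'close') {
      if (depth > 0) break; // the enclosing quote consumes this tag
      // Stray closing tag at top level: keep it as literal text.
      nodes.push({ type: 'text', value: token.raw });
      state.pos++;
      continue;
    }
    state.pos++;
    if (token.type === 'open') {
      const children = parseNodes(tokens, state, depth + 1);
      // Consume the matching close; an unclosed quote ends at end of input.
      if (state.pos < tokens.length) state.pos++;
      nodes.push({ type: 'quote', author: token.author, children });
    } else {
      nodes.push(token);
    }
  }
  return nodes;
}

// Turn the tree into the same markup the question's replacement produces.
function render(nodes) {
  return nodes.map(node => node.type === 'quote'
    ? '<div class="quotes"><i>Quote by ' + node.author + '</i><br />' +
      render(node.children) + '</div>'
    : node.value).join('');
}

function bbcodeToHtml(text) {
  return render(parseNodes(tokenize(text), { pos: 0 }, 0));
}

console.log(bbcodeToHtml('[QUOTE=Mark][QUOTE=Jack]Hi[/QUOTE]Fine[/QUOTE]'));
```

The intermediate tree (quote nodes with an `author` and `children`) can be rendered to other formats too, and unbalanced tags are handled explicitly instead of producing mismatched DIVs.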

Andrew White
  • Wouldn't `Atomic Grouping` in `PCRE -PHP` solve the subject (as well?) ? Just curious :) – yoda Sep 11 '11 at 00:08
  • @yoda: Not atomic grouping, but it is trivial to parse nested BB in PHP using regex. Thing is, he is using JS (which has far less powerful regexes as far as I remember). – NikiC Sep 11 '11 at 13:08