
I have a BBCode string that I want to convert to divs, but the div is not closed properly.

The example string I am using is this:

[quote]Quote by: user1 [quote]Quote by: user2 ads[/quote]Test[/quote]Testing 2.

This results in:

<div class="quote" style="margin-left:10px;margin-top:10px;">
  Quote by: user1 
  [quote]Quote by: user2 ads
</div>
  Test[/quote]Testing 2.

The inner quote is not converted properly.

My JavaScript function looks like this:

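function bbcode_parser(str) {
  // Patterns to search for, in order.
  var search = [
    /\[b\](.*?)\[\/b\]/g,
    /\[i\](.*?)\[\/i\]/g,
    /\[quote\](.*?)\[\/quote\]/g,
    /\[\*\]\s?(.*?)\n/g
  ];

  // Replacements, matched to the patterns above by index.
  // Note: there is no entry for the [*] pattern.
  var replace = [
    "<strong>$1</strong>",
    "<em>$1</em>",
    "<div class='quote' style='margin-left:10px;margin-top:10px;'>$1</div>"
  ];

  // Only apply patterns that have a replacement, so nothing is
  // replaced with the string "undefined".
  for (var i = 0; i < replace.length; i++) {
    str = str.replace(search[i], replace[i]);
  }

  return str;
}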

I have provided a JSFiddle for you to see it in action: http://jsfiddle.net/gRaFW/2/

Please help :)

Sir
  • 1
    Humm... nesting with regex. So problematic it should be called nastying. – acdcjunior Jul 01 '13 at 03:35
  • Your regex checks for properly opened and closed `[quote]`s. It can't handle nesting properly. Honestly, your best bet (using regex) is just to replace every `[quote]` with `<div>` and every `[/quote]` with `</div>`, without checking whether they are properly opened or closed. You could count them before, if you are worried about that.
    – acdcjunior Jul 01 '13 at 03:37
  • Is there a better option without regex? – Sir Jul 01 '13 at 03:39
  • BBCode is common, someone probably already did it: http://stackoverflow.com/questions/1843320/any-good-javascript-bbcode-parser http://blogs.stonesteps.ca/showpost.aspx?pid=33 – acdcjunior Jul 01 '13 at 03:41
  • Hmm that looks hellish complicated in comparison to my attempt =/ – Sir Jul 01 '13 at 03:46
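
The suggestion in the comments above (replace each tag on its own, after counting them) could be sketched roughly as follows. The function names `countMatches` and `convertQuotesIndependently` are made up for this illustration, and leaving the input untouched when the counts differ is an assumption, not something the commenter specified.

```javascript
// Count how many times a global pattern occurs in a string.
function countMatches(str, pattern) {
  var matches = str.match(pattern);
  return matches ? matches.length : 0;
}

// Replace every [quote] and [/quote] independently, but only when
// the number of opening and closing tags is the same.
function convertQuotesIndependently(str) {
  var opens = countMatches(str, /\[quote\]/g);
  var closes = countMatches(str, /\[\/quote\]/g);
  if (opens !== closes) {
    return str; // assumption: leave unbalanced input untouched
  }
  return str
    .replace(/\[quote\]/g, "<div class='quote'>")
    .replace(/\[\/quote\]/g, "</div>");
}

var example = "[quote]Quote by: user1 [quote]Quote by: user2 ads[/quote]Test[/quote]Testing 2.";
var converted = convertQuotesIndependently(example);
```

Equal counts do not prove the tags are in a sensible order: an input such as `[/quote][quote]` passes the count check and still yields broken HTML.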

3 Answers


So... your method is close. The lazy pattern matches from the first [quote] to the first [/quote], and the global search then resumes after that match. Since there is no further open/close pair after the first close, the replace stops there. Here's how the search proceeds:

[quote]Quote by: user1 [quote]Quote by: user2 ads[/quote]Test[/quote]Testing 2.
^ Found open, look for close.....................^ Found! Look for open........

Since there's no open after the close, it stops there and executes the replace:

[quote]Quote by: user1 [quote]Quote by: user2 ads[/quote]

becomes:

<div class='quote'>Quote by: user1 [quote]Quote by: user2 ads</div>

and now the whole string reads:

<div class='quote'>Quote by: user1 [quote]Quote by: user2 ads</div>Test[/quote]Testing 2.

This mismatched pair of tags is what you are seeing, and it displays oddly. But if you run the same replacement again, the leftover inner [quote] pairs with the leftover outer [/quote]. So the opening div from the first pass is closed by the tag meant for the inner quote, and the opening div from the second pass is closed by the tag meant for the outer quote. That is odd, but it ensures that the HTML has one open tag for each close tag even if the input does not, which is a surprising yet desirable outcome. To continue the example:

<div class='quote'>Quote by: user1 [quote]Quote by: user2 ads</div>Test[/quote]Testing 2.
                                   ^ Found Open, look for close........^ Found! .........

And now replace:

[quote]Quote by: user2 ads</div>Test[/quote]

with

<div class='quote'>Quote by: user2 ads</div>Test</div>

to get the whole string:

<div class='quote'>Quote by: user1 <div class='quote'>Quote by: user2 ads</div>Test</div>Testing 2.

This is exactly what you wanted. It is an odd and slightly hacky method, but it produces safer HTML than some other methods you could use, because tags are only ever emitted in pairs.

I wrote a simple change to your jsfiddle that simply repeats the replacement until back-to-back replacements result in the same string: http://jsfiddle.net/gRaFW/6/
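
The fiddle's code is not reproduced here, so the following is only a minimal sketch of the idea described above: repeat the replacement until two consecutive passes return the same string. The helper name `replaceUntilStable` is hypothetical, and the inline style from the question is left out to match the shortened markup used in this answer.

```javascript
// Repeat a global regex replacement until the string stops changing.
function replaceUntilStable(str, pattern, replacement) {
  var previous;
  do {
    previous = str;
    str = str.replace(pattern, replacement);
  } while (str !== previous);
  return str;
}

var quotePattern = /\[quote\](.*?)\[\/quote\]/g;
var quoteHtml = "<div class='quote'>$1</div>";

var input = "[quote]Quote by: user1 [quote]Quote by: user2 ads[/quote]Test[/quote]Testing 2.";
var output = replaceUntilStable(input, quotePattern, quoteHtml);
// output is the fully nested string shown in the walkthrough above.
```

The loop ends because every successful pass removes at least one [/quote] from the string. Note that `.` does not match newlines, so a quote spanning several lines would not be matched by this pattern.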

Note that this method should work for nested tags as well as back-to-back tags. If the tags are mis-nested, the output will break, and only more complex logic or a library will help. Each replacement still produces two tags, one open and one close, but there is no guarantee that those tags match each other, e.g.:

[b][quote]Broke[/b][/quote]

So be careful.
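
One way to be careful is to check the nesting before converting. This is a rough sketch, not part of the fiddle: the function name `isWellNested` is invented here, and it only knows the three tags `b`, `i` and `quote` from the question.

```javascript
// Returns true when every [tag] is closed by a matching [/tag]
// in the correct order, and nothing is left open at the end.
function isWellNested(str) {
  var tagPattern = /\[(\/?)(b|i|quote)\]/g;
  var stack = []; // names of the tags that are currently open
  var match;
  while ((match = tagPattern.exec(str)) !== null) {
    var isClosing = match[1] === "/";
    var name = match[2];
    if (!isClosing) {
      stack.push(name);
    } else if (stack.pop() !== name) {
      return false; // closing tag does not match the last opened tag
    }
  }
  return stack.length === 0;
}

var brokenOk = isWellNested("[b][quote]Broke[/b][/quote]");
var nestedOk = isWellNested("[quote]a[quote]b[/quote]c[/quote]");
```

Here `brokenOk` is false and `nestedOk` is true, so the repeated replacement could be skipped, or the input rejected, whenever the check fails.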

John McDonald

It is not possible to properly parse BBCode using regular expressions alone for the same reason that it is not possible to properly parse HTML using regular expressions. Your BBCode parser function will never work.

C Snover
  • Do you have a recommended solution ? – Sir Jul 01 '13 at 03:59
  • It is not a trivial task, so yes, it requires more code than a couple of regular expressions. Parsing BBCode requires state information to be stored and error detection/correction. There is no getting around this. Read the Q&A about why you can’t parse HTML with regular expressions for a better understanding of why. – C Snover Jul 01 '13 at 04:19
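
To give a feel for what "state information" and "error correction" can mean here, below is a small sketch of a stack-based converter. It is an illustration under stated assumptions, not a complete BBCode parser: the names `TAG_HTML` and `bbcodeToHtml` are invented, only `b`, `i` and `quote` are handled, tag attributes are not supported, and the text is not HTML-escaped.

```javascript
// HTML output for each supported tag (assumed mapping, taken from the question).
var TAG_HTML = {
  b: { open: "<strong>", close: "</strong>" },
  i: { open: "<em>", close: "</em>" },
  quote: { open: "<div class='quote'>", close: "</div>" }
};

function bbcodeToHtml(str) {
  var tagPattern = /\[(\/?)(b|i|quote)\]/g;
  var stack = [];    // state: names of the tags that are currently open
  var out = "";
  var lastIndex = 0; // end of the previously processed tag
  var match;

  while ((match = tagPattern.exec(str)) !== null) {
    out += str.slice(lastIndex, match.index); // plain text before this tag
    lastIndex = tagPattern.lastIndex;
    var name = match[2];
    if (match[1] === "") {
      stack.push(name);
      out += TAG_HTML[name].open;
    } else if (stack.length > 0 && stack[stack.length - 1] === name) {
      stack.pop();
      out += TAG_HTML[name].close;
    } else {
      out += match[0]; // stray or mis-nested close: keep it as literal text
    }
  }

  out += str.slice(lastIndex); // trailing text after the last tag
  while (stack.length > 0) {
    out += TAG_HTML[stack.pop()].close; // close anything still open
  }
  return out;
}

var nestedHtml = bbcodeToHtml("[quote]Quote by: user1 [quote]Quote by: user2 ads[/quote]Test[/quote]Testing 2.");
var repairedHtml = bbcodeToHtml("[b][quote]Broke[/b][/quote]");
```

The regular expression is only used to find individual tags; the stack decides what each tag means in context. For the mis-nested input, the stray `[/b]` stays visible as text and the open `<strong>` is closed at the end, so the HTML remains balanced.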

I made an attempt a couple of years ago, though it fails at complex, nested tags due to the complications with trying to "parse" with regular expressions: https://github.com/kaimallea/bbcode

I also threw together a front-end for it here: http://jsfiddle.net/Kai/nJdXF/ - try pasting in something like [quote="Someone"]Hello, there![/quote] to test.

You can check the logic I used for quotes here: https://github.com/kaimallea/bbcode/blob/master/bbcode.js#L83-L94

With all that said, for the best results you should look at a real parser (e.g., if you use PHP, there's a BBCode extension for PHP here: http://php.net/manual/en/book.bbcode.php). My attempt works for simple stuff, but fails when nesting gets crazy.

Kai
  • Well, I could use PHP, but I was hoping to do it client side so the server has less to deal with. – Sir Jul 01 '13 at 04:22