0

I'm trying to get a regexp to match some nested tags. (Yes, I know I should use a parser, but my input will always be well-formed.)

Example:

Text.
More text.
[quote]
First quote
[quote]
Nested second quote.
[/quote]
[/quote]

Let's say I want the regexp to simply change the tags to <blockquote>:

Text.
More text.
<blockquote>
First quote
<blockquote>
Nested second quote.
</blockquote>
</blockquote>

How would I do this, matching both opening and closing tags at the same time?

Gumbo
soupagain

4 Answers

3

If you don’t need to verify that the tags are correctly nested, you can use a simple string replacement and replace each tag separately. Here’s an example using PHP’s str_replace to replace the opening and closing tags:

$str = str_replace('[quote]', '<blockquote>', $str);
$str = str_replace('[/quote]', '</blockquote>', $str);

Or with the help of a regular expression (PHP again):

$str = preg_replace('~\[(/?)quote]~', '<$1blockquote>', $str);

Here the matches of \[(/?)quote] are replaced with <$1blockquote> where $1 is replaced with the match of the first group of the pattern ((/?), either / or empty).
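Since the question is not tied to PHP, the same single-pass substitution can be sketched in Python with the standard re module. This is only an illustration: the function name quotes_to_blockquotes is hypothetical, and like the PHP version it does not check that the tags are balanced.

```python
import re

# Matches "[quote]" or "[/quote]"; group 1 captures the optional slash.
QUOTE_TAG = re.compile(r"\[(/?)quote\]")


def quotes_to_blockquotes(text: str) -> str:
    """Replace each [quote] or [/quote] tag with its <blockquote> counterpart."""
    # \g<1> re-inserts whatever group 1 captured: "/" or the empty string.
    return QUOTE_TAG.sub(r"<\g<1>blockquote>", text)


if __name__ == "__main__":
    sample = "[quote]\nFirst quote\n[quote]\nNested second quote.\n[/quote]\n[/quote]"
    print(quotes_to_blockquotes(sample))
```

Because each tag is rewritten independently, nesting depth does not matter here; the pattern never has to pair an opening tag with its closing tag.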

But you should really use a parser that keeps track of the opening and closing tags. Otherwise an opening or closing tag without a counterpart, or (if you’re using further tags) tags that are not nested properly, will pass through unnoticed.
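Short of a full parser, tracking the tags can be approximated with a depth counter when [quote] is the only tag in play. The Python sketch below assumes exactly that; the function name convert_checked and the error messages are hypothetical, and with several tag types a stack of open tag names would be needed instead of a single counter.

```python
import re

QUOTE_TAG = re.compile(r"\[(/?)quote\]")


def convert_checked(text: str) -> str:
    """Convert quote tags to blockquote tags, rejecting unbalanced input."""
    depth = 0
    for match in QUOTE_TAG.finditer(text):
        if match.group(1):  # closing tag
            depth -= 1
            if depth < 0:
                raise ValueError(f"unmatched [/quote] at offset {match.start()}")
        else:  # opening tag
            depth += 1
    if depth != 0:
        raise ValueError(f"{depth} [quote] tag(s) never closed")
    # Tags are balanced, so the plain substitution is safe to apply.
    return QUOTE_TAG.sub(r"<\g<1>blockquote>", text)
```

Calling convert_checked on the example from the question returns the converted text, while input such as "[quote]text" raises ValueError instead of producing unbalanced HTML.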

Gumbo
  • @KennyTM: Ah, thanks for the remark. I don’t know how I assumed that he wants to use PHP. – Gumbo Mar 03 '10 at 18:50
2

You can't match (arbitrarily) nested stuff with regular expressions.

But you can replace every instance of [quote] with <blockquote> and [/quote] with </blockquote>.
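To see why pairing fails, here is a small Python sketch (the sample string and variable names are made up for illustration). A pattern that tries to match one opening tag together with its closing tag pairs the outer [quote] with the inner [/quote]:

```python
import re

nested = "[quote]outer [quote]inner[/quote] tail[/quote]"

# Naive attempt to match a whole quote block, opening and closing tag together.
PAIR = re.compile(r"\[quote\](.*?)\[/quote\]", re.DOTALL)

first = PAIR.search(nested)
# The non-greedy body stops at the first closing tag, which belongs to the
# inner quote, so the outer opening tag is paired with the wrong close.
print(first.group(1))  # outer [quote]inner
```

Making the body greedy does not help in general: it would pair the first opening tag with the last closing tag in the whole text, which is wrong as soon as two quote blocks sit side by side.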

kennytm
  • Caveat: You can match nested stuff to a predetermined depth: http://blog.stevenlevithan.com/archives/regex-recursion – ghoppe Mar 03 '10 at 18:42
  • "You can't match (arbitrarily) nested stuff with regular expressions." That's the answer I was looking for :) So I used a BBCode parser: http://nbbc.sourceforge.net/ – soupagain Mar 10 '10 at 11:13
1

It's a lousy idea, but you're apparently trying to match something like: \[\(/?\)quote\] and replace it with: <\1blockquote>

Jerry Coffin
1

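You could use two expressions, one per tag. The g flag makes each one replace every occurrence rather than only the first, and < and > need no escaping in the replacement:

s/\[quote\]/<blockquote>/g
s/\[\/quote\]/<\/blockquote>/g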
Micah