preg_replace("/\[b\](.*)\[\/b\]/Usi", "<strong>$1</strong>", "Some text here... [b][b]Hello, [b]PHP![/b][/b][/b] ... [b]and here[/b]");

returns

Some text here... <strong>[b]Hello, [b]PHP!</strong>[/b][/b] ... <strong>and here</strong>

But I need to replace all [b]...[/b] tags. Why doesn't this happen in my case?

klis

3 Answers


Yes, a multi-pass approach is required if the elements are nested. This can be accomplished in one of two ways: matching from the inside out or from the outside in. Here are two tested scripts with fully commented regexes which illustrate each technique:

1. Replace from the inside out:

<?php // test.php Rev:20121016_0900
$re = '% # Match innermost [b]...[/b] structure.
    \[b\]              # Literal start tag.
    (                  # $1: Element contents.
      # Use Friedls "Unrolling-the-Loop" technique:
      #   Begin: {normal* (special normal*)*} construct.
      [^[]*            # {normal*} Zero or more non-"[".
      (?:              # Begin {(special normal*)*}.
        \[             # {special} Tag open literal char,
        (?!/?b\])      # but only if NOT [b] or [/b].
        [^[]*          # More {normal*}.
      )*               # Finish {(special normal*)*}.
    )                  # $1: Element contents.
    \[/b\]             # Literal end tag.
    %x';
printf("Replace matching tags from the inside out:\n");
$text = file_get_contents('testdata.txt');
$i=0; // Keep track of iteration number.
printf("i[%d]=%s", $i++, $text);
while(preg_match($re, $text)){
    $text = preg_replace($re, '<strong>$1</strong>', $text);
    printf("i[%d]=%s", $i++, $text);
}
?>

Output:

'''
Replace matching tags from the inside out:
i[0]=Some text here... [b][b]Hello, [b]PHP![/b][/b][/b] ... [b]and here[/b]
i[1]=Some text here... [b][b]Hello, <strong>PHP!</strong>[/b][/b] ... <strong>and here</strong>
i[2]=Some text here... [b]<strong>Hello, <strong>PHP!</strong></strong>[/b] ... <strong>and here</strong>
i[3]=Some text here... <strong><strong>Hello, <strong>PHP!</strong></strong></strong> ... <strong>and here</strong>
'''

2. Replace from the outside in:

<?php // test.php Rev:20121016_0901
$re = '% # Match outermost [b]...[/b] structure.
    \[b\]              # Literal start tag.
    (                  # $1: Element contents.
      (?:              # Zero or more contents alternatives.
        [^[]*          # Either non-[b]...[/b] stuff...
        (?:            # Begin {(special normal*)*}.
          \[           # {special} Tag open literal char,
          (?!/?b\])    # but only if NOT [b] or [/b].
          [^[]*        # More {normal*}.
        )*             # Finish {(special normal*)*}.
      | (?R)           # Or a nested [b]...[/b] structure.
      )*               # Zero or more contents alternatives.
    )                  # $1: Element contents.
    \[/b\]             # Literal end tag.
    %x';
printf("Replace matching tags from the outside in:\n");
$text = file_get_contents('testdata.txt');
$i=0; // Keep track of iteration number.
printf("i[%d]=%s", $i++, $text);
while(preg_match($re, $text)){
    $text = preg_replace($re, '<strong>$1</strong>', $text);
    printf("i[%d]=%s", $i++, $text);
}
?>

Output:

'''
Replace matching tags from the outside in:
i[0]=Some text here... [b][b]Hello, [b]PHP![/b][/b][/b] ... [b]and here[/b]
i[1]=Some text here... <strong>[b]Hello, [b]PHP![/b][/b]</strong> ... <strong>and here</strong>
i[2]=Some text here... <strong><strong>Hello, [b]PHP![/b]</strong></strong> ... <strong>and here</strong>
i[3]=Some text here... <strong><strong>Hello, <strong>PHP!</strong></strong></strong> ... <strong>and here</strong>
'''

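Note the (?R) recursive expression used in the second approach: it re-applies the whole pattern at that point, which is what lets the regex match a nested [b]...[/b] structure inside the outer one.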
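
As a possible variation on the outside-in idea, the recursive pattern can be combined with preg_replace_callback so that the loop disappears from the calling code. This is only a sketch: the function name bbBoldToStrong is hypothetical, the pattern is a condensed form of the commented regex above with possessive quantifiers added, and the `i` flag is an assumption carried over from the question.

```php
<?php
// Hypothetical helper (not part of the answer above): converts balanced
// [b]...[/b] pairs to <strong> by recursing into each outermost match.
function bbBoldToStrong(string $text): string
{
    // Outermost [b]...[/b]: contents are runs of non-"[" characters,
    // a "[" that does not start [b] or [/b], or a nested pair via (?R).
    $re = '%\[b\]((?:[^\[]++|\[(?!/?b\])|(?R))*+)\[/b\]%i';

    $result = preg_replace_callback(
        $re,
        function (array $m): string {
            // $m[1] is the element contents; convert the pairs nested inside it.
            return '<strong>' . bbBoldToStrong($m[1]) . '</strong>';
        },
        $text
    );

    // preg_replace_callback returns null on a PCRE error; fall back to the input.
    return $result ?? $text;
}

$text = "Some text here... [b][b]Hello, [b]PHP![/b][/b][/b] ... [b]and here[/b]";
echo bbBoldToStrong($text), "\n";
```

Unpaired tags are left as they are, because the pattern only matches when every opening tag inside it has its closing tag.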

ridgerunner

The reason it doesn't work: the pattern matches the first [b], lazily runs to the nearest [/b], and leaves everything in between unchanged. I.e., the first opening tag is paired with the first closing tag, and the tags nested inside that match are never revisited.
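
To see this concretely, here is a small sketch that lists what the pattern from the question actually matches. It only reuses the question's pattern and input; the variable names are illustrative.

```php
<?php
// Illustrative only: inspect what the question's pattern matches.
$input = "Some text here... [b][b]Hello, [b]PHP![/b][/b][/b] ... [b]and here[/b]";

// Same pattern as in the question
// (U = ungreedy, s = dot matches newline, i = case-insensitive).
preg_match_all('/\[b\](.*)\[\/b\]/Usi', $input, $matches);

// $matches[0] holds the full matches, $matches[1] the captured contents.
foreach ($matches[0] as $n => $full) {
    printf("match %d: %s\n", $n, $full);
}
// match 0: [b][b]Hello, [b]PHP![/b]
// match 1: [b]and here[/b]
```

The two inner opening tags end up inside the first match's captured text, so a single preg_replace call copies them into the output untouched.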

Your comment to @meza suggests you want to replace the pseudo tags in pairs, or else leave them untouched. The best way to do this is to use multiple passes, like this:

$markup = "Some text here... [b][b]Hello, [b]PHP![/b][/b][/b] ... [b]and here[/b]";
$count = 0;
do {
    $markup = preg_replace("/\[b\](.*?)\[\/b\]/usi", "<strong>$1</strong>", $markup, -1, $count );
} while ( $count > 0 );

print $markup;

I'm not even sure if you can do it in a one-line regex, but even if you could, it would be rather complex and therefore hard to maintain.

hashchange

Why use regex for this particular case? You could get away with a simple string replacement: change every [b] to <strong> and every [/b] to </strong>.

meza
    It's just an example. But even in this case, if I do as you suggest and the user forgets an opening or closing tag, the whole markup would be broken. If an opening or closing tag is missing, no replacement should happen. – klis Oct 16 '12 at 10:18
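
The trade-off raised in this answer and its comment can be illustrated with a short sketch. The input string is an assumed example with one stray opening tag, not taken from the thread, and the variable names are illustrative.

```php
<?php
// Assumed example input: one complete pair plus one unclosed [b].
$input = "[b]bold[/b] and [b]unclosed";

// Plain string replacement converts every tag, paired or not
// (str_replace is case-sensitive, so only lowercase tags are handled here).
$naive = str_replace(['[b]', '[/b]'], ['<strong>', '</strong>'], $input);

// Pair-wise regex replacement only touches complete [b]...[/b] pairs.
$paired = preg_replace('/\[b\](.*?)\[\/b\]/si', '<strong>$1</strong>', $input);

echo $naive, "\n";   // <strong>bold</strong> and <strong>unclosed
echo $paired, "\n";  // <strong>bold</strong> and [b]unclosed
```

With the string replacement the unclosed tag leaks an unterminated <strong> into the output, which is the breakage the comment describes; the pair-wise version leaves the stray [b] as plain text.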