Lately I've been studying regex (mostly by practicing, to tell the truth), and I'm starting to see how powerful it is. Through a question I asked earlier (link), I learned about backreferences. I think I understand how they work: my pattern works in JavaScript, but not in PHP.

For example I have this string:

[b]Text B[/b]
[i]Text I[/i]
[u]Text U[/u]
[s]Text S[/s]

And I use the following regex:

\[(b|i|u|s)\]\s*(.*?)\s*\[\/\1\]

When I test it on regex101.com it works, and it also works in JavaScript, but it does not work in PHP.

Example of preg_replace (not working):

echo preg_replace(
    "/\[(b|i|u|s)\]\s*(.*?)\s*\[\/\1\]/i", 
    "<$1>$2</$1>",
    "[b]Text[/b]"
);

Whereas this version works:

echo preg_replace(
    "/\[(b|i|u|s)\]\s*(.*?)\s*\[\/(b|i|u|s)\]/i", 
    "<$1>$2</$1>",
    "[b]Text[/b]"
);

I can't figure out what I'm doing wrong. Thanks to everyone who helps.

mikelplhts

1 Answer

It is because you use a double-quoted string. Inside a double-quoted PHP string, \1 is read as the octal notation of a character (the control character SOH = start of heading), not as a backslash followed by the digit 1, so the regex engine never sees a backreference.

So there are two ways to fix it:

Use a single-quoted string:

'/\[(b|i|u|s)\]\s*(.*?)\s*\[\/\1\]/i'

Or escape the backslash so that the string contains a literal backslash (the escaping is for the PHP string, not for the pattern):

"/\[(b|i|u|s)\]\s*(.*?)\s*\[\/\\1\]/i"
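To make the difference concrete, here is a small sketch comparing the three ways of writing the pattern. The variable names `$double`, `$single`, and `$escaped` are only illustrative, and the replacement string is written in single quotes for consistency.

```php
<?php
// The same regex source written with the three quoting styles.
$double  = "/\[(b|i|u|s)\]\s*(.*?)\s*\[\/\1\]/i";  // "\1" is turned into chr(1) by PHP
$single  = '/\[(b|i|u|s)\]\s*(.*?)\s*\[\/\1\]/i';  // \1 reaches the regex engine intact
$escaped = "/\[(b|i|u|s)\]\s*(.*?)\s*\[\/\\1\]/i"; // "\\1" becomes \1 in the string

var_dump("\1" === chr(1));       // bool(true): octal escape, not a backreference
var_dump($single === $escaped);  // bool(true): both strings hold the same pattern
var_dump($double === $single);   // bool(false)

// The single-quoted pattern matches and rewrites the tag.
echo preg_replace($single, '<$1>$2</$1>', '[b]Text[/b]'), "\n"; // <b>Text</b>

// The double-quoted pattern looks for a literal SOH character after "[/",
// finds no match, and returns the subject unchanged.
echo preg_replace($double, '<$1>$2</$1>', '[b]Text[/b]'), "\n"; // [b]Text[/b]
```

Other escapes in the pattern such as `\[` or `\s` survive double quotes because PHP does not recognize them as string escape sequences; only the `\1` is affected here.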

As an aside, you can write your pattern like this:

$pattern = '~\[([bius])]\s*(.*?)\s*\[/\1]~i';

// with the \g{n} notation (Perl 5.10 / PCRE syntax)
$pattern = '~\[([bius])]\s*(.*?)\s*\[/\g{1}]~i';

// same \g{} notation, but relative:
// (the second group to the left of the current position)
$pattern = '~\[([bius])]\s*(.*?)\s*\[/\g{-2}]~i';
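As a quick check, the three variants can be run against the four-line sample from the question. This is only a sketch: the `$input` and `$patterns` names and the array labels are made up for the example.

```php
<?php
// Sample input taken from the question.
$input = "[b]Text B[/b]\n[i]Text I[/i]\n[u]Text U[/u]\n[s]Text S[/s]";

// The three equivalent ways of referring back to group 1.
$patterns = [
    'numbered \1'     => '~\[([bius])]\s*(.*?)\s*\[/\1]~i',
    'absolute \g{1}'  => '~\[([bius])]\s*(.*?)\s*\[/\g{1}]~i',
    'relative \g{-2}' => '~\[([bius])]\s*(.*?)\s*\[/\g{-2}]~i',
];

foreach ($patterns as $label => $pattern) {
    // Each variant prints the same four lines:
    // <b>Text B</b>, <i>Text I</i>, <u>Text U</u>, <s>Text S</s>
    echo $label, ":\n", preg_replace($pattern, '<$1>$2</$1>', $input), "\n\n";
}

// A mismatched closing tag is left alone, since the backreference
// must repeat what group 1 captured.
echo preg_replace($patterns['numbered \1'], '<$1>$2</$1>', '[b]Text[/i]'), "\n"; // [b]Text[/i]
```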
Casimir et Hippolyte
  • Thank you for the answer, I did not think I had made a mistake like that. However, I tried to use the pattern with ~ (which I did not know about, to tell the truth), and it does not seem to work. Or does it just not work everywhere, or am I wrong? [link](https://regex101.com/r/gV7xR5/2) – mikelplhts May 21 '15 at 20:30
  • Also, could you briefly explain (if it is not a bother) the difference between `/` and `~`? – mikelplhts May 21 '15 at 20:33
  • @MicheleLapolla: you are free to choose whichever pattern delimiter you want (http://php.net/manual/en/regexp.reference.delimiters.php). `~` is a better choice here because it avoids having to escape the literal slashes in the pattern. The three patterns work well, you can check them here: https://eval.in/368317 – Casimir et Hippolyte May 21 '15 at 20:41
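The delimiter point from the last comment can be illustrated with a minimal sketch. The variable names `$withSlash` and `$withTilde` are hypothetical; both strings describe the same pattern and differ only in the delimiter.

```php
<?php
// With "/" as delimiter, the literal slash in "[/b]" must be escaped as \/.
$withSlash = '/\[([bius])]\s*(.*?)\s*\[\/\1]/i';

// With "~" as delimiter, the slash is an ordinary character.
$withTilde = '~\[([bius])]\s*(.*?)\s*\[/\1]~i';

foreach ([$withSlash, $withTilde] as $pattern) {
    echo preg_replace($pattern, '<$1>$2</$1>', '[u]Text U[/u]'), "\n"; // <u>Text U</u>
}
```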