Consider the following:

$string = "Hello, there! Welcome to <>?@#$!& our site<!END!>";

I'm trying to remove everything except letters, numbers, spaces, and the "special tag" <!END!>.

Using preg_replace, I can write this:

$string = preg_replace("/[^A-Za-z0-9 ]/", "", $string);

This removes everything except letters (both uppercase and lowercase), numbers, and spaces. If I also wanted to keep the <!END!> tag, in theory I could write this:

$string = preg_replace("/[^A-Za-z0-9 <!END!>]/", "", $string);

However, this does not preserve the tag <!END!> specifically; it preserves each of the characters the tag contains. So every <, >, and ! in $string is kept.

The result:

"Hello there! Welcome to <>! our site<!END!>"

But I'm trying to get:

"Hello there Welcome to  our site<!END!>"

Based on my research, it should be possible to make preg_replace ignore a specific word by using \b word boundaries; however, "/[^A-Za-z0-9 \b<!END!>\b]/" gave me the same result as above.

Am I doing something wrong?

Live demo: http://sandbox.onlinephpfunctions.com/code/219dc36ab8aa7dfa16e8e623f5f4ba7f4b4b930d

1 Answer

You could use a (*SKIP)(*F) solution:

<!END!>(*SKIP)(*FAIL)|[^A-Za-z0-9 ]

That would match:

  • <!END!>(*SKIP)(*FAIL) matches <!END!>, then forces the match to fail and skips past the tag, so it is never removed
  • | or
  • [^A-Za-z0-9 ] matches any single character that is not a letter, a digit, or a space

For example:

$string = "Hello, there! Welcome to <>?@#$!& our site<!END!>";
$string = preg_replace("/<!END!>(*SKIP)(*FAIL)|[^A-Za-z0-9 ]/", "", $string);
echo $string;

That will result in:

Hello there Welcome to  our site<!END!>
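
If you need to protect more than one tag, the same idea can be wrapped in a small function. This is only a sketch: the name stripExceptTags and its parameters are made up for illustration, and it assumes PHP 7.4 or later for the arrow function. Each tag is passed through preg_quote so that characters such as < and ! are matched literally.

```php
<?php
// Hypothetical helper (not part of the original answer): removes every
// character that is not a letter, digit, or space, but leaves the given
// tags untouched.
function stripExceptTags(string $input, array $tags): string
{
    // With no tags to protect, the plain character class is enough.
    if ($tags === []) {
        return preg_replace('/[^A-Za-z0-9 ]/', '', $input);
    }

    // Escape each tag so it is matched literally; the second argument
    // also escapes the "/" delimiter.
    $quoted = array_map(fn($tag) => preg_quote($tag, '/'), $tags);

    // Any protected tag is matched first and then discarded by
    // (*SKIP)(*FAIL); everything else falls through to the character class.
    $pattern = '/(?:' . implode('|', $quoted) . ')(*SKIP)(*FAIL)|[^A-Za-z0-9 ]/';

    return preg_replace($pattern, '', $input);
}

// Single quotes avoid any variable interpolation in the sample string.
$string = 'Hello, there! Welcome to <>?@#$!& our site<!END!>';
echo stripExceptTags($string, ['<!END!>']);
```

The double space between "to" and "our" remains, because both surrounding spaces are kept while the characters between them are removed.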

The fourth bird