I've got a strange issue where the same preg_replace call gives different results on different servers. When the following code is executed on my local wampserver:

echo preg_replace('/[\W]+/u', '-', "blāh bl*h");

The following is output:

"blāh-bl-h"

When executed on my remote server, the following is output:

"bl-h-bl-h"

The "ā" is also replaced.

The PHP version on the local wampserver is 5.3.13; on the remote server it is 5.3.3-7+squeeze15. Is this caused by the PHP version?
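
Since PHP hands regular expressions to the PCRE library, the PHP version number alone may not explain the difference. A minimal diagnostic sketch that could be run on both machines is shown here; `pcre_report()` is a hypothetical helper name, while `PHP_VERSION` and `PCRE_VERSION` are built-in constants.

```php
<?php
// Diagnostic sketch: report the PHP and PCRE versions next to the
// replacement result. pcre_report() is a hypothetical helper name.
function pcre_report($input)
{
    return array(
        'php'    => PHP_VERSION,   // version of the PHP build
        'pcre'   => PCRE_VERSION,  // PCRE library version and release date
        // This result depends on how the PCRE build treats \W under /u
        'result' => preg_replace('/[\W]+/u', '-', $input),
    );
}

foreach (pcre_report("blāh bl*h") as $label => $value) {
    echo $label, ': ', $value, "\n";
}
```

Comparing the three lines printed on each server shows whether the PCRE versions differ as well as the PHP versions.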

Mike Mike
  • Check out this thread http://stackoverflow.com/questions/6407983/utf-8-in-php-regular-expressions – sofl Mar 08 '13 at 16:38
  • @sofl He is already using `\W` and not `[A-Z]`. Mike, [looking here](http://www.php.net/manual/en/pcre.installation.php), I'm not sure whether it is a different version of PCRE or not. Maybe check `phpinfo()` to determine that, then check whether your issue shows up in the [change log](http://www.pcre.org/changelog.txt). I see different behavior from 5.2 to 5.3, for example: http://codepad.viper-7.com/Fk69OY – ficuscr Mar 08 '13 at 16:43
  • Yes, that's right, but I thought he could try something like `/[^\p{L}]+/u` – sofl Mar 08 '13 at 17:12
  • OK, yes, they are using different versions of PCRE: 8.02 vs 8.12. I'm not sure which entry in the change log addresses the issue, so I guess I will work around it. I found that both `/[^\p{L}]+/u` and `mb_ereg_replace('/[\W]+/u', '-', "blāh bl*h")` work. My only remaining question is why it takes Debian so long to update their PHP packages. What are they waiting for? – Mike Mike Mar 09 '13 at 00:29
  • Also make sure that your input string really is UTF-8 encoded text. There are countless ways UTF-8 text can get garbled by various processes (loading/saving files in a text editor, uploading via FTP, etc.) – Tyron May 11 '16 at 11:06
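
The workaround from the comments can be sketched as follows. It combines the `/[^\p{L}]+/u` pattern with a UTF-8 validity check on the input. The function name `replace_non_letters()` is hypothetical. One possible explanation for the difference, offered as an assumption that the thread does not confirm, is that PCRE releases between 8.02 and 8.12 added a Unicode-properties mode that makes `\W` Unicode-aware. Spelling the class out with `\p{L}` avoids depending on that behavior.

```php
<?php
// Sketch of the workaround; replace_non_letters() is a hypothetical name.
// Note: \p{L} matches letters only, so digits and underscores are also
// replaced here, unlike with \W.
function replace_non_letters($text, $replacement = '-')
{
    // An empty pattern with /u fails to match on invalid UTF-8, which
    // gives a cheap validity check without extra extensions.
    if (preg_match('//u', $text) !== 1) {
        return null; // input is not valid UTF-8
    }
    // Replace every run of non-letter characters with the replacement.
    return preg_replace('/[^\p{L}]+/u', $replacement, $text);
}

echo replace_non_letters("blāh bl*h"), "\n"; // prints "blāh-bl-h"
```

If digits and underscores should survive as they do with `\W`, a class such as `[^\p{L}\p{N}_]` would be one way to extend the pattern.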

0 Answers