3

I want to remove the ZERO WIDTH NON-JOINER character (U+200C) from a string, but str_replace didn't work for me.

miken32
Ehsan

2 Answers

7

str_replace should solve this, as long as you're careful about exactly which bytes you're replacing.

// \xE2\x80\x8C is ZERO WIDTH NON-JOINER
$foo = "foo\xE2\x80\x8Cbar";

print($foo . " - " . strlen($foo) . "\n");
$foo = str_replace("\xE2\x80\x8C", "", $foo);
print($foo . " - " . strlen($foo) . "\n");

Outputs as expected:

foo‌bar - 9
foobar - 6
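The drop from 9 to 6 follows from UTF-8: U+200C is encoded as three bytes, and strlen counts bytes rather than characters. As a rough sketch (assuming PHP 7.0 or later for the type declarations; `stripZwnj` is a hypothetical helper name, not something from the answer above), the same replacement can be wrapped and checked like this:

```php
<?php
// Hypothetical helper (not part of the original answer): removes every
// ZERO WIDTH NON-JOINER (U+200C) from a UTF-8 encoded string.
function stripZwnj(string $s): string
{
    // "\xE2\x80\x8C" is the three-byte UTF-8 encoding of U+200C.
    return str_replace("\xE2\x80\x8C", "", $s);
}

$zwnj = "\xE2\x80\x8C";
echo bin2hex($zwnj), " - ", strlen($zwnj), "\n"; // e2808c - 3

$foo = "foo" . $zwnj . "bar";
echo strlen($foo), "\n";            // 9
echo strlen(stripZwnj($foo)), "\n"; // 6
```

The escape must be in a double-quoted string; in single quotes PHP would keep the backslashes literally and str_replace would find nothing to remove.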
MatsLindh
  • Do you know what that kind of code is called? (I mean `\xE2\x80\x8C`.) – Mohammad Kermani Mar 09 '16 at 07:23
  • 1
    @Kermani Usually just [escape sequences](http://php.net/manual/en/language.types.string.php#language.types.string.syntax.double) or escape codes; they differ between languages, but most implement a common subset (such as \x, \n, \r, etc.). – MatsLindh Mar 09 '16 at 10:16
0

str_replace will do what you want, but PHP's native Unicode support is limited. The following does what you ask. json_decode is used here to produce the Unicode character, since PHP before 7.0 has no \u escape syntax in string literals.

<?php
$unicodeChar = json_decode('"\u200c"');
$string = 'blah'.$unicodeChar.'blah';
echo str_replace($unicodeChar, '', $string);
?>

edit: While my method works, I would suggest you use fiskfisk's solution. It is less hacky than using json_decode.

Jordan Mack
  • 1
    For anyone coming across this, PHP added support for Unicode literals in PHP 7.0 (2015) using the syntax `"\u{200C}"` – IMSoP Apr 04 '23 at 19:05
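Following that comment, a minimal sketch for PHP 7.0 or later, reusing the 'blah' test string from the answer above. The variable names are illustrative only:

```php
<?php
// Requires PHP 7.0+: inside a double-quoted string, "\u{200C}" expands to
// the UTF-8 encoding of the code point U+200C (ZERO WIDTH NON-JOINER).
$zwnj = "\u{200C}";

$string = 'blah' . $zwnj . 'blah';
$clean  = str_replace($zwnj, '', $string);

echo $clean, "\n";                    // blahblah
var_dump($zwnj === "\xE2\x80\x8C");   // bool(true): same bytes as the \x form
```

This avoids the json_decode detour and produces exactly the same bytes as the `\xE2\x80\x8C` escape used in the first answer.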