`$str = str_replace('–', '-', $str);` does not work: the longer Unicode dash character is not replaced with a minus character, as I want.

$str is read from a database and should be in UTF-8.

PHP code is run from an Apache server.

I need to replace all of these long dashes with a minus character.


$dash = "–";
echo "string: " . bin2hex($str) . ", dash: " . bin2hex($dash) . "\n";
echo "string: " . $str . ", dash: " . $dash . "\n";

string: 5a656c626f726166202623383231313b20d0bdd0bed0b2d18bd0b920d0bfd180d0b5d0bfd0b0d180d0b0d18220d0b4d0bbd18f20d0bbd0b5d187d0b5d0bdd0b8d18f20d0bcd0b5d0bbd0b0d0bdd0bed0bcd18b, dash: e28093
string: Zelboraf – новый препарат для лечения меланомы, dash: –

Which one is wrong (not proper UTF-8): the string or the dash?

lorem monkey
porton
  • `$str is read from a database and should be in UTF-8`: should be, or is it? The string and the file containing the script should both be UTF-8. Then your snippet should work. – lorem monkey Sep 27 '12 at 12:13
  • $str is in UTF-8, the script is in UTF-8. But it does not work :-( – porton Sep 27 '12 at 12:16
  • Please add [a hex-dump](http://stackoverflow.com/questions/1057572/how-can-i-get-a-hex-dump-of-a-string-in-php) of `$str` and `'–'` to your question. One can validate that you get UTF-8 from the DB, and the other that you get it from your file. – hakre Sep 27 '12 at 12:17
  • I added hex dumps. I see there is no "dash" substring in the hex dump of the string, but I don't understand what is wrong: $str or $dash. – porton Sep 27 '12 at 12:40

3 Answers

It was an HTML-entity-encoded dash (`&#8211;`, visible in the hex dump as `2623383231313b`) :-) That was my mistake.
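A minimal sketch of one way to handle this case, assuming the stored text may contain HTML entities such as `&#8211;` and that the goal is a plain ASCII minus. The function name `normalize_dashes` is hypothetical and used only for illustration; `html_entity_decode` and `str_replace` are standard PHP functions.

```php
<?php
// Hypothetical helper: decode HTML entities first, then replace real dashes.
function normalize_dashes($str)
{
    // Turn entities such as "&#8211;" or "&ndash;" into actual UTF-8 characters.
    $decoded = html_entity_decode($str, ENT_QUOTES, 'UTF-8');

    // Replace the en dash (U+2013) and em dash (U+2014), given as UTF-8 bytes,
    // with the ASCII hyphen-minus.
    return str_replace(array("\xE2\x80\x93", "\xE2\x80\x94"), '-', $decoded);
}

// Sample input modeled on the hex dump in the question.
echo normalize_dashes('Zelboraf &#8211; новый препарат'), "\n";
```

Note that decoding also converts every other entity in the string (for example `&amp;`), which may or may not be wanted if the text is later written back into HTML.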

porton
<?php

$str = 'Test–asd';

$old = '–';
$new = '!';

$str = str_replace ( $old, $new, $str );

echo $str;

?>

This works just fine for me:

Output:

Test!asd

It seems your problem is a mismatch between encodings, not the replacement of UTF-8 characters itself.

Peon

EDIT:

try this:

$str = str_replace(array("\xe2\x80\x93", "\xe2\x80\x94"), '-', $str);   // en dash and em dash as UTF-8 bytes; \x escapes need double quotes

Try:

$str = str_replace(chr(150), '-', $str);    // en dash, only if $str is in Windows-1252

or

$str = str_replace(chr(151), '-', $str);   // em dash, only if $str is in Windows-1252

I think that the second one suits you more.
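One detail worth checking when using byte escapes: PHP interprets `\x` sequences only inside double-quoted strings. The short sketch below (variable names are illustrative) shows the difference, using the en dash bytes `e28093` from the question's hex dump.

```php
<?php
// Single quotes: the backslashes stay literal, so this is 12 ordinary characters.
$single = '\xe2\x80\x93';

// Double quotes: the escapes become 3 raw bytes, the UTF-8 encoding of U+2013.
$double = "\xe2\x80\x93";

echo strlen($single), "\n";   // 12
echo strlen($double), "\n";   // 3
echo bin2hex($double), "\n";  // e28093
```

A search string built with single quotes would therefore never match a real dash in `$str`.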

Develoger