
I've got the following code from php.net:

$GLOBALS['normalizeChars'] = array(
'Š'=>'S', 'š'=>'s', 'Ð'=>'Dj','Ž'=>'Z', 'ž'=>'z', 'À'=>'A', 'Á'=>'A', 'Â'=>'A', 'Ã'=>'A', 'Ä'=>'A',
'Å'=>'A', 'Æ'=>'A', 'Ç'=>'C', 'È'=>'E', 'É'=>'E', 'Ê'=>'E', 'Ë'=>'E', 'Ì'=>'I', 'Í'=>'I', 'Î'=>'I',
'Ï'=>'I', 'Ñ'=>'N', 'Ò'=>'O', 'Ó'=>'O', 'Ô'=>'O', 'Õ'=>'O', 'Ö'=>'O', 'Ø'=>'O', 'Ù'=>'U', 'Ú'=>'U',
'Û'=>'U', 'Ü'=>'U', 'Ý'=>'Y', 'Þ'=>'B', 'ß'=>'Ss','à'=>'a', 'á'=>'a', 'â'=>'a', 'ã'=>'a', 'ä'=>'a',
'å'=>'a', 'æ'=>'a', 'ç'=>'c', 'è'=>'e', 'é'=>'e', 'ê'=>'e', 'ë'=>'e', 'ì'=>'i', 'í'=>'i', 'î'=>'i',
'ï'=>'i', 'ð'=>'o', 'ñ'=>'n', 'ò'=>'o', 'ó'=>'o', 'ô'=>'o', 'õ'=>'o', 'ö'=>'o', 'ø'=>'o', 'ù'=>'u',
'ú'=>'u', 'û'=>'u', 'ü'=>'u', 'ý'=>'y', 'þ'=>'b', 'ÿ'=>'y', 'ƒ'=>'f');
$string = strtr($string, $GLOBALS['normalizeChars']);

The string "åäö" should certainly give me "aao", but instead I get "aaa". This is getting really frustrating as I've struggled with it for hours - I mean, it feels like there's not much that can be wrong.

I've tried both `setlocale(LC_CTYPE, 'en_US.utf8')` and `setlocale(LC_CTYPE, 'sv_SE.utf8')`, and I even removed the code and used `str_replace('ö', 'o', $string)` instead, but nothing works.

What can possibly be wrong?

Ivar
  • The code you've pasted works for me. Try it yourself with a PHP script containing only the normalization table and the test string. Locale does not affect `strtr()`. – jmz Sep 05 '10 at 17:18
  • Not an answer to your question, but have you tried the `Normalizer` class? http://stackoverflow.com/questions/1890854/how-to-replace-special-characters-with-the-ones-theyre-based-on-in-php (not sure whether it can do `Ð` to `Dj` though) – Pekka Sep 05 '10 at 17:18
  • Your code works fine for me too. – shamittomar Sep 05 '10 at 17:26
  • Works here: http://codepad.org/OzjLqlm5. Does it also fail for you on a different machine? – Tomalak Sep 05 '10 at 17:55
  • Holy hell, that's a mess. Ever tried using UTF-8 escape codes instead? – Talvi Watia Sep 07 '10 at 11:01
  • @Talvi: Indeed it is, but I'm sure that it works. Please explain your method. :) – Ivar Sep 27 '10 at 15:29
  • @ivarska Try `\u246=>'o'` instead of `'ö'=>'o'`. – Talvi Watia Sep 28 '10 at 07:01
  • @Talvi: What difference would that make, actually? It would still be the same mess, if not worse. Or am I wrong? – Ivar Sep 28 '10 at 18:12
  • @ivarska If you do that, you don't have to worry about the file itself staying in UTF-8 encoding, since you are only using ASCII characters. I try to avoid UTF-8 characters in my code because it is one less thing to worry about when debugging. – Talvi Watia Sep 29 '10 at 21:55
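
The escape-code idea from the last few comments can be sketched roughly as follows. This is an illustration, not the asker's fix. It assumes PHP 7.0 or later, where double-quoted strings accept `\u{...}` code point escapes (the bare `\u246` form quoted above is not something PHP parses), and the three-entry table is a shortened, hypothetical stand-in for the full one.

```php
<?php
// Sketch only: requires PHP 7.0+ for the "\u{...}" escape syntax.
// Every key is spelled with ASCII characters, so the table survives
// no matter which encoding the editor uses to save this file.
$normalizeChars = [
    "\u{00E5}" => 'a', // U+00E5, a with ring above
    "\u{00E4}" => 'a', // U+00E4, a with diaeresis
    "\u{00F6}" => 'o', // U+00F6, o with diaeresis
];

// The test string from the question, also written with escapes.
$string = "\u{00E5}\u{00E4}\u{00F6}";

// PHP expands each escape to UTF-8 bytes, so strtr() matches whole characters.
$result = strtr($string, $normalizeChars);
echo $result, "\n"; // prints "aao"
```

The input still has to arrive as UTF-8 for these keys to match; the escapes only take the source file's own encoding out of the picture.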

1 Answer


Okay, problem solved - it turned out that the file wasn't saved as UTF-8. Works like a charm now, so this was pretty embarrassing.
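
A minimal sketch of why the file encoding matters, assuming the file had been saved in a single-byte encoding such as ISO-8859-1 (plain ASCII cannot represent `ö` at all) while the input string was UTF-8. `strtr()` compares raw bytes, so keys stored as one byte never match the two-byte UTF-8 sequences in the input. The bytes are written as escapes so the demonstration does not depend on how the snippet itself is saved, and `isUtf8()` is a hypothetical helper, not part of the original code. The sketch does not try to reproduce the exact "aaa" output from the question.

```php
<?php
// "åäö" as UTF-8 input: two bytes per character.
$utf8Input = "\xC3\xA5\xC3\xA4\xC3\xB6";

// Keys as they would be stored if the source file were saved as ISO-8859-1
// (an assumption about the original setup): one byte per character.
$latin1Table = ["\xE5" => 'a', "\xE4" => 'a', "\xF6" => 'o'];

// The same keys as stored in a UTF-8 source file.
$utf8Table = ["\xC3\xA5" => 'a', "\xC3\xA4" => 'a', "\xC3\xB6" => 'o'];

// Hypothetical helper: the /u modifier makes PCRE reject invalid UTF-8.
function isUtf8(string $s): bool
{
    return preg_match('//u', $s) === 1;
}

$mismatched = strtr($utf8Input, $latin1Table); // no key matches, input comes back unchanged
$matched    = strtr($utf8Input, $utf8Table);   // every character is replaced

var_dump($mismatched === $utf8Input); // bool(true)
echo $matched, "\n";                  // prints "aao"
var_dump(isUtf8("\xF6"));             // bool(false): a lone Latin-1 byte is not valid UTF-8
var_dump(isUtf8("\xC3\xB6"));         // bool(true)
```

Running `bin2hex()` on a literal from the real table gives the same kind of check: `c3b6` for `ö` means the file is UTF-8, while `f6` points to a single-byte encoding.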

Thank you guys!

Ivar
  • Sorry about the delay. It was saved as ASCII - I can't get UltraEdit to save as UTF-8 automatically, so I have to convert every file I create manually. Pretty annoying, actually. – Ivar Sep 27 '10 at 15:32
  • Really? That would be enough to get me to switch. – dcclassics Mar 24 '14 at 17:41