
I have values like

Stra\u00c3\u009fe

and

Aur\u00e9lien

I need them decoded to Straße and Aurélien, respectively. How can I achieve this using PHP functions? The data originates from php_ldap and an Active Directory source, if that helps.

ChrJantz
  • Where did you get the substrings "\u00c3" and "\u009f"? – Demodave May 18 '15 at 14:02
  • It's in the value of the street property. The other one is part of the CN and givenname property in the AD – ChrJantz May 18 '15 at 14:06
  • OK, those are Unicode values, so try splitting the string on the escapes: foreach(explode('\u', $string) as $new_string) – Demodave May 18 '15 at 14:17
  • And what should I do with the result? I need to transform the entire string, not just get the Unicode values. – ChrJantz May 18 '15 at 14:22
  • I'm sorry, but I don't get what you are trying to say. I need to do the transcoding but don't see how. Could you provide a snippet or function that does this? – ChrJantz May 18 '15 at 14:38
  • Where are these values coming from to begin with? From a JSON document, by any chance? – deceze May 18 '15 at 14:42
  • $string = "Aur\u00e9lien"; foreach(explode('\u', $string) as $new_string) { echo html_entity_decode('' . trim($new_string)); } It isn't complete, but it sort of works for the "Aur" value. – Demodave May 18 '15 at 14:44

1 Answer


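You need to convert the \uXXXX escape sequences to UTF-8 using a multi-byte conversion: treat the four hex digits of each escape as a UCS-2 (big-endian) code unit and convert that to UTF-8. A simple example would be: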

#source: http://stackoverflow.com/questions/2934563/how-to-decode-unicode-escape-sequences-like-u00ed-to-proper-utf-8-encoded-char

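// Convert one matched \uXXXX escape (the 4 hex digits are in $match[1]) to UTF-8.
function replace_unicode_escape_sequence($match) {
    return mb_convert_encoding(pack('H*', $match[1]), 'UTF-8', 'UCS-2BE');
}

// Replace every \uXXXX escape in $str with its UTF-8 character.
function unicode_decode($str) {
    return preg_replace_callback('/\\\\u([0-9a-f]{4})/i', 'replace_unicode_escape_sequence', $str);
}

$str = unicode_decode('\u00e9'); // "é"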
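
Note that the two sample values in the question do not look like the same kind of data. "Aur\u00e9lien" decodes directly to Aurélien, but "Stra\u00c3\u009fe" decodes to the two characters U+00C3 and U+009F, which happen to be the UTF-8 bytes of ß (C3 9F) read as ISO-8859-1. Here is a minimal sketch that handles both, assuming that is what happened to the street value. The helper names decode_escapes() and fix_double_encoding() are made up for this example, and the mbstring extension is assumed to be loaded.

```php
<?php
// Sketch only: decode_escapes() and fix_double_encoding() are hypothetical helpers.

// Turn each literal \uXXXX escape into the UTF-8 encoding of that code point.
function decode_escapes(string $str): string {
    $decoded = preg_replace_callback(
        '/\\\\u([0-9a-f]{4})/i',
        function (array $m): string {
            return mb_convert_encoding(pack('H*', $m[1]), 'UTF-8', 'UCS-2BE');
        },
        $str
    );
    // preg_replace_callback() returns null on a regex error; fall back to the input.
    return $decoded ?? $str;
}

// Assumption: the value was UTF-8 whose bytes were misread as ISO-8859-1
// before being escaped. Mapping each character back to a single byte
// restores the original UTF-8 byte sequence.
function fix_double_encoding(string $str): string {
    $bytes = mb_convert_encoding($str, 'ISO-8859-1', 'UTF-8');
    // Accept the result only if it is valid UTF-8; otherwise keep the input.
    return mb_check_encoding($bytes, 'UTF-8') ? $bytes : $str;
}

$name   = fix_double_encoding(decode_escapes('Aur\u00e9lien'));
$street = fix_double_encoding(decode_escapes('Stra\u00c3\u009fe'));

echo $name, "\n";   // Aurélien
echo $street, "\n"; // Straße
```

The validity check is what keeps correctly encoded values such as Aurélien untouched: converting it to ISO-8859-1 produces a lone E9 byte, which is not valid UTF-8, so the original string is returned. It is a heuristic, though, and fixing the encoding at the point where the LDAP data is read or escaped would be more robust.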
Paul