9

I have this Unicode sequence: \u304a\u306f\u3088\u3046\u3054\u3056\u3044\u307e\u3059. How do I convert it into text?

$unicode = '\u304a\u306f\u3088\u3046\u3054\u3056\u3044\u307e\u3059';

I tried:

echo utf8_decode($unicode);

and I tried:

echo mb_convert_encoding($unicode , 'US-ASCII', 'UTF-8');

and I tried:

echo htmlentities($unicode , ENT_COMPAT, "UTF-8");

but none of these functions convert the sequence into the corresponding Japanese text.

honk
learntosucceed

4 Answers

12

The issue here is that the string is not Unicode text. It is a series of escape sequences that write Unicode code points using only ASCII characters (so it is 7-bit safe).

There is a simple trick: use PHP's JSON decoder for this:

<?php
$sequence = '\u304a\u306f\u3088\u3046\u3054\u3056\u3044\u307e\u3059';
print_r(json_decode('["'.$sequence.'"]'));

The output is:

Array
(
    [0] => おはようございます
)

This means you can define a simple convenience function:

<?php
$sequence = '\u304a\u306f\u3088\u3046\u3054\u3056\u3044\u307e\u3059';

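function decode($payload) {
  // Index the decoded array directly; array_pop() expects a variable passed by reference.
  $decoded = json_decode('["'.$payload.'"]');
  return $decoded[0];
}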

echo decode($sequence);

You will want to add error handling and escaping of JSON-specific control characters inside the payload. This simple example is just meant to point you in the right direction...
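As a rough sketch of what that error handling and escaping might look like: the helper name decodeEscapes() is hypothetical, and the code assumes PHP 7.3 or later for JSON_THROW_ON_ERROR. Raw double quotes and control characters are rewritten as \uXXXX escapes before decoding, while existing backslash escapes are left untouched.

```php
<?php
// Hypothetical helper, not part of the original answer.
// Assumes PHP 7.3+ (JSON_THROW_ON_ERROR).
function decodeEscapes(string $payload): string
{
    // Match either an existing backslash escape (kept as-is) or a character
    // that may not appear raw inside a JSON string (double quote, control chars).
    $escaped = preg_replace_callback(
        '/\\\\.|["\x00-\x1f]/s',
        function (array $m): string {
            if ($m[0][0] === '\\') {
                return $m[0]; // already an escape sequence such as \u304a
            }
            return sprintf('\\u%04x', ord($m[0])); // e.g. " becomes \u0022
        },
        $payload
    );

    // Throws JsonException on malformed input instead of silently returning null.
    return json_decode('"' . $escaped . '"', false, 512, JSON_THROW_ON_ERROR);
}

try {
    echo decodeEscapes('\u304a\u306f\u3088\u3046\u3054\u3056\u3044\u307e\u3059'), "\n";
} catch (JsonException $e) {
    echo 'Could not decode: ', $e->getMessage(), "\n";
}
```

A truncated escape such as '\u30' makes json_decode() throw here, so the caller can decide how to report it.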

Have fun!

arkascha
6

The Transliterator class from the intl extension can handle the conversion with its predefined Hex-Any identifier:

$in = '\u304a\u306f\u3088\u3046\u3054\u3056\u3044\u307e\u3059';
$out = transliterator_create('Hex-Any')->transliterate($in);
var_dump($out); # string(27) "おはようございます"
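
Since this depends on the intl extension being installed, a guarded variant may be useful. The following is only a sketch: the function name hexToText() is hypothetical, and returning null when the extension is missing or the conversion fails is an arbitrary choice.

```php
<?php
// Hypothetical wrapper, not part of the original answer.
// Returns null when intl is unavailable or the conversion fails.
function hexToText(string $in): ?string
{
    if (!class_exists('Transliterator')) {
        return null; // intl extension is not loaded
    }

    $transliterator = Transliterator::create('Hex-Any');
    if ($transliterator === null) {
        return null; // the transliterator could not be created
    }

    $out = $transliterator->transliterate($in);
    return $out === false ? null : $out;
}

var_dump(hexToText('\u304a\u306f\u3088\u3046\u3054\u3056\u3044\u307e\u3059'));
```
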
julp
  • Thanks, you are a lifesaver. I was converting fonts in Laravel, but Laravel gave me only a Unicode font. Your method helped me a lot, but it requires installing a PHP extension. – Pyae Sone Jul 09 '19 at 09:53
3
$unicode = '\u304a\u306f\u3088\u3046\u3054\u3056\u3044\u307e\u3059';
$json = sprintf('"%s"',$unicode); # build json string

$utf8_str = json_decode ( $json, true ); # json decode
echo $utf8_str; # おはようございます

See Json string


PHPJungle
2

PHP 7+

As of PHP 7, you can use the Unicode codepoint escape syntax in double-quoted strings and heredocs to do this.

echo "\u{304a}\u{306f}\u{3088}\u{3046}\u{3054}\u{3056}\u{3044}\u{307e}\u{3059}"; outputs おはようございます.
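
Note that the \u{...} syntax is resolved when PHP parses a string literal, so it does not apply to a \uXXXX sequence that arrives in a variable at run time. For that case, a regex-based conversion is one option. This is a minimal sketch, not the answer's method: the function name decodeUnicodeEscapes() is hypothetical, it assumes the mbstring extension and PHP 7.2+ (for mb_chr()), and it only handles code points that fit in four hex digits. Surrogate halves are left unchanged.

```php
<?php
// Hypothetical helper; assumes mbstring and PHP 7.2+ for mb_chr().
function decodeUnicodeEscapes(string $text): string
{
    $result = preg_replace_callback(
        '/\\\\u([0-9a-fA-F]{4})/',
        function (array $m): string {
            $char = mb_chr(hexdec($m[1]), 'UTF-8');
            // mb_chr() returns false for invalid code points; keep the escape then.
            return $char === false ? $m[0] : $char;
        },
        $text
    );

    // preg_replace_callback() returns null on a regex error; fall back to the input.
    return $result ?? $text;
}

$unicode = '\u304a\u306f\u3088\u3046\u3054\u3056\u3044\u307e\u3059';
echo decodeUnicodeEscapes($unicode), "\n";
```
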

Rabin Lama Dong