14

I use PHP's iconv function, but some characters aren't converted correctly:

...
$s = iconv('UTF-16', 'UTF-8', $s);
...
$s = iconv('UTF-16//IGNORE', 'UTF-8', $s);
...
$s = iconv('UTF-16LE', 'UTF-8', $s);
...
$s = iconv('UTF-16LE//IGNORE', 'UTF-8', $s);
...

I also tried the mb_convert_encoding function, but it didn't solve my problem.

A sample text file: 9px.ir/utf8-16LE.rar

Nagama Inamdar
علیرضا

2 Answers

26

iconv supports the UTF-16LE encoding.

You can use it to convert a string from UTF-16LE to UTF-8:

$result = iconv($in_charset = 'UTF-16LE' , $out_charset = 'UTF-8' , $str);
if (false === $result)
{
    throw new Exception('Input string could not be converted.');
}

See the iconv documentation.

I wondered whether all code points available in UTF-16LE are also available in UTF-8. They are: both encodings cover the whole Unicode range, so well-formed input should convert without loss, and this should fit your case.
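On the code-point question, a quick check can help. The following is only a minimal sketch, assuming the mbstring extension is installed; the sample character (U+1F600) and the variable names are my own choice, not taken from the question's file. It takes a character outside the Basic Multilingual Plane, which UTF-16LE stores as a surrogate pair, and shows that it still maps to a valid UTF-8 sequence.

```php
<?php
// U+1F600 lies outside the Basic Multilingual Plane, so UTF-16 stores it as
// the surrogate pair D83D DE00. In little-endian byte order that is 3D D8 00 DE.
$utf16le = "\x3D\xD8\x00\xDE";

// Convert with mbstring (assumed to be available on this box).
$utf8 = mb_convert_encoding($utf16le, 'UTF-8', 'UTF-16LE');

// UTF-8 encodes the same code point in four bytes.
echo bin2hex($utf8), "\n"; // prints "f09f9880"
```

If the hex dump matches, the conversion itself is not where characters get lost, and the input bytes are worth a closer look.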


Edit: I was not able to reproduce the problem on a box of my own, but on another box I ran into this notice:

Notice: iconv() [function.iconv]: Wrong charset, conversion from `UTF-16LE' to `UTF-8' is not allowed in ...

It looks like not all iconv versions can actually convert UTF-16LE to UTF-8.

A possible workaround is to use mb_convert_encoding instead; at least it worked in this case:

$result = mb_convert_encoding($str , 'UTF-8' , 'UTF-16LE');
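The two approaches can be combined so that the code keeps working on boxes where iconv refuses the conversion. This is only a sketch under assumptions: the helper name `utf16le_to_utf8` is hypothetical, and it assumes that a `false` return value from iconv is a good enough signal to fall back to mbstring.

```php
<?php
/**
 * Hypothetical helper (not part of PHP): convert UTF-16LE to UTF-8 with
 * iconv, falling back to mb_convert_encoding when iconv fails.
 */
function utf16le_to_utf8(string $str): string
{
    if (function_exists('iconv')) {
        // @ hides the "Wrong charset" notice; failure is detected
        // through the false return value instead.
        $result = @iconv('UTF-16LE', 'UTF-8', $str);
        if ($result !== false) {
            return $result;
        }
    }

    if (function_exists('mb_convert_encoding')) {
        return mb_convert_encoding($str, 'UTF-8', 'UTF-16LE');
    }

    throw new RuntimeException('No working UTF-16LE to UTF-8 converter is available.');
}

// "Hi" in UTF-16LE: each ASCII character is followed by a zero byte.
echo utf16le_to_utf8("H\x00i\x00"), "\n"; // prints "Hi"
```

Note that the fallback also kicks in when iconv rejects malformed input, and mbstring may then substitute a replacement character rather than fail, so it is worth checking the input first if strictness matters.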
hakre
  • @ali mzm: Can't reproduce, that file works for me with the code example above. – hakre Aug 08 '11 at 10:23
  • @ali mzm: Looks like not all iconv versions support that, maybe you're getting this error? http://codepad.viper-7.com/GQ1TMz – hakre Aug 08 '11 at 10:29
  • @ali mzm: I added a `mb_convert_encoding` example which does the same as `iconv`. – hakre Aug 08 '11 at 10:37
  • No, I didn't get the error. I get a string in which some characters are converted correctly and some are not. – علیرضا Aug 08 '11 at 10:37
  • @ali mzm: Are you using `//TRANSLIT` or `//IGNORE`? If so, please don't until you've found the cause. Whether I try `iconv` or `mb_convert_encoding` with *your* data, I cannot reproduce the problem. See this demo, which uses *your* data. Do you see your problem in its output as well? http://codepad.viper-7.com/W4ry1v – hakre Aug 08 '11 at 10:43
0

You should append //TRANSLIT or //IGNORE to the second argument of the function, which is the output charset. You appended it to the first argument (the input charset) by mistake.

For more details and examples, see https://www.php.net/manual/en/function.iconv.php
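Applied to the code in the question, the suffix placement might look like the sketch below. The sample bytes (the word "سلام" in UTF-16LE) are my own and not taken from the asker's file. Some iconv builds reject the suffix entirely, which is why the sketch checks for `false` instead of assuming success.

```php
<?php
// "سلام" (U+0633 U+0644 U+0627 U+0645) encoded as UTF-16LE.
$utf16le = "\x33\x06\x44\x06\x27\x06\x45\x06";

// The //IGNORE suffix goes on the output charset, i.e. the second argument.
// @ hides the notice raised by builds that do not accept the suffix.
$utf8 = @iconv('UTF-16LE', 'UTF-8//IGNORE', $utf16le);

if ($utf8 === false) {
    echo "This iconv build rejected the conversion.\n";
} else {
    echo $utf8, "\n";
}
```

Keep in mind the advice from the comments above: //IGNORE silently drops whatever cannot be converted, so it can hide the real cause of a broken conversion.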

MMJ