I have a problem converting a string from cp1251 to utf8...

I need to get some names from a database, and those names are in cp1251 (I'm not the one who made that database, so I can't edit it, but I know for sure that these names are cp1251)...

The name in the database is "Р?нтернет РІ цифрах". I'm converting it to utf8 using the iconv function like this:

iconv("UTF-8", "CP1251//IGNORE", $name)

and the result is "�?нтернет в цифрах" (it's Russian), but the first two symbols are not correct... it should be "Интернет в цифрах"...

So the last thing I have to do is somehow change these two symbols "�?" to the Russian letter "И"... and I really don't know how to do that... I've tried to use preg_replace, but it doesn't work... or I'm not using it correctly.

And I'm sorry for Russian letters, it is really hard to explain what I need without showing them.

Pigalev Pavel
  • To convert from cp1251 to utf8, you should use `iconv("CP1251", "UTF-8//IGNORE", $name)` (see [php manual](http://www.php.net/manual/en/function.iconv.php)). – Alain Tiemblo Nov 22 '12 at 08:26
  • The problem might also be the wrong connection collation, in which case the connection to the database itself destroys the data. Do you see the string properly in phpMyAdmin? – Alex Nov 22 '12 at 08:27
  • Ninsuo, I know that! But it only works this way... and that is VERY strange. – Pigalev Pavel Nov 22 '12 at 08:47
  • What database do you use? MySQL? – Joni Nov 24 '12 at 22:57

3 Answers

3

The first letter comes out incorrect because one of the bytes needed to store the UTF-8 encoding of И (0x98, to be exact) is not used in CP1251. If the database has replaced the 0x98 byte with a question mark, you have to change it back once iconv has turned the text back into raw bytes, because the broken pair D0 3F only exists at that point (it can never occur in valid UTF-8 input):

$name = iconv("UTF-8", "CP1251//IGNORE", $name);
echo str_replace("\xD0\x3F", "\xD0\x98", $name);
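
To make the mechanism concrete, here is a minimal sketch of the repair. It assumes that `$name` arrives as valid UTF-8 holding the mojibake text and that iconv knows the `CP1251` charset name, as in the question. The helper name `fix_mojibake` is hypothetical, and the damaged sample is built from known-good text rather than copied from the database. The restore step runs after iconv, since the byte pair D0 3F only shows up once the text is back in single bytes.

```php
<?php
// Hypothetical helper (not from the thread): undo "UTF-8 bytes read as CP1251".
// Assumes $name is valid UTF-8 holding the mojibake text.
function fix_mojibake(string $name): string
{
    // Map every mojibake character back to the single byte it came from.
    $bytes = iconv("UTF-8", "CP1251//IGNORE", $name);
    // 0x98 has no CP1251 character, so it came back as "?" (0x3F); restore it.
    return str_replace("\xD0\x3F", "\xD0\x98", $bytes);
}

// Build a damaged sample: drop the 0x98 byte, then misread the bytes as CP1251.
$good   = "Интернет в цифрах";
$broken = iconv("CP1251", "UTF-8", str_replace("\x98", "?", $good));

echo fix_mojibake($broken), "\n";
```

Note that this only restores И (D0 98). Any other character whose UTF-8 form ends in 0x98 would need its own replacement pair.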
Joni
2

Use this:

mb_convert_encoding($model->text, 'cp1251', 'utf8')
0

Try this:

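function cp1251_to_utf8($s){
           // UTF-8 lead bytes 0xD1 and 0xD0, plus the trail bytes for Ё (0x81) and ё (0x91).
           $c209 = chr(209); $c208 = chr(208); $c129 = chr(129); $c145 = chr(145);
           $t = ''; // output buffer; must be initialized before appending
           for($i=0; $i<strlen($s); $i++)    {
               $c=ord($s[$i]);
               if ($c>=192 and $c<=239) $t.=$c208.chr($c-48);  // А..п -> D0 90..BF
               elseif ($c>239) $t.=$c209.chr($c-112);          // р..я -> D1 80..8F
               elseif ($c==184) $t.=$c209.$c145;               // ё -> D1 91
               elseif ($c==168)    $t.=$c208.$c129;            // Ё -> D0 81
               else $t.=$s[$i];                                // all other bytes are copied unchanged
           }
           return $t;
       }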
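
For comparison, a short sketch of the same direction (genuine CP1251 bytes to UTF-8) using the built-in `iconv`, assuming the `CP1251` charset name is available as it is in the question. The second half shows what happens when the input is already UTF-8: each byte is treated as a separate CP1251 character, so the text gets double-encoded. The byte strings here are illustrative samples, not data from the asker's database.

```php
<?php
// "Интернет" as genuine CP1251 bytes: one byte per letter.
$cp1251 = "\xC8\xED\xF2\xE5\xF0\xED\xE5\xF2";
$utf8   = iconv("CP1251", "UTF-8", $cp1251);
echo $utf8, "\n";

// Feeding text that is already UTF-8 through the same conversion
// double-encodes it: the two bytes of "в" (D0 B2) become two characters.
$doubled = iconv("CP1251", "UTF-8", "\xD0\xB2");
echo bin2hex($doubled), "\n"; // d0a0d086
```

A conversion in this direction only helps when the bytes really are CP1251. Applied to a string that is already UTF-8, it makes the damage worse.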
Ing. Michal Hudak
  • it returned "Р�?Р�Р�Р�в��Р�В�Р�Р�Р�Р�Р�В�Р�в�� Р�Р� Р�в��Р�С�Р�в��Р�Р�Р�В�Р�в��"... – Pigalev Pavel Nov 22 '12 at 09:14