I'd like to convert a string's character encoding to HTML-ENTITIES and then back to UTF-8. I assumed that converting to another encoding and back would leave me with the same string, but that doesn't seem to be the case.

My testing string is:

Test: ěščřžýáíé

Conversion to HTML-ENTITIES

echo mb_convert_encoding('Test: ěščřžýáíé', 'HTML-ENTITIES', 'UTF-8');

outputs this result:

Test: &#283;&scaron;&#269;&#345;&#382;&yacute;&aacute;&iacute;&eacute;

However, when I try to convert back to UTF-8:

echo mb_convert_encoding('Test: &#283;&scaron;&#269;&#345;&#382;&yacute;&aacute;&iacute;&eacute;', 'UTF-8', 'HTML-ENTITIES');

I get unexpected (incorrect) output instead of the original string:

Test: ěščřžýáíé

How can I properly convert encodings to get my original string back?

feek
  • Works for me. You should double-check that the display encoding for your final output is UTF-8 and not something else, e.g. your browser is set to auto-detect and is using ISO-8859. http://stackoverflow.com/questions/279170/utf-8-all-the-way-through – Sammitch Jun 23 '16 at 18:34
  • @Sammitch Oh, you're right of course. I forgot to set the display encoding of my testing page properly and Firefox defaulted to something else _facepalm_. Thank you! – feek Jun 23 '16 at 18:49

1 Answer

Use header() to declare the charset in the HTTP Content-Type header:

header('Content-Type: text/html; charset=utf-8');

Make sure to call this function before any output has been sent to the client. Once output starts, the headers have already been sent and can no longer be changed. You can check for that with headers_sent(). See the manual page for header() for more information.
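As a rough sketch of the headers_sent() check mentioned above: the helper name send_utf8_header is made up for illustration and is not part of PHP or of the original answer. It only sends the charset header while that is still possible and logs where output started otherwise.

```php
<?php
// Hypothetical helper (name is an assumption, not a PHP built-in):
// sends the charset header only if no output has been sent yet.
function send_utf8_header(): bool
{
    // headers_sent() fills $file and $line with the place where output began.
    if (headers_sent($file, $line)) {
        error_log("Cannot set charset: output already started at $file:$line");
        return false;
    }

    header('Content-Type: text/html; charset=utf-8');
    return true;
}

// Call this at the very top of the script, before any echo or HTML.
send_utf8_header();
```

If this returns false, look for stray whitespace or a BOM before the opening <?php tag, since those count as output too.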

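<?php
// Declare UTF-8 before any output so the browser decodes the page correctly.
header('Content-Type: text/html; charset=utf-8');

// UTF-8 -> HTML entities
$konv1 = mb_convert_encoding('Test: ěščřžýáíé', 'HTML-ENTITIES', 'UTF-8');
echo $konv1 . '<hr>';

// HTML entities -> UTF-8; with the header set, this displays the original string
$konv2 = mb_convert_encoding($konv1, 'UTF-8', 'HTML-ENTITIES');
echo $konv2 . '<hr>';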
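To confirm that the conversion itself is lossless and only the display was wrong, you can compare the strings byte for byte instead of trusting what the browser renders. The sketch below swaps in htmlentities() and html_entity_decode() from PHP core, because the 'HTML-ENTITIES' pseudo-encoding of mbstring is deprecated as of PHP 8.2. The function name round_trip_is_lossless is an assumption made for this example.

```php
<?php
// Hypothetical helper (not from the original answer): encodes a UTF-8 string
// to HTML entities, decodes it again, and checks the result byte for byte.
function round_trip_is_lossless(string $original): bool
{
    $flags = ENT_QUOTES | ENT_HTML5;

    // UTF-8 -> HTML entities
    $entities = htmlentities($original, $flags, 'UTF-8');

    // HTML entities -> UTF-8
    $decoded = html_entity_decode($entities, $flags, 'UTF-8');

    // Strict comparison works on the raw bytes, independent of any display charset.
    return $decoded === $original;
}

var_dump(round_trip_is_lossless('Test: ěščřžýáíé'));
```

If the comparison succeeds but the page still shows garbled characters, the problem is the charset the browser uses to display the page, not the conversion.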