Using mb_convert_encoding() to convert a string from HTML-ENTITIES to UTF-8 and back

Question

I'd like to convert the character encoding of a string to HTML-ENTITIES and then back to UTF-8. I assumed that converting to another encoding and back would give me the same string, but that doesn't seem to be the case.

My testing string is:

Test: ěščřžýáíé

Conversion to HTML-ENTITIES

echo mb_convert_encoding('Test: ěščřžýáíé', 'HTML-ENTITIES', 'UTF-8');

outputs this result:

Test: &#283;&scaron;&#269;&#345;&#382;&yacute;&aacute;&iacute;&eacute;

However, when I try to convert back to UTF-8:

echo mb_convert_encoding('Test: &#283;&scaron;&#269;&#345;&#382;&yacute;&aacute;&iacute;&eacute;', 'UTF-8', 'HTML-ENTITIES');

I get this incorrect output instead of the original string:

Test: Ä›ĹˇÄŤĹ™ĹľĂ˝ĂˇĂĂ©

How can I properly convert encodings to get my original string back?

Works for me. You should double-check that the display encoding for your final output is UTF-8 and not something else. eg: your browser is set to auto-detect and is using ISO-8859. http://stackoverflow.com/questions/279170/utf-8-all-the-way-through — Sammitch, Jun 23 '16 at 18:34
@Sammitch Oh, you're right of course. I forgot to set the display encoding of my testing page properly and Firefox defaulted to something else _facepalm_. Thank you! — feek, Jun 23 '16 at 18:49

score 0 · Answer 1 · answered Oct 31 '22 at 18:14

Use header() to set the charset in the HTTP Content-Type header:

header('Content-Type: text/html; charset=utf-8');

Call header() before any output has been sent to the client. Once output starts, the headers have already gone out and can no longer be changed. You can check for this with headers_sent(). See the manual page for header() for more information.
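The headers_sent() check mentioned above might look like the following sketch. The helper name send_utf8_header is hypothetical and not part of the original answer; it only illustrates guarding the header() call.

```php
<?php
// Hypothetical helper (an assumption for illustration, not from the answer):
// sets the charset header only if no output has been sent yet.
function send_utf8_header(): bool
{
    // headers_sent() fills $file and $line with the place where output started.
    if (headers_sent($file, $line)) {
        error_log("Cannot set charset: output already started in $file on line $line");
        return false;
    }
    header('Content-Type: text/html; charset=utf-8');
    return true;
}

$sent = send_utf8_header();
```

If the helper returns false, the logged file and line point at the statement that produced the early output.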

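<?php
// Declare the response charset before any output is sent.
header('Content-Type: text/html; charset=utf-8');
// UTF-8 -> HTML entities
$konv1 = mb_convert_encoding('Test: ěščřžýáíé', 'HTML-ENTITIES', 'UTF-8');
echo $konv1.'<hr>';
// HTML entities -> UTF-8; with the charset header set, the original string is displayed again
$konv2 = mb_convert_encoding($konv1, 'UTF-8', 'HTML-ENTITIES');
echo $konv2.'<hr>';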
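To confirm that the round trip itself is lossless, independent of how a browser renders the page, the strings can be compared in code. The sketch below is an assumption-laden variant, not the answer's method: it swaps in htmlentities() and html_entity_decode() because the 'HTML-ENTITIES' mbstring encoding is deprecated as of PHP 8.2. Unlike mb_convert_encoding(), htmlentities() also escapes characters such as <, >, & and quotes.

```php
<?php
$original = 'Test: ěščřžýáíé';

// Encode characters that have a named HTML 4.01 entity (e.g. š becomes &scaron;).
// Characters without one (e.g. ě) are left as raw UTF-8.
$encoded = htmlentities($original, ENT_QUOTES | ENT_HTML401, 'UTF-8');

// Decode named and numeric entities back to UTF-8.
$decoded = html_entity_decode($encoded, ENT_QUOTES | ENT_HTML401, 'UTF-8');

// Compare bytes, not what the browser displays.
$roundTripOk = ($decoded === $original);
var_dump($roundTripOk); // bool(true)
```

If the comparison is true but the page still shows garbled characters, the problem is the display encoding, as the comments on the question suggest.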
