How to convert ASCII encoding to UTF8 in PHP
5 Answers
ASCII is a subset of UTF-8, so if a document is ASCII then it is already UTF-8.

Word of caution: if the ASCII is "extended" ASCII, then you may encounter issues. https://en.wikipedia.org/wiki/Extended_ASCII – Azeroth2b Mar 29 '17 at 14:26
If you know for sure that your current encoding is pure ASCII, then you don't have to do anything, because ASCII is already valid UTF-8.
But if you still want to convert, just to be sure that it's UTF-8, then you can use iconv:
$string = iconv('ASCII', 'UTF-8//IGNORE', $string);
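The //IGNORE suffix asks iconv to discard any characters it cannot convert, in case some bytes were not valid ASCII. iconv returns false when the conversion fails, so check the result before using it.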
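A minimal sketch of the iconv call above with its return value checked. The helper name `asciiToUtf8OrNull` is made up for illustration, and how `//IGNORE` treats invalid input depends on the iconv implementation PHP was built against, so treat this as an assumption-laden example rather than a guaranteed recipe.

```php
<?php
// Hypothetical helper (not from the original answer): wraps the iconv call
// and turns a failed conversion into null instead of false.
function asciiToUtf8OrNull(string $s): ?string
{
    // '@' silences the notice/warning iconv raises on failure;
    // the failure is still visible through the return value.
    $out = @iconv('ASCII', 'UTF-8//IGNORE', $s);
    return $out === false ? null : $out;
}

$result = asciiToUtf8OrNull("plain text");
if ($result === null) {
    echo "Conversion failed; the input was not handled by iconv.\n";
} else {
    echo $result, "\n";
}
```

Returning null keeps a failed conversion from being silently treated as an empty string further down the line.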
Use mb_convert_encoding to convert an ASCII string to UTF-8. See the PHP manual entry for mb_convert_encoding for more information.
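$string = "chárêctërs";
// Report the encoding mbstring guesses before the conversion.
print(mb_detect_encoding($string) . "\n");
// No source encoding is given, so mbstring assumes its internal encoding.
$string = mb_convert_encoding($string, "UTF-8");
// Report the guessed encoding again after the conversion.
print(mb_detect_encoding($string) . "\n");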

This answer is basically wrong. [mb_detect_encoding()](https://php.net/mb_detect_encoding) is both poorly named and poorly documented. All it does is loop through a system-dependent and typically short list of encodings (on my system, only `ASCII` and `UTF-8`) and return the first one in which all your bytes have something assigned. Detecting text encoding programmatically in a reliable way is as hard as detecting whether a picture has a cat. – Álvaro González May 02 '20 at 12:08
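Since guessing an encoding is unreliable, one alternative is to validate against an encoding you already expect. A small sketch using `mb_check_encoding` (requires the mbstring extension); the sample byte strings are made up for illustration.

```php
<?php
// Sample byte strings chosen for illustration.
$latin1 = "caf\xE9";       // "café" encoded as ISO-8859-1
$utf8   = "caf\xC3\xA9";   // "café" encoded as UTF-8

// Validation answers "is this byte string valid in encoding X?",
// which is a much easier question than "which encoding is this?".
var_dump(mb_check_encoding($utf8, 'UTF-8'));    // bool(true)
var_dump(mb_check_encoding($latin1, 'UTF-8'));  // bool(false)
var_dump(mb_check_encoding("cafe", 'ASCII'));   // bool(true)
```

A string that fails the UTF-8 check still tells you nothing about what it actually is; that information has to come from wherever the data originated.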
"ASCII is a subset of UTF-8, so..." - so UTF-8 is a set? :)
In other words: any string built from code points x00 to x7F has identical representations (byte sequences) in ASCII and UTF-8. Converting such a string is pointless.
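To see this for yourself, here is a quick sketch that converts a pure-ASCII string and compares the bytes. It assumes the mbstring extension is available; the sample string is arbitrary.

```php
<?php
$ascii = "Hello";

// Converting from ASCII to UTF-8 leaves every byte untouched.
$utf8 = mb_convert_encoding($ascii, 'UTF-8', 'ASCII');

var_dump($utf8 === $ascii);   // bool(true)
echo bin2hex($ascii), "\n";   // 48656c6c6f
echo bin2hex($utf8), "\n";    // 48656c6c6f
```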

Key phrase here is the "code points from x00 to x7F". If your "ASCII" has code points from x80 to xFF, then you need to do more work. – Azeroth2b Mar 29 '17 at 14:29
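The "more work" mostly means knowing which 8-bit encoding the bytes are really in, because the same byte maps to different characters depending on the code page. A rough sketch, assuming the mbstring extension and picking Windows-1252 and ISO-8859-1 purely as examples:

```php
<?php
// Byte 0x80 is the euro sign in Windows-1252 but a control character
// (U+0080) in ISO-8859-1, so the UTF-8 result depends on the source encoding.
$byte = "\x80";

$fromCp1252 = mb_convert_encoding($byte, 'UTF-8', 'Windows-1252');
$fromLatin1 = mb_convert_encoding($byte, 'UTF-8', 'ISO-8859-1');

echo bin2hex($fromCp1252), "\n";  // e282ac
echo bin2hex($fromLatin1), "\n";  // c280
```

Picking the wrong source encoding does not raise an error; it just produces the wrong characters.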
Use utf8_encode()
Man page can be found here http://php.net/manual/en/function.utf8-encode.php
Also read this article from Joel on Software. It provides an excellent explanation of what Unicode is and how it works. http://www.joelonsoftware.com/articles/Unicode.html

utf8_encode was designed to encode Latin-1 into UTF-8. Only Latin-1 (which is ISO-8859-1). – Dmitri Feb 13 '11 at 14:50
This answer is wrong. As per docs, "Encodes an **ISO-8859-1** string to UTF-8". ISO-8859-1 is not ASCII just like Spanish is not Latin. – Álvaro González May 02 '20 at 12:02
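If the input really is ISO-8859-1, the same conversion can be written with an explicit source encoding, which also avoids utf8_encode() now that it is deprecated as of PHP 8.2. A minimal sketch, assuming the mbstring extension; the sample string is made up.

```php
<?php
// "Español" encoded as ISO-8859-1: the ñ is the single byte 0xF1.
$latin1 = "Espa\xF1ol";

// Naming the source encoding makes the ISO-8859-1 assumption visible.
$utf8 = mb_convert_encoding($latin1, 'UTF-8', 'ISO-8859-1');

var_dump($utf8 === "Espa\u{F1}ol");  // bool(true)
echo bin2hex($utf8), "\n";           // 45737061c3b16f6c
```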