The GitHub link is really about Mojibake, not "double encoding".
You are using Czech characters, correct? I see two failures in that snippet of output: "black diamonds with question marks" and "ordinary question marks". They are handled separately in "Trouble with UTF-8 characters; what I see is not what I stored".
But, before trying to solve the problem, figure out what encoding is being used in the client.
Black diamonds (`�esk�`):
Case 1 (the client is using `latin2`, not `utf8`):
- The bytes to be stored are not encoded as utf8. If you can change this, do so. The link assumes that utf8, not latin2, is the goal. It will take more research to figure out what to do if the client really needs to be latin2.
- The connection (or `SET NAMES`) for the `INSERT` and the `SELECT` was not set to the client's encoding (`latin2`, `utf8`, or `utf8mb4`).
- Also, check that the column in the database is `CHARACTER SET utf8mb4`. (Yes, you could store as latin2, but since you need to fix stuff, let's go with the preferred encoding.)
Case 2 (original bytes were UTF-8):
- The connection (or `SET NAMES`) for the `SELECT` was not utf8/utf8mb4. Fix this.
- Check that the column in the database is `CHARACTER SET utf8` (or `utf8mb4`).
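Both cases end the same way: bytes that are not valid UTF-8 reach something that renders them as UTF-8. Here is a minimal Python sketch of that last step. It assumes the original word was `česká` (a guess, not taken from the question) and uses Python's `iso8859_2` codec as a stand-in for MySQL's latin2:

```python
# Sketch: latin2 bytes shown by a UTF-8 renderer become black diamonds (U+FFFD).
# "česká" is an assumed example word; iso8859_2 stands in for MySQL's latin2.
word = "česká"

latin2_bytes = word.encode("iso8859_2")                  # what a latin2 client/connection delivers
shown = latin2_bytes.decode("utf-8", errors="replace")   # what a UTF-8 page displays

print(latin2_bytes.hex(" ").upper())  # E8 65 73 6B E1
print(shown)                          # �esk�
```

The accented letters are single high bytes in latin2, which are not valid UTF-8 on their own, so each one is replaced by a black diamond while the ASCII letters survive.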
Question Marks (regular ones, not black diamonds) (`m?sto`):
- Check the client encoding (as above)
- The column in the database is not `CHARACTER SET utf8` (or `utf8mb4`). Fix this. (Use `SHOW CREATE TABLE` to check.)
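The question-mark failure can be sketched in Python as well. This is only an illustration of the mechanism: the word `město` is an assumption, and Python's `latin-1` codec stands in for a column whose character set has no slot for `ě`:

```python
# Sketch: a character with no slot in the target charset is replaced by "?".
# "město" is an assumed example word; "latin-1" stands in for a non-utf8 column.
word = "město"

stored = word.encode("latin-1", errors="replace")  # "ě" has no latin1 code point
print(stored)                    # b'm?sto'
print(stored.decode("latin-1"))  # m?sto
```

Note that what ends up stored is a literal `?` (hex `3F`), so the original character cannot be recovered from the stored bytes; the data has to be re-inserted after the column is fixed.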
Black diamonds tend to be a browser-only problem, caused by a missing `<meta charset=UTF-8>`. Most browsers today default to UTF-8, but they can get confused.
See the link for using `SELECT col, HEX(col) ...` to debug what has been stored.
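To know what to look for in the `HEX()` output, it may help to compute the byte patterns you would expect. This is a small Python sketch, again using the assumed word `město` and Python codecs as stand-ins for the MySQL character sets:

```python
# Sketch: the hex you might see from HEX(col) for "město" in three situations.
# The word and the helper name hex_of are illustrative assumptions.
word = "město"

def hex_of(data: bytes) -> str:
    """Format bytes the way MySQL's HEX() prints them (uppercase, no spaces)."""
    return data.hex().upper()

correct_utf8 = hex_of(word.encode("utf-8"))                       # properly stored utf8
raw_latin2 = hex_of(word.encode("iso8859_2"))                     # latin2 bytes stored as-is
question_mark = hex_of(word.encode("latin-1", errors="replace"))  # character already lost

print(correct_utf8)   # 6DC49B73746F
print(raw_latin2)     # 6DEC73746F
print(question_mark)  # 6D3F73746F
```

A two-byte `C49B` where `ě` belongs means the data is stored as valid UTF-8, a single `EC` means latin2 bytes were stored, and `3F` means the character was replaced by a question mark on the way in.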
CONVERT(CONVERT(CONVERT(BINARY('éáčďéěíňóřšťúůýž') USING utf8) USING latin1) USING utf8)
--> '��??�?�?�?�?�?��
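The pattern of diamonds and question marks in that output can be reproduced outside MySQL. The following is a rough Python simulation, not MySQL's actual conversion path: it squeezes the text through a single-byte character set and then displays the bytes as UTF-8. The helper name `squeeze` is hypothetical, and `cp1252` is used because MySQL's latin1 is based on cp1252:

```python
# Rough simulation (not MySQL's actual code path): squeeze the text through a
# single-byte charset, then display the resulting bytes as UTF-8.
text = "éáčďéěíňóřšťúůýž"

def squeeze(s: str, charset: str) -> str:
    data = s.encode(charset, errors="replace")     # unmappable chars become "?"
    return data.decode("utf-8", errors="replace")  # invalid bytes become U+FFFD

via_latin1 = squeeze(text, "cp1252")     # stand-in for MySQL's latin1
via_latin2 = squeeze(text, "iso8859_2")  # stand-in for MySQL's latin2

print(via_latin1)  # ��??�?�?�?�?�?��
print(via_latin2)  # black diamonds only, no "?"
```

Characters that exist in latin1 (such as `é`, `š`, `ž`) turn into diamonds, while those that do not (such as `č`, `ě`, `ř`) turn into `?`. With latin2 every one of these letters is representable, so no question marks would appear.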
So, I would guess that you are actually using `latin1`, not `latin2`. Run:
mysql> SHOW VARIABLES LIKE 'char%';
+--------------------------+----------------------------+
| Variable_name            | Value                      |
+--------------------------+----------------------------+
| character_set_client     | latin2                     | <--
| character_set_connection | latin2                     | <--
| character_set_database   | utf8mb4                    |
| character_set_filesystem | binary                     |
| character_set_results    | latin2                     | <--
| character_set_server     | utf8mb4                    |
| character_set_system     | utf8                       |
| character_sets_dir       | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
Those three (marked `<--`) need to be set according to the encoding used in the client.