The short answer to the title question is that it is OK to put the 256 characters that are common to both latin1 and utf8 into a column with either CHARACTER SET. However, you must be clear about which encoding you are using. Otherwise ® might display as Â® ("Mojibake").
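To make the Mojibake concrete, here is a minimal Python sketch of the mislabeling, done outside MySQL. It assumes Python's latin-1 codec as a stand-in for MySQL's latin1 (the two agree for this character); the variable names are only for illustration.

```python
# U+00AE is the registered sign.
registered = "\u00ae"

# A utf8 client holds this character as two bytes: C2 AE.
utf8_bytes = registered.encode("utf-8")

# If those two bytes are mislabeled as latin1, each byte becomes its own character.
mojibake = utf8_bytes.decode("latin-1")

# Show the code points rather than the glyphs, to avoid terminal encoding issues.
print([hex(ord(ch)) for ch in mojibake])
```

One character went in and two came out: U+00C2 (Â) followed by U+00AE (®).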
No, that SELECT fetches the default for any new tables in that database. It does not control how the columns are stored.
The database has a default for new tables.
The table has a default for new columns.
The column has the true definition of the CHARACTER SET.
So, do SHOW CREATE TABLE and look at the columns. If a column doesn't specify a charset, then look at the default for the table, which is at the end of the output. (There is also a way to get this info from information_schema.COLUMNS, but that is clunkier.)
® is hex AE in latin1 or C2AE in utf8 (or utf8mb4). That character does not exist in the "ascii" character set, which stops at 7 bits.
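Those hex values can be checked without a database. This is a small sketch in Python, assuming its latin-1, utf-8 and ascii codecs behave like the MySQL character sets of the same name for this one character.

```python
registered = "\u00ae"  # the registered sign

# One byte in latin1, two bytes in utf8.
latin1_hex = registered.encode("latin-1").hex().upper()
utf8_hex = registered.encode("utf-8").hex().upper()

# ascii stops at 7 bits (0x00..0x7F), so 0xAE cannot be represented.
try:
    registered.encode("ascii")
    in_ascii = True
except UnicodeEncodeError:
    in_ascii = False

print(latin1_hex, utf8_hex, in_ascii)
```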
However, since ® exists in both latin1 and utf8, you can safely go back and forth between the two encodings. That is, IF you tell MySQL the correct stuff.
The encoding in the client is specified in SET NAMES or the connection parameters. If the client has AE, then you must specify latin1; if the client has C2AE, you must specify utf8.
Meanwhile, the column (not the table, nor the database) can be either latin1 or utf8. The conversion, if needed, will be done as you INSERT and SELECT.
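The conversion on INSERT and SELECT can be modeled roughly as "decode using the declared client encoding, re-encode using the column's encoding". The sketch below is only a model of that idea, not MySQL's actual code; `transcode` is a hypothetical helper, and Python's latin-1 codec is assumed to match MySQL's latin1 for this character.

```python
def transcode(data: bytes, declared: str, target: str) -> bytes:
    """Hypothetical model: reinterpret bytes from one encoding into another."""
    return data.decode(declared).encode(target)

# Client holds AE and correctly declares latin1; the column is utf8.
stored = transcode(b"\xae", "latin-1", "utf-8")         # INSERT
fetched = transcode(stored, "utf-8", "latin-1")         # SELECT

# Client holds C2 AE but wrongly declares latin1; the column is utf8.
double_encoded = transcode(b"\xc2\xae", "latin-1", "utf-8")

print(stored.hex(), fetched.hex(), double_encoded.hex())
```

With the correct declaration the round trip returns the original byte. With the wrong one, four bytes get stored instead of two, which is the usual "double encoding" symptom.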
Caution: latin1 has only 256 different encodings, no Chinese, no Emoji, virtually nothing except Western European characters.
Going forward, it is best to define most columns utf8mb4. Otherwise, a pile-of-poo (💩) might be displayed as ????.
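The reason is byte length: MySQL's utf8 (utf8mb3) holds at most 3 bytes per character, while Emoji need 4. A quick sketch of that check in Python follows; `fits_in_utf8mb3` is a hypothetical helper name, and the code only measures UTF-8 length, it does not reproduce MySQL's question marks.

```python
def fits_in_utf8mb3(ch: str) -> bool:
    """Hypothetical check: does this character need at most 3 UTF-8 bytes?"""
    return len(ch.encode("utf-8")) <= 3

poo = "\U0001F4A9"      # pile-of-poo
registered = "\u00ae"   # registered sign

poo_len = len(poo.encode("utf-8"))

print(poo_len, fits_in_utf8mb3(poo), fits_in_utf8mb3(registered))
```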
If you get question marks, Mojibake, etc, consult "Trouble with UTF-8 characters; what I see is not what I stored".