Let's say I have two tables:
CREATE TABLE `table_1` (
`field` varchar(20) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
And
CREATE TABLE `table_2` (
`field` varchar(20) COLLATE utf8_unicode_520_ci DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_520_ci;
The tables are the same except for their field collation. The first table uses (the default) utf_general_ci
and the second table uses utf8_unicode_520_c1
.
And I have inserted the value abcdé
into both tables. I would assume (maybe incorrectly) that both of these characters would be stored with the binary value of: 61
62
63
C3A9
. And both tables return that result when doing:
select field, hex(field) from table_1;
# abcé 616263C3A9
select field, hex(field) from table_2;
# abcé 616263C3A9
# Both return the same result
Does this mean the value is stored as that binary value or is there more to it as to how a utf-8
string is stored? Now, when I try doing a UNION
or JOIN
, it doesn't allow me to do so due to a mix of collations as it says:
select field, hex(field) from table_1 union
select field, hex(field) from table_2;
Illegal mix of collations for operation 'UNION'
Why does this occur then, and back to my initial question: what is the meaning of collation at the lower/storage level? I thought it was just used for a 'sort' (i.e., not at the storage level, but at the query/algorithmic level).