While answering this question, I became uncertain about something that I couldn't find a satisfactory answer to.
What are the practical differences between using the binary utf8_bin and the case-insensitive utf8_general_ci collations?
I can see three:
- The two have different sorting orders; _bin's sorting order is likely to put any umlauts at the end of the alphabet, because byte values are compared (right?)
- Only case-sensitive searches in _bin
- No A = Ä equality in _bin
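To make these three points concrete, here is a rough Python sketch of how I understand them. It is not MySQL code: `bin_key` assumes that a _bin collation simply compares the UTF-8 bytes of each string, and `approx_ci_key` is a hypothetical stand-in for a case- and accent-insensitive collation (it strips accents and upper-cases; it does not use MySQL's real weight tables for utf8_general_ci).

```python
import unicodedata


def bin_key(s: str) -> bytes:
    """Assumed model of a _bin collation: compare raw UTF-8 bytes."""
    return s.encode("utf-8")


def approx_ci_key(s: str) -> str:
    """Hypothetical stand-in for a case/accent-insensitive collation.

    Decomposes each character, drops combining marks (so the umlaut on
    U+00C4 disappears), then upper-cases. This only approximates the
    idea; it is NOT MySQL's actual utf8_general_ci weight table.
    """
    decomposed = unicodedata.normalize("NFD", s)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.upper()


# "\u00c4" is the precomposed capital A with umlaut.
words = ["Zebra", "apple", "\u00c4pfel", "Apple", "zebra"]

# 1. Sorting order: byte order puts all uppercase ASCII first, then
#    lowercase ASCII, then multi-byte characters such as the umlaut.
bin_sorted = sorted(words, key=bin_key)
ci_sorted = sorted(words, key=approx_ci_key)

# 2. Case sensitivity: "a" and "A" differ as bytes, match when folded.
case_equal_bin = bin_key("a") == bin_key("A")
case_equal_ci = approx_ci_key("a") == approx_ci_key("A")

# 3. Umlaut equality: "A" and U+00C4 differ as bytes, match when folded.
umlaut_equal_bin = bin_key("A") == bin_key("\u00c4")
umlaut_equal_ci = approx_ci_key("A") == approx_ci_key("\u00c4")

if __name__ == "__main__":
    print("byte order:  ", bin_sorted)
    print("folded order:", ci_sorted)
    print("a = A:", case_equal_bin, "(bin) /", case_equal_ci, "(folded)")
    print("A = \u00c4:", umlaut_equal_bin, "(bin) /", umlaut_equal_ci, "(folded)")
```

If the byte-comparison assumption holds, the umlaut word ends up last in the byte-ordered list because its first character is encoded as a two-byte sequence starting with 0xC3, which is larger than any ASCII byte.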
Are there any other differences or side-effects to be aware of?
Reference:
- 9.1.2. Character Sets and Collations in MySQL
- 9.1.7.6. The _bin and binary Collations in the MySQL manual
- 9.1.7.7. The BINARY Operator
Similar questions that don't address the issue: