What is the best collation in MySQL to use for a VARCHAR column that must support all languages?

Thanks,

Bill Karwin
  • There is no "best"; there's only a lowest common denominator. Sorting and comparison rules vary wildly by language. The order of the alphabet is far from constant. – tadman Nov 20 '19 at 23:23
  • also see here https://stackoverflow.com/questions/30074492/what-is-the-difference-between-utf8mb4-and-utf8-charsets-in-mysql – Ossip Nov 20 '19 at 23:51

2 Answers


If I were starting a project today with MySQL 8.0, I'd choose this as a default:

character set: utf8mb4

collation: utf8mb4_0900_ai_ci

(reportedly this collation does not work for Canadian French)

See also: https://www.percona.com/live/e17/sites/default/files/slides/Collations%20in%20MySQL%208.0.pdf

Bill Karwin

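As a character set, utf8mb4 is the "safest" choice: it supports UTF-8 sequences of up to 4 bytes per character, whereas MySQL's utf8 (an alias for utf8mb3) only goes up to 3 bytes and therefore cannot store characters outside the Basic Multilingual Plane, such as most emoji.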

The collation utf8mb4_unicode_520_ci includes all Unicode characters and has some "smart" comparison matching.
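
To make the 3-byte versus 4-byte point concrete, here is a minimal Python sketch (not MySQL code) that reports how many bytes a character takes in UTF-8. The helper names `utf8_length` and `needs_utf8mb4` are made up for illustration; the assumption is simply that any character encoding to 4 bytes is one that a 3-byte `utf8` column cannot hold.

```python
def utf8_length(ch: str) -> int:
    """Return the number of bytes a single character occupies in UTF-8."""
    return len(ch.encode("utf-8"))


def needs_utf8mb4(text: str) -> bool:
    """Return True if any character in text needs a 4-byte UTF-8 sequence.

    Hypothetical helper: such characters lie outside the Basic
    Multilingual Plane and do not fit in MySQL's 3-byte utf8 (utf8mb3).
    """
    return any(utf8_length(ch) == 4 for ch in text)


if __name__ == "__main__":
    # ASCII letter, accented Latin letter, euro sign, emoji
    for ch in ["A", "\u00e9", "\u20ac", "\U0001F600"]:
        print(repr(ch), utf8_length(ch), "bytes")
    print(needs_utf8mb4("caf\u00e9 \u20ac"))         # only 1- to 3-byte characters
    print(needs_utf8mb4("hello \U0001F600"))          # contains a 4-byte character
```

Running a check like this over sample data is one rough way to see whether existing text would be affected by the 3-byte limit.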

Kobus Myburgh
Ossip
  • That addresses the question of "best character set", not "best collation". – Rick James Nov 24 '19 at 21:19
  • sorry, that's right. `utf8_general_ci` covers all unicode and always worked for me. also see here https://stackoverflow.com/questions/14329314/what-mysql-collation-is-best-for-accepting-all-unicode-characters – Ossip Nov 24 '19 at 21:22
  • .. but in fact it should be `utf8mb4_unicode_ci` or `utf8mb4_general_ci` – Ossip Nov 24 '19 at 21:30
  • correct - our comments overlapped ;) I've updated the answer – Ossip Nov 24 '19 at 21:31
  • `_unicode_` comes from Unicode version 4.0, many years old. `_general_` does not take into account multi-character comparisons, such as non-spacing accents. – Rick James Nov 24 '19 at 21:33
  • `_unicode_520_` includes improvements in Unicode 5.2.0. `_0900_` reflects Unicode 9.0. Some of the differences can be seen in http://mysql.rjweb.org/utf8mb4_collations.html – Rick James Nov 24 '19 at 21:36
  • Your answer applies through MySQL 5.7; Bill's applies to 8.0 (so far). – Rick James Nov 26 '19 at 00:54
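
As a rough illustration of the "non-spacing accents" point raised in the comments, the Python sketch below shows that the same visible letter can be stored either as one precomposed code point or as a base letter plus a combining accent. This is only an analogy, not MySQL's implementation: a comparison that works code point by code point sees two different strings, while a comparison that accounts for combining marks (approximated here with Unicode normalization) sees them as equal. The function name `same_after_normalization` is hypothetical.

```python
import unicodedata


def same_after_normalization(a: str, b: str) -> bool:
    """Compare two strings after NFC normalization (illustrative only)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


composed = "\u00e9"      # 'é' as a single precomposed code point
decomposed = "e\u0301"   # 'e' followed by COMBINING ACUTE ACCENT

if __name__ == "__main__":
    print(len(composed), len(decomposed))                   # 1 code point vs 2
    print(composed == decomposed)                           # raw code point comparison
    print(same_after_normalization(composed, decomposed))   # normalization-aware comparison
```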