Currently, whenever I create a new MySQL database, I use utf8mb4 as the character set and utf8mb4_unicode_520_ci as the collation, e.g.:

CREATE DATABASE IF NOT EXISTS db_name
    DEFAULT CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_520_ci;

Is there a newer or upgraded general-purpose collation or character set for MySQL?

For example, is there a collation that supersedes utf8mb4_unicode_520_ci, such as utf8mb4_unicode_800_ci or something similar?

Thanks for your help.

Shadow
GTS Joe

1 Answer

You can find out what collations are supported on your current instance of MySQL. Here's output from my MySQL 5.7 instance:

mysql> select * from information_schema.collations where character_set_name='utf8mb4';
+------------------------+--------------------+-----+------------+-------------+---------+
| COLLATION_NAME         | CHARACTER_SET_NAME | ID  | IS_DEFAULT | IS_COMPILED | SORTLEN |
+------------------------+--------------------+-----+------------+-------------+---------+
| utf8mb4_general_ci     | utf8mb4            |  45 | Yes        | Yes         |       1 |
| utf8mb4_bin            | utf8mb4            |  46 |            | Yes         |       1 |
| utf8mb4_unicode_ci     | utf8mb4            | 224 |            | Yes         |       8 |
| utf8mb4_unicode_520_ci | utf8mb4            | 246 |            | Yes         |       8 |
...

There are also a bunch of national collations.
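
To narrow that listing on your own server, a query along these lines may help. It is only a sketch: it assumes you have read access to information_schema, and the LIKE patterns are just one illustrative way to pick out collation names that carry a UCA version number.

```sql
-- Check which server version you are on first.
SELECT VERSION();

-- List the utf8mb4 collations whose names contain a UCA version number
-- (for example _520_ or _0900_). The filter patterns are an illustrative choice.
SELECT COLLATION_NAME, ID, IS_DEFAULT
FROM information_schema.COLLATIONS
WHERE CHARACTER_SET_NAME = 'utf8mb4'
  AND (COLLATION_NAME LIKE '%\_520\_%' OR COLLATION_NAME LIKE '%\_0900\_%')
ORDER BY ID;
```

If the second query returns no _0900_ rows, the server most likely predates MySQL 8.0.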

MySQL 8.0 adds new collations based on version 9.0.0 of the Unicode Collation Algorithm (UCA):

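+----------------------------+--------------------+-----+------------+-------------+---------+---------------+
| COLLATION_NAME             | CHARACTER_SET_NAME | ID  | IS_DEFAULT | IS_COMPILED | SORTLEN | PAD_ATTRIBUTE |
+----------------------------+--------------------+-----+------------+-------------+---------+---------------+
| utf8mb4_0900_ai_ci         | utf8mb4            | 255 | Yes        | Yes         |       0 | NO PAD        |
| utf8mb4_0900_as_ci         | utf8mb4            | 305 |            | Yes         |       0 | NO PAD        |
| utf8mb4_0900_bin           | utf8mb4            | 309 |            | Yes         |       1 | NO PAD        |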

And more national collations.

There are really good docs on the new collations here: https://dev.mysql.com/doc/refman/8.0/en/charset-unicode-sets.html
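
If you are on MySQL 8.0 or later, the statement from the question could be adapted as in the following sketch. The database name db_name is the placeholder from the question, and utf8mb4_0900_ai_ci is chosen only because the 8.0 listing above marks it as the default for utf8mb4; an accent-sensitive or binary variant may suit your data better.

```sql
-- Assumes MySQL 8.0+; on 5.7 this collation does not exist and the statement fails.
CREATE DATABASE IF NOT EXISTS db_name
    DEFAULT CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci;

-- Confirm which character set and collation the database actually got.
SELECT DEFAULT_CHARACTER_SET_NAME, DEFAULT_COLLATION_NAME
FROM information_schema.SCHEMATA
WHERE SCHEMA_NAME = 'db_name';
```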

Bill Karwin
  • Thanks Bill. This was exactly the answer I was looking for; a way to check in my current installation for the latest COLLATION_NAME available. It turns out utf8mb4_unicode_520_ci is still my best option. – GTS Joe Jun 03 '21 at 17:44