What is the reasoning behind setting `latin1_swedish_ci` as the compiled default when other options seem much more reasonable, like `latin1_general_ci` or `utf8_general_ci`?
140

Peter Mortensen

Alan
- Possible duplicate of [Why does MySQL use latin1\_swedish\_ci as the default?](http://stackoverflow.com/questions/3936059/why-does-mysql-use-latin1-swedish-ci-as-the-default) – syrkull Feb 10 '16 at 05:09
- Please note that `utf8_general_ci` does not support 4-byte UTF-8, so for true UTF-8 support you would want `utf8mb4_general_ci` or one of the other `mb4` variants. – ColinM Sep 25 '18 at 21:30
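
To make the 4-byte point concrete, here is a minimal sketch using Python's own UTF-8 codec rather than MySQL. The helper names `utf8_width` and `fits_three_byte_utf8` are hypothetical, introduced only for illustration; the sketch assumes that a 3-byte-limited `utf8` column cannot hold any character whose UTF-8 encoding needs 4 bytes.

```python
# Illustration only: this uses Python's UTF-8 codec, not MySQL itself.

def utf8_width(ch: str) -> int:
    """Return the number of bytes needed to encode one character in UTF-8."""
    return len(ch.encode("utf-8"))


def fits_three_byte_utf8(text: str) -> bool:
    """Hypothetical check: True if every character needs at most 3 bytes in UTF-8."""
    return all(utf8_width(ch) <= 3 for ch in text)


# ASCII letter, Swedish letter, euro sign, emoji
samples = ["a", "å", "€", "😀"]
widths = {ch: utf8_width(ch) for ch in samples}
print(widths)

print(fits_three_byte_utf8("Malmö"))   # only 1- and 2-byte characters
print(fits_three_byte_utf8("hi 😀"))   # contains a 4-byte character
```

Characters such as emoji are the ones that need the fourth byte, which is why the `mb4` variants matter for them.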
2 Answers
137
The bloke who wrote it was co-head of a Swedish company.
Possibly for similar reasons, Microsoft SQL Server's default language is us_english.

Giacomo1968

gbn
- He is Finnish, but Finnish and Swedish share almost the same special characters, so they share the same case-insensitive collation. – kommradHomer Feb 26 '14 at 10:47
- Talking about 'good defaults'. Which this, of course, is not. Great to see that after what, 20 years? they changed this into a sane default, like `utf8_general_ci`. Good job, MySQL! – Michahell Sep 24 '15 at 10:17
- Yes, you are right. He named MariaDB (his wife's name is Maria) and MaxDB (his son's name is Max), but why did he leave out his daughter's name? :) LOL – Ajmal PraveeN Jan 08 '18 at 09:06
- @AjmalPraveen Monty named his database projects in chronological order after his kids: My, Max and Maria. – VexingParse Feb 08 '22 at 14:54
- @MichaelTrouw The latin1 charset can be a good default, as it is far smaller than utf8. So if your field is a username which can only be a-z and 0-9, I see no reason why all sorts of characters (to name a few: emoji, Ethiopic syllables, and many more) should be acceptable at the cost of resources. – undefined Oct 15 '22 at 11:54
- Well, yes, but don't design your DB around the requirements for just a username :) – Michahell Oct 20 '22 at 09:19
105
`latin1_swedish_ci` is a collation for a single-byte character set (`latin1`), unlike `utf8_general_ci`, whose character set is multi-byte.

Compared to `latin1_general_ci`, it supports a variety of extra characters used in European languages. So it is the best choice if you don't know what language you will be using and you are constrained to single-byte character sets.

Giacomo1968

Ariel
- I like this answer because it tries to objectively justify the choice of latin swedish. However, the accepted answer seems a more plausible explanation, from a social perspective, for why swedish was chosen in particular. – Alan Jul 21 '11 at 19:30
- It's certainly possible that this was the author's reasoning, and just a coincidence that he's Swedish. It seems reasonable that a Swede would want (and know) to support additional European characters. – Matt Jan 28 '14 at 20:11
- -1 The accepted answer could be just an opinion, but it is 100 times more reasonable than this answer. Also, you can see that "the bloke who wrote it" also named MariaDB after his daughter and MaxDB after his son. – kommradHomer Feb 26 '14 at 10:35
- "latin1_general_ci it has support for a variety of extra characters used in European languages" - Just to make this clear, `utf8_general_ci`, unlike utf8_unicode, does have wide support for characters specific to European languages. I don't see an advantage over `latin1_swedish_ci`. Or am I wrong? – MEM Jul 01 '15 at 11:52
- For example, CHAR(2) latin1 uses 2 bytes, while CHAR(2) utf8mb4 (which is full utf8) uses 8 bytes. I use latin1 to store two-letter country codes because there will never be non-European characters. – the_nuts Jan 06 '17 at 21:19
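
The arithmetic in the last comment can be sketched as follows. This is a rough illustration, not MySQL's storage logic: the table `MAX_BYTES_PER_CHAR` and the function `worst_case_bytes` are hypothetical names, and the sketch assumes a fixed-width column reserves the character set's maximum bytes per character. Whether a given storage engine actually reserves the worst case depends on the engine and row format.

```python
# Assumed maximum bytes per character for each character set.
# The dictionary and function names are hypothetical, for illustration only.
MAX_BYTES_PER_CHAR = {"latin1": 1, "utf8mb3": 3, "utf8mb4": 4}


def worst_case_bytes(n_chars: int, charset: str) -> int:
    """Worst-case byte size of a fixed-width column of n_chars characters."""
    return n_chars * MAX_BYTES_PER_CHAR[charset]


latin1_size = worst_case_bytes(2, "latin1")
utf8mb4_size = worst_case_bytes(2, "utf8mb4")
print(latin1_size, utf8mb4_size)

# The actual encoded length of an ASCII-only value is the same in both encodings;
# the difference above comes from the reserved worst case, not from the data.
code = "SE"
encoded_sizes = (len(code.encode("latin-1")), len(code.encode("utf-8")))
print(encoded_sizes)
```

Under these assumptions the 2-versus-8-byte figures come out as stated, while the encoded data itself is 2 bytes either way.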