52

I need to create a large scale DB Model for a web application that will be multilingual.

One doubt that I've every time I think on how to do it is how I can resolve having multiple translations for a field. A case example.

The table for language levels, that administrators can edit from the backend, can have multiple items like: basic, advance, fluent, mattern... In the near future probably it will be one more type. The admin goes to the backend and add a new level, it will sort it in the right position.. but how I handle all the translations for the final users?

Another problem with internationalization of a database is that probably for user studies can differ from USA to UK to DE... in every country they will have their levels (that probably it will be equivalent to another but finally, different). And what about billing?

How you model this in a big scale?

M4N
  • 94,805
  • 45
  • 217
  • 260
udexter
  • 2,307
  • 10
  • 41
  • 58
  • 6
    On a side note, make sure you create your tables with UTF-8 encoding. – Eric J. May 21 '12 at 22:31
  • What technology are you using? Most of the existing frameworks manage i18n quite good. – sp00m May 23 '12 at 23:01
  • @sp00m: I'm using PHP. There is no problem with the language of the website, the "static" ones. I'm asking for things that admins can add from the backend of the website... when adding they can't add 15 language for only 1 item. Probably talking about language/language_levels on this topic is not right too. Or are you saying that manages i18n good on databases too? Thanks! – udexter May 23 '12 at 23:09

2 Answers2

75

Here is the way I would design the database:

Data model

Visualization by DB Designer Fork

The i18n table only contains a PK, so that any table just has to reference this PK to internationalize a field. The table translation is then in charge of linking this generic ID with the correct list of translations.

locale.id_locale is a VARCHAR(5) to manage both of en and en_US ISO syntaxes.

currency.id_currency is a CHAR(3) to manage the ISO 4217 syntax.

You can find two examples: page and newsletter. Both of these admin-managed entites need to internationalize their fields, respectively title/description and subject/content.

Here is an example query:

select
  t_subject.tx_translation as subject,
  t_content.tx_translation as content

from newsletter n

-- join for subject
inner join translation t_subject
  on t_subject.id_i18n = n.i18n_subject

-- join for content
inner join translation t_content
  on t_content.id_i18n = n.i18n_content

inner join locale l

  -- condition for subject
  on l.id_locale = t_subject.id_locale

  -- condition for content
  and l.id_locale = t_content.id_locale

-- locale condition
where l.id_locale = 'en_GB'

  -- other conditions
  and n.id_newsletter = 1

Note that this is a normalized data model. If you have a huge dataset, maybe you could think about denormalizing it to optimize your queries. You can also play with indexes to improve the queries performance (in some DB, foreign keys are automatically indexed, e.g. MySQL/InnoDB).

sp00m
  • 47,968
  • 31
  • 142
  • 252
  • 1
    OK, it's very easy understandable and useful explanation. The only question I have - wouldn't that be expensive in sense of memory and server resources to use TEXT type for every localized string? – Rango Sep 11 '13 at 06:06
  • 1
    @f_martinez I guess it depends on the data you need to store. But feel free to use the type you need, if it could fit in a varchar for example. – sp00m Sep 11 '13 at 17:19
  • 10
    Don't mix **currency** and **translation**. – gavenkoa Jun 04 '14 at 08:21
  • 1
    holly crap! i love you so much right now! @f_martinez if your db is mostly read operations, with several layers of caching it probably won't be anything to worry about. – sUP Sep 25 '16 at 15:44
  • I seem to be understanding the answer but can someone please provide a query on this example structure? Like fetch the page in Chinese? – Japheth Ongeri - inkalimeva Dec 22 '16 at 09:15
  • 1
    @inkalimeva I updated my answer to provide an example, hope that helps :) – sp00m Jan 15 '17 at 23:50
  • There is a typo in the example query. Should be `where l.id_locale = 'en_GB'`. – swenedo Oct 27 '17 at 12:58
  • how would you create a new translation with this schema? :) – Paul Grei Jan 16 '23 at 16:15
43

Some previous StackOverflow questions on this topic:

Some useful external resources:

The best approach often is, for every existing table, create a new table into which text items are moved; the PK of the new table is the PK of the old table together with the language.

In your case:

  1. The table for language levels, that administrators can edit from the backend, can have multiple items like: basic, advance, fluent, mattern... In the near future probably it will be one more type. The admin goes to the backend and add a new level, it will sort it in the right position.. but how I handle all the translations for the final users?

Your existing table probably looks something like this:

+----+-------+---------+
| id | price | type    |
+----+-------+---------+
|  1 |   299 | basic   |
|  2 |   299 | advance |
|  3 |   399 | fluent  |
|  4 |     0 | mattern |
+----+-------+---------+

It then becomes two tables:

+----+-------+   +----+------+-------------+
| id | price |   | id | lang | type        |
+----+-------+   +----+------+-------------+
|  1 |   299 |   |  1 | en   | basic       |
|  2 |   299 |   |  2 | en   | advance     |
|  3 |   399 |   |  3 | en   | fluent      |
|  4 |     0 |   |  4 | en   | mattern     |
+----+-------+   |  1 | fr   | élémentaire |
                 |  2 | fr   | avance      |
                 |  3 | fr   | couramment  |
                 :    :      :             :
                 +----+------+-------------+
  1. Another problem with internationalitzation of a database is that probably for user studies can differ from USA to UK to DE... in every country they will have their levels (that probably it will be equivalent to another but finally, different). And what about billing?

All localisation can occur through a similar approach. Instead of just moving text fields to the new table, you could move any localisable fields - only those which are common to all locales will remain in the original table.

eggyal
  • 122,705
  • 18
  • 212
  • 237
  • Hi @eggyal. Is this approach still valid and the best? If so, how should i follow when i've a database whose tables can't be changed at the moment? – Arash Dec 26 '21 at 05:14