60

Possible Duplicate:
Schema for a multilanguage database

Here's an example:

[ products ]
id (INT)
name-en_us (VARCHAR)
name-es_es (VARCHAR)
name-pt_br (VARCHAR)
description-en_us (VARCHAR)
description-es_es (VARCHAR)
description-pt_br (VARCHAR)
price (DECIMAL)

The problem: every new language will need modify the table structure.

Here's another example:

[ products-en_us ]
id (INT)
name (VARCHAR)
description (VARCHAR)
price (DECIMAL)

[ products-es_es ]
id (INT)
name (VARCHAR)
description (VARCHAR)
price (DECIMAL)

The problem: every new language will need the creation of new tables and the "price" field is duplicated in every table.

Here's another example:

[ languages ]
id (INT)
name (VARCHAR)

[ products ]
id (INT)
price (DECIMAL)

[ translation ]
id (INT, PK)
model (VARCHAR) // product
field (VARCHAR) // name
language_id (INT, FK) 
text (VARCHAR)

The problem: hard?

TylerH
  • 20,799
  • 66
  • 75
  • 101
Thiago Belem
  • 7,732
  • 5
  • 43
  • 64
  • 6
    The third method is more or less correct - what's hard about it? – K Prime Feb 09 '10 at 09:45
  • 2
    The problem is, that with every solution you find, you'll always find a case, when you need to modify table - i.e. more languages, different languages, another field... – Adam Kiss Feb 09 '10 at 09:57
  • Since a user will very likely use only one language at a time, I believe separate databases for each language should be considered. This approach will take more storage space, however, it won't come with performance issues and it is relatively easy to setup. – Csaba Apr 08 '20 at 08:47

8 Answers8

47

Similar to method 3:

[languages]
id (int PK)
code (varchar)

[products]
id (int PK)
neutral_fields (mixed)

[products_t]
id (int FK)
language (int FK)
translated_fields (mixed)
PRIMARY KEY: id,language

So for each table, make another table (in my case with "_t" suffix) which holds the translated fields. When you SELECT * FROM products, simply ... LEFT JOIN products_t ON products_t.id = products.id AND products_t.language = CURRENT_LANGUAGE.

Not that hard, and keeps you free from headaches.

DanMan
  • 11,323
  • 4
  • 40
  • 61
Gipsy King
  • 1,549
  • 9
  • 15
29

Your third example is actually the way the problem is usually solved. Hard, but doable.

Remove the reference to product from the translation table and put a reference to translation where you need it (the other way around).

[ products ]
id (INT)
price (DECIMAL)
title_translation_id (INT, FK)

[ translation ]
id (INT, PK)
neutral_text (VARCHAR)
-- other properties that may be useful (date, creator etc.)

[ translation_text ]
translation_id (INT, FK)
language_id (INT, FK) 
text (VARCHAR)

As an alternative (not especially a good one) you can have one single field and keep all translations there merged together (as XML, for example).

<translation>
  <en>Supplier</en>
  <de>Lieferant</de>
  <fr>Fournisseur</fr>
</translation>
  • 12
    What if the product table contains several translated fields ? When retrieving products, you will have to do one additional join per translated field, which will result in severe performance issues. There is as well (IMO) additional complexity for insert/update/delete. The single advantage of this is the lower number of tables. I would go for the method proposed by Gipsy King or Clément : I think it's a good balance between performance, complexity, and maintenance issues. – Maxime Pacary Jan 20 '11 at 10:05
18

In order to reduce the number of JOIN's, you could keep separate the translated and non translated in 2 separate tables :

[ products ]
id (INT)
price (DECIMAL)

[ products_i18n ]
id (INT)
name (VARCHAR)
description (VARCHAR)
lang_code (CHAR(5))
Clément
  • 6,670
  • 1
  • 30
  • 22
  • @Clément - The problem here's is when products table get a new field... I'll need to change the products_i18n table too. :/ – Thiago Belem Feb 09 '10 at 09:53
  • 8
    @TiuTalk - only one of the table will get the new field, if it's a translated field, it goes into `products_i18n`, otherwise it goes in `products`. This way you don't duplicate any information. – Clément Feb 09 '10 at 10:04
  • @Clément: product.id is user as FK in products_i18n.id or you use third join table? – CoR Dec 12 '15 at 18:59
  • @CoR Yes, `products.id` could be a foregin key in the `products_i18n` table. The primary key of the `products_i18n` table would be a composite key composed of both `(product.id, products_i18n.lang_code)`. – War10ck Oct 10 '16 at 15:52
3

At my $DAYJOB we use gettext for I18N. I wrote a plugin to xgettext.pl that extracts all English text from the database tables and add them to the master messages.pot.

It works very well - translators deal with only one file when doing translation - the po file. There's no fiddling with database entries when doing translations.

holygeek
  • 15,653
  • 1
  • 40
  • 50
  • 1
    This may work if you only want to provide translations for your application. F.ex. Menu Entries, Headlines, Helptexts etc. – widdy May 07 '19 at 14:16
2

[languages] id (int PK) code (varchar)

[products]
id (int PK)
name
price
all other fields of product
id_language ( int FK )

I actually use this method, but in my case, it's not in a product point of view, for the various pages in my CMS, this work's quite well.

If you have a lot of products it might be a headache to update a single one in 5 or 6 languages... but it's a question of working the layout.

Tio
  • 556
  • 1
  • 7
  • 26
0

What about fourth solution?

[ products ]
id (INT)
language (VARCHAR 2)
name (VARCHAR)
description (VARCHAR)
price (DECIMAL)
*translation_of (INT FK)*

*Translation_of* is FK of it self. When You add default language *translation_of* is set to Null. But when you add second language *translation_of* takes primary produkt language id.

SELECT * FROM products WHERE id = 1 AND translation_of = 1

In that case we get all translations for product with id is 1.

SELECT * FROM products WHERE id = 1 AND translation_of = 1 AND language = 'pl'

We get only product in Polish translation. Without second table and JOINS.

chf
  • 107
  • 1
  • 3
  • That's an interesting approach. I like the ease of querying but it does break the assumption that ever entry in the product table is one product, so one has to keep that in mind. It also allows you to keep using the proper types (varchar, etc.) for the fields. – Rolf Feb 28 '15 at 22:31
  • 1
    I was thinking of implementing the exact same thing right now, but haven't found this solution anywhere else. I see your post is from 2011. Have you had any problems with this? Do you still think it's a good solution? Thanks. – tabacitu Sep 07 '15 at 16:33
  • @tabacitu Have you tried this? What the whole db schema look like? How is it doing on insert, update, delete...? – CoR Dec 12 '15 at 18:46
  • @CoR I have, but I'll soon switch to something more like this answer: http://stackoverflow.com/questions/2227985/whats-the-best-database-structure-to-keep-multilingual-data/2230172#2230172 – tabacitu Dec 12 '15 at 19:19
  • 1
    1) It allows you to have both translatable and non-translatable fields; – tabacitu Dec 12 '15 at 19:20
  • 1
    2) It does not tie all of your translations to a specific one (the default language); – tabacitu Dec 12 '15 at 19:21
  • 2
    An to answer your question, the inserts, updates, deletes are doing fine and snappy. The problem is all in my head - I know the DB isn't as it should be and that bugs me. – tabacitu Dec 12 '15 at 19:23
  • @tabacitu: Yup, I've seen that answer you linked and I'm inclining to use it. It seems like ok balance between performance/usage. If you have time and will, can you please explain how chf's answer should like as db schema. I can't understand what 'FK of it self means'. It's just my curiosity :) – CoR Dec 12 '15 at 19:36
  • @CoR Sure, it means "foreign key on itself", better known as self-referencing table. It's when in your table you have a column (say parent_id) with a foreign key to a column in the same table (say, id). – tabacitu Dec 13 '15 at 11:53
-1

Have many to many relationship.

You have your data table, languages table and a data_language table.

In the data_language table you have

id, data_id, language_id

I think that might work best for your.

AntonioCS
  • 8,335
  • 18
  • 63
  • 92
  • @AntonioCS - The "data" table isn't the "product" table, right? – Thiago Belem Feb 09 '10 at 09:37
  • @TiuTalk it is. This way the product table doesn't have to know which languages there are, neither does the language table. It's all on the data_language table (or in this case 'product_language table) – AntonioCS Feb 09 '10 at 10:30
-2

We use this concept for our webiste (600k views per day) and (maybe surprisingly) it works. Sure along with caching and query optimalization.

[attribute_names]
id (INT)
name (VARCHAR)

[languages_names]
id (INT)
name (VARCHAR)

[products]
id (INT)
attr_id (INT)
value (MEDIUMTEXT)
lang_id (INT)
Petr Peller
  • 8,581
  • 10
  • 49
  • 66