Lets say I create a table in postgres to store language translations. Lets say I have a table like EVENT that has multiple columns which need translation to other languages. Rather than adding new columns for each language in EVENT I would just add new rows in LANGUAGE with the same language_id.
EVENT:
id | EVENT_NAME (fk to LANGUAGE.short) | EVENT_DESCRIPTION (fk to LANGUAGE.long) |
---|---|---|
0 | 1 | 2 |
LANGUAGE:
language_id | language_type | short (char varying 200) | med (char varying 50) | long (char varying 2000) |
---|---|---|---|---|
1 | english | game of soccer | null | null |
1 | spanish | partido de footbol | null | null |
2 | english | null | null | A really great game of soccer |
2 | spanish | null | null | Un gran partido de footbol |
If I want the language specific version I would create a parameterized statement and pass in the language like this:
select event.id, name.short, desc.long
from event e, language name, language desc
where e.id = 0
and e.event_name = name.language_id
and name.language_type = ?
and e.event_desc = desc.language_id
and desc.language_type = ?
My first thought was to have just a single column for the translated text but I would have to make it big enough to hold any translation. Setting to 2000 when many records will only be 50 characters seemed wrong. Hence I thought maybe to add different columns with different sizes and just use the appropriate size for the data Im storing (event name can be restricted to 50 characters on the front end and desc can be restricted to 2000 characters).
In the language table only one of the 3 columns (short,med,long) will be set per row. This is just my initial thought but trying to understand if this is a bad approach. Does the disk still reserve 2250 characters if I just set the short value? I read a while back that if you do this sort of thing in oracle it has to reserve the space for all columns in the disk block otherwise if you update the record it would have to do it dynamically which could be slow. Is Postgres the same?
It looks like you can specify a character varying type without a precision. Would it be more efficient (space wise) to just define a single column not specify the size or just a single column and specify the size as 2000?