0

I wanted to make a small program to help me with my language studies, and I would usually just store a bunch of dictionaries as files by serializing the data.

I don't really know exactly when it is preferable to use a database as compared to setting up a bunch of files.

edit 1:

not sure if i'm understanding this correctly but are you suggesting store each separate translation as a new row in each extra table? I guess that is sufficient to convert array data into relational data, but isn't this a colossal waste of space?

How is this better than just having many Couldn't I just have a dictionary pointing to separate files for each property?

for example, if words have related words and related sentence as properties, then couldn't i use a dictionary of words -> file name containing all related words and a dictionary of words -> file name of file containing all related sentences, and then just serialize those files and load them into the program?

i'm still so hazy on what i'm trading and what i'm gaining by using the database. better organization at the cost of space? Grouping all my dictionaries into one system?

James Joshua Street
  • 3,259
  • 10
  • 42
  • 80
  • Re "each row will have an unknown varying number of items" - this is usually solved by using additional tables, unless all related items are of unknown type. For dictionary I think those items are all words? And you can store some helpful info into intermediate table too; database approach seems OK to me. – Arvo Oct 25 '16 at 06:31
  • I'm hazy on exactly what you mean. If i have a table of words to related words, and every word has a random number of related words, then every time i insert a new word, if there are more related words on this word than the previously added words, will the database just add a column? i want to use a database, but i'm scared i'm going to just find it too limiting. – James Joshua Street Oct 25 '16 at 06:41
  • No, you don't add a column, you add a record into translations table. See answer - too long for comment :) – Arvo Oct 25 '16 at 06:52
  • thank you all for the help. i'm not adverse to reading even for several days. the trick is knowing what to read – James Joshua Street Oct 25 '16 at 06:59
  • [You need to read an introduction to relational databases & other databases.](http://stackoverflow.com/a/28799902/3404097) You need to find out what a "relation" is. So far, you have is_related(a_word, another_word). (Whatever you mean by "related".) [Does it help to see an SQL program to generate parody text?](https://www.simple-talk.com/sql/t-sql-programming/the-parodist-a-sql-server-application/) PS [Any data type can be used for an attribute. Arrays are just typically *bad design*.](http://stackoverflow.com/a/24038895/3404097) – philipxy Oct 25 '16 at 07:06

1 Answers1

1

Consider such structure (no primary keys, indexes and other recommended db stuff here, translation table actually should contain words table identifiers, not words itself and so on):

create table words(
    language char(3),
    word varchar(127),
    ... -- other attributes, like noun/verb etc
)

create table translations(
    srcword varchar(127), -- this points to word to be translated
    dstword varchar(127), -- this points to translation
    ... -- other attributes, like archaic, common etc etc
)

Then you just add new entry into translations table for every pair of words. To get translations for language pair, you can use query like

select src.word, dst.word as translation
from words src
    inner join translations tr on tr.srcword = src.word
    inner join words dst on dst.word = tr.dstword
where
    src.word = 'word_to_be_translated'
    and src.language = 'ENG'
    and dst.language = 'GER'

Probably you need to have table for synonyms and some other tables too.

Arvo
  • 10,349
  • 1
  • 31
  • 34