
I want to design a Chinese input method (Chinese characters can be typed into the computer according to how they are pronounced**) so that the user can retrieve the word he/she wants. Should I store the data in a relational database such as MySQL, or should I consider something else?

Since I could not find information relevant to my question, I looked into how English dictionaries are built for search, but the nearest answers I found were one on the best data structure for dictionary implementation and another discussing where a relational database shouldn't be used. My current thought is that, since I only have one huge table of data, it seems like I should consider other database management systems. Or are there other suggestions and methods?

Many Thanks!

**More on Chinese input methods, if that helps describe my question: in Chinese, a character can be typed either by its pronunciation or by its formation (roughly, how the character is "composed"). Here I would like to focus on the former, where we use pronunciation. A modified example: by typing xi-an-g-3, these four elements together form a word.

tcheng
  • What data would you be storing in the database? pinyin, character, definition? You will be searching on the pinyin? – PressingOnAlways Jul 23 '15 at 16:40
  • I am using Chuyin, which makes it more complicated. I used pinyin here just to explain the situation. So I would be searching on the consonant first to narrow down the options, then the user would input the vowel and the search narrows further, and so on. – tcheng Jul 23 '15 at 16:47
  • How would you store the data and how would you be looking it up? Are you always searching by consonant, then narrowing it down? – PressingOnAlways Jul 23 '15 at 16:49
  • What I am trying to ask is how I should store this data: should I use a relational database or something else? Also, the user can start from either a vowel or a consonant. What's more, many elements in Chuyin can act as either a vowel or a consonant. I just think that narrowing down as the user types might make the search more efficient? – tcheng Jul 23 '15 at 16:51
  • If you structure your database correctly, I think this could work out pretty efficiently using an RDBMS. You should consider separating your consonants and vowels and running outer joins to narrow down the search. – PressingOnAlways Jul 23 '15 at 17:06

1 Answer


Yes - you certainly can use a relational database for creating a dictionary. Languages typically have under 100,000 words. With the proper indexes, queries to the database should be very quick.
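
As a rough illustration of what "proper indexes" could look like, here is a minimal sketch using Python's built-in sqlite3 module. The table name `entry`, its columns, the helper `candidates`, and the pinyin-style keys such as `xiang3` are all made up for the example; a Chuyin-based method would need its own key encoding. It assumes one row per (reading, word) pair and narrows the candidates as the user types more of the reading.

```python
import sqlite3

# Hypothetical schema: one row per (reading, word) pair.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entry (reading TEXT NOT NULL, word TEXT NOT NULL)")
# The index lets lookups by reading, or by a prefix of it, avoid a full table scan.
conn.execute("CREATE INDEX idx_entry_reading ON entry (reading, word)")
conn.executemany(
    "INSERT INTO entry (reading, word) VALUES (?, ?)",
    [
        ("xi1", "西"),
        ("xian1", "先"),
        ("xiang1", "香"),
        ("xiang3", "想"),
        ("xiang3", "響"),
    ],
)


def candidates(prefix):
    """Return (reading, word) rows whose reading starts with the typed prefix."""
    # A range comparison is used instead of LIKE so the index applies
    # regardless of collation settings. "\uffff" sorts after any letter or
    # digit used in these illustrative keys.
    rows = conn.execute(
        "SELECT reading, word FROM entry "
        "WHERE reading >= ? AND reading < ? "
        "ORDER BY reading, word",
        (prefix, prefix + "\uffff"),
    )
    return rows.fetchall()
```

Calling `candidates("xi")` returns every entry above, while `candidates("xiang3")` returns only the two words with that full reading, so the list shrinks as more elements are typed.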

You have to understand that "big data" these days means millions of records or more, and that almost any database system will be able to handle a data set this small.

The questions you need to be asking are what the load on the server will be (how many lookups will be happening at once?) and whether you need to optimize it or add a cache.

I would always advise starting with the data in normalized form in the database and going from there with caching and optimization.
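
If caching does turn out to be needed, one low-effort option is an in-process cache in front of the query. The sketch below is only an assumption about how that might look: the `entry` table and the `lookup` function are hypothetical, and it presumes the dictionary data rarely changes, so cached results do not go stale.

```python
import sqlite3
from functools import lru_cache

# Hypothetical table, kept deliberately small for the example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entry (reading TEXT NOT NULL, word TEXT NOT NULL)")
conn.execute("CREATE INDEX idx_entry_reading ON entry (reading)")
conn.executemany(
    "INSERT INTO entry (reading, word) VALUES (?, ?)",
    [("xiang3", "想"), ("xiang1", "香")],
)


@lru_cache(maxsize=4096)
def lookup(reading):
    """Return the words for an exact reading; repeated calls skip the database."""
    rows = conn.execute(
        "SELECT word FROM entry WHERE reading = ? ORDER BY word", (reading,)
    )
    # A tuple is returned so callers cannot mutate the cached result.
    return tuple(word for (word,) in rows)
```

`lookup.cache_info()` reports hits and misses, which gives a quick way to check whether the cache is actually earning its keep before investing in anything more elaborate.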

PressingOnAlways