2

Which data structure would be more preferable for making a dictionary in Java ? What would be better a tree or a hash-table ?

user1753035
  • 57
  • 2
  • 9
  • 2
    And "dictionary" means ..? In any case, consider a Prefix Trie or Heap for immutable (read: file-based) "word list dictionary"; if just looking for a Map, see the answer .. –  Oct 18 '12 at 18:54
  • Take a look at http://stackoverflow.com/questions/40471/differences-between-hashmap-and-hashtable – RNJ Oct 18 '12 at 18:56
  • offline dictionary in java where you enter a word in a text field and the meaning of that word. – user1753035 Oct 18 '12 at 18:56
  • 1
    @user1753035 I would consider using a "database" (it doesn't have to be relational) for this; and let that do the lifting. Then the file is pre-created and consumed through said "database" API. E.g. see BerkelyDB (or equivalent, has benefit of simplicity) or SQLite (has benefit of tools) etc. Of course this would also be a fun exercise in creating a disk-based Heap (or B-tree or Extendible Hash). The advantage of using a "database" or disk-based approach is to avoid having to load everything into memory; with the right disk layout it's just a few trivial seeks to look up the definition. –  Oct 18 '12 at 18:58

4 Answers4

5

A Map. Nothing else. If you want sorting, use a TreeMap. Otherwise: HashMap.

A Dictionary maps a (key-)word to it's description or translation.

Andreas Dolk
  • 113,398
  • 19
  • 180
  • 268
1

I would use something like

Map<String,Integer> dictionary = Collections.synchronizedMap(new TreeMap<String,Integer>());

Instead of Integer as the value of a String key, you can use a Class object which probably can hold a list that contains all the positions of that word inside the document There are methods for easy retrieval of key values from the TreeMap. Following is the way you get an iterator out of TreeMap.

Set<Entry<String,Integer>> set = dictionary.entrySet();

Iterator<Entry<String,Integer>> entryItr = set.iterator();

Entry<String,Integer> entry = null;

while(entryItr.hasnext()){
entry = entryItr.next();
// Do whatever you want.
}
Aayush
  • 1,244
  • 5
  • 19
  • 48
0

I would use a Trie, especially for memory efficiency and also prefix lookups. I have an implementation that implements the map interface under APL on github: https://github.com/Blazebit/blaze-utils/tree/c7b1fa586590d121d9f44c1686cb58de0349eb0b/blaze-common-utils

Check it out and maybe it fits your needs better than a simple map.

Christian Beikov
  • 15,141
  • 2
  • 32
  • 58
0

According to the lecture in Introduction to Algorithms of MIT, I would say it is better to go with hash-tables. Because you can do operations in O(1) instead of O(logn)

https://www.youtube.com/watch?v=0M_kIqhwbFo

A. Mashreghi
  • 1,729
  • 2
  • 20
  • 33