Which data structure would be more preferable for making a dictionary in Java ? What would be better a tree or a hash-table ?
-
2And "dictionary" means ..? In any case, consider a Prefix Trie or Heap for immutable (read: file-based) "word list dictionary"; if just looking for a Map, see the answer .. – Oct 18 '12 at 18:54
-
Take a look at http://stackoverflow.com/questions/40471/differences-between-hashmap-and-hashtable – RNJ Oct 18 '12 at 18:56
-
offline dictionary in java where you enter a word in a text field and the meaning of that word. – user1753035 Oct 18 '12 at 18:56
-
1@user1753035 I would consider using a "database" (it doesn't have to be relational) for this; and let that do the lifting. Then the file is pre-created and consumed through said "database" API. E.g. see BerkelyDB (or equivalent, has benefit of simplicity) or SQLite (has benefit of tools) etc. Of course this would also be a fun exercise in creating a disk-based Heap (or B-tree or Extendible Hash). The advantage of using a "database" or disk-based approach is to avoid having to load everything into memory; with the right disk layout it's just a few trivial seeks to look up the definition. – Oct 18 '12 at 18:58
4 Answers
A Map. Nothing else. If you want sorting, use a TreeMap. Otherwise: HashMap.
A Dictionary maps a (key-)word to it's description or translation.

- 113,398
- 19
- 180
- 268
I would use something like
Map<String,Integer> dictionary = Collections.synchronizedMap(new TreeMap<String,Integer>());
Instead of Integer as the value of a String key, you can use a Class object which probably can hold a list that contains all the positions of that word inside the document
There are methods for easy retrieval of key values from the TreeMap
. Following is the way you get an iterator out of TreeMap.
Set<Entry<String,Integer>> set = dictionary.entrySet();
Iterator<Entry<String,Integer>> entryItr = set.iterator();
Entry<String,Integer> entry = null;
while(entryItr.hasnext()){
entry = entryItr.next();
// Do whatever you want.
}

- 1,244
- 5
- 19
- 48
I would use a Trie, especially for memory efficiency and also prefix lookups. I have an implementation that implements the map interface under APL on github: https://github.com/Blazebit/blaze-utils/tree/c7b1fa586590d121d9f44c1686cb58de0349eb0b/blaze-common-utils
Check it out and maybe it fits your needs better than a simple map.

- 15,141
- 2
- 32
- 58
According to the lecture in Introduction to Algorithms of MIT, I would say it is better to go with hash-tables. Because you can do operations in O(1) instead of O(logn)

- 1,729
- 2
- 20
- 33