1

I have an assignment that I am working on, and I can't get a hold of the professor to get clarity on something. The idea is that we are writing an anagram solver, using a given set of words, that we store in 3 different dictionary classes: Linear, Binary, and Hash.

So we read in the words from a textfile, and for the first 2 dictionary objects(linear and binary), we store the words as an ArrayList...easy enough.

But for the HashDictionary, he want's us to store the words in a HashTable. I'm just not sure what the values are going to be for the HashTable, or why you would do that. The instructions say we store the words in a Hashtable for quick retrieval, but I just don't get what the point of that is. Makes sense to store words in an arraylist, but I'm just not sure of how key/value pairing helps with a dictionary.

Maybe i'm not giving enough details, but I figured maybe someone would have seen something like this and its obvious to them.

Each of our classes has a contains method, that returns a boolean representing whether or not a word passed in is in the dictionary, so the linear does a linear search of the arraylist, the binary does a binary search of the arraylist, and I'm not sure about the hash....

Tim
  • 35,413
  • 11
  • 95
  • 121
user1154644
  • 4,491
  • 16
  • 59
  • 102
  • And you're responsible for implementing three objects? A Linear, Binary and Hash? What is "Binary" - Binary tree? Binary search? Are you storing the words or the anagrams? – Richard JP Le Guen Sep 07 '12 at 00:35
  • Sorry, a LinearDictionary, a BinaryDictionary, and a HashDictionary. They each extend from the abstract class Dictionary, that has an abstract method called contains, which each of the 3 classes has to implement. So each class loads the dictionary by reading in the text file, and storing the words in an arraylist, in the case of the linear and binary dictionaries, but we are told to store the words in a HashTable for the HashDictionary, I just don't get why. – user1154644 Sep 07 '12 at 00:37
  • I have never heard of a "LinearDictionary" nor of a "BinaryDictionary". – Richard JP Le Guen Sep 07 '12 at 00:39
  • @Richard If I had to guess, a Linear Dictionary is basically a list that implements a map style interface, and a BinaryDictionary is basically a treemap. – corsiKa Sep 07 '12 at 00:41
  • The general title implies that this addresses a general problem, but it doesn't. So this question attracts attention and then doesn't deliver. Hence this is not a good question. (Making the title more specific would fix it in my opinion.) – oberlies Oct 30 '12 at 14:17

3 Answers3

7

The difference is speed. Both methods work, but the hash table is fast.

When you use an ArrayList, or any sort of List, to find an element, you must inspect each list item, one by one, until you find the desired word. If the word isn't there, you've looped through the entire list.

When you use a HashTable, you perform some "magic" on the word you are looking up known as calculating the word's hash. Using that hash value, instead of looping through a list of values, you can immediately deduce where to find your word - or, if your word doesn't exist in the hash, that your word isn't there.

I've oversimplified here, but that's the general idea. You can find another question here with a variety of explanations on how a hash table works.

Here is a small code snippet utilizing a HashMap.

// We will map our words to their definitions; word is the key, definition is the value
Map<String, String> dictionary = new HashMap<String, String>();
map.put("hello","A common salutation");
map.put("chicken","A delightful vessel for protein");

// Later ...
map.get("chicken"); // Returns "A delightful vessel for protein"; 

The problem you describe asks that you use a HashMap as the basis for a dictionary that fulfills three requirements:

  • Adding a word to the dictionary
  • Removing a word from the dictionary
  • Checking if a word is in the dictionary

It seems counter-intuitive to use a map, which stores a key and a value, since all you really want to is store just a key (or just a value). However, as I described above, a HashMap makes it extremely quick to find the value associated with a key. Similarly, it makes it extremely quick to see if the HashMap knows about a key at all. We can leverage this quality by storing each of the dictionary words as a key in the HashMap, and associating it with a garbage value (since we don't care about it), such as null.

You can see how to fulfill the three requirements, as follows.

Map<String, Object> map = new HashMap<String, Object>();
// Add a word
map.put('word', null);

// Remove a word
map.remove('word');

// Check for the presence of a word
map.containsKey('word');

I don't want to overload you with information, but the requirements we have here align with a data structure known as a Set. In Java, a commonly used Set is the HashSet, which is almost exactly what you are implementing with this bit of your homework assignment. (In fact, if this weren't a homework assignment explicitly instructing you to use a HashMap, I'd recommend you instead use a HashSet.)

Community
  • 1
  • 1
cheeken
  • 33,663
  • 4
  • 35
  • 42
  • When I heard HashTable, I'm thinking HashMap. So the key for an entry is the actual word, and the value is what, its hash? – user1154644 Sep 07 '12 at 00:41
  • I am speaking in very broad terms in my answer. My description applies to both `Hashtable` and `HashMap`. These classes are written for convenience; they are so convenient, in fact, that you don't even need to think about the fact that they are using hashes at all: you simply tell them "I want to store X and associate it with key Y!" and it complies. The hashing happens behind the scenes. – cheeken Sep 07 '12 at 00:44
  • right, but in this case, I know the key is the actual word, I just don't know what the value is to make the pair. From what I see, I'm guessing its the hash for that word, but how do I come up with the hash? – user1154644 Sep 07 '12 at 00:47
  • Oh, I see - does your dictionary not actually contain definitions? It just is a big list of words, and you only want to know whether or not a word is in the list? – cheeken Sep 07 '12 at 00:50
  • cheeken, I completely understand your explanation of a map, but my question is maybe 2 parts. I have a list of words that I need to store in the Hashtable, and some value for each of those words. So when I call mytable.put("wordA","??"). I'm guessing the ?? is going to be a hash of "wordA"? If that's the case, how do I come up with the hash? Someone suggested adding the word at a certain position in the hashtable, but I don't see any methods that allow you to control the positioning, just put() methods. – user1154644 Sep 07 '12 at 00:52
  • cheeken, exactly! And that's why I'm confused as to why the hashtable is used here. – user1154644 Sep 07 '12 at 00:52
  • At this point, I'm not sure what the 'value' is for the keys(words) – user1154644 Sep 07 '12 at 01:05
  • I understand your confusion better now and have amended my answer. – cheeken Sep 07 '12 at 01:07
  • thanks for the explanation. I was initally thinking the same thing, using the word as the key, and null as the value, but that seemed pointless to me, so I figured I would ask. We are adding timing to the contains methods, so I'm guessing there is going to be a drastic time difference between linear search, binary search, and searching the hashtable ?? – user1154644 Sep 07 '12 at 01:10
  • If you fill your dictionary with enough words (millions), the difference will be obvious. – cheeken Sep 07 '12 at 01:11
2

Arrays are hard to find stuff in. If I gave you array[0] = "cat"; array[1] = "dog"; array[2] = "pikachu";, you'd have to check each element just to know if jigglypuff is a word. If I gave you hash["cat"] = 1; hash["dog"] = 1; hash["pikachu"] = 1;", instant to do this in, you just look it up directly. The value 1 doesn't matter in this particular case although you can put useful information there, such as how many times youv'e looked up a word, or maybe 1 will mean real word and 2 will mean name of a Pokemon, or for a real dictionary it could contain a sentence-long definition. Less relevant.

djechlin
  • 59,258
  • 35
  • 162
  • 290
1

It sounds like you don't really understand hash tables then. Even Wikipedia has a good explanation of this data structure.

Your hash table is just going to be a large array of strings (initially all empty). You compute a hash value using the characters in your word, and then insert the word at that position in the table.

There are issues when the hash value for two words is the same. And there are a few solutions. One is to store a list at each array position and just shove the word onto that list. Another is to step through the table by a known amount until you find a free position. Another is to compute a secondary hash using a different algorithm.

The point of this is that hash lookup is fast. It's very quick to compute a hash value, and then all you have to do is check that the word at that array position exists (and matches the search word). You follow the same rules for hash value collisions (in this case, mismatches) that you used for the insertion.

You want your table size to be a prime number that is larger than the number of elements you intend to store. You also need a hash function that diverges quickly so that your data is more likely to be dispersed widely through your hash table (rather than being clustered heavily in one region).

Hope this is a help and points you in the right direction.

paddy
  • 60,864
  • 6
  • 61
  • 103
  • Ok. so when I add a word to the table, is it going to be something like myhashtable.put("myword", getHash("myword"));?? – user1154644 Sep 07 '12 at 00:45
  • No, the hash table insert should not require knowledge of hash table internals (ie calculation of the hash), just like an ordered list should not ask you to search for the correct location to insert. You should simply tell it to insert the word and internally it will do all the hashy stuff. Same with searching: you just ask if the hash table contains a word, without needing to know anything else. – paddy Sep 07 '12 at 02:07