Anagram Algorithm using a hashtable and/or tries

Question

I have been searching the internet for awhile now for steps to find all the anagrams of a string (word) (i.e. Team produces the word tame) using a hashtable and a trie. All I have found here on SO is to verify 2 words are anagrams. I would like to take it a step further and find an algorithm in english so that I can program it in Java.

For example,

Loop through all the characters.
For each unique character insert into the hashtable.
and so forth.

I don't want a complete program. Yes, I am practicing for an interview. If this question comes up then I will know it and know how to explain it not just memorize it.

[String permutations](http://stackoverflow.com/questions/4240080/generating-all-permutations-of-a-given-string) — atomman, Oct 25 '13 at 22:39
@user2864740 sorry. I wasn't clear. I just edited my example. The word Team produces a word called Tame. — Allen, Oct 25 '13 at 22:43
@atomman - that was a very good link. However, recursion has such a high overhead according to some interviewers that some are looking for an implementation using Hash tables or Tries. — Allen, Oct 25 '13 at 22:53
@JigarJoshi, it took us *forever* to get rid of `interview-questions`, please do not reintroduce it. — Charles, Oct 25 '13 at 23:43

necromancer · Answer 1 · 2013-10-25T23:31:05.560

the most succinct answer due to some guy quoted in the "programming pearls" book is (paraphrasing):

"sort it this way (waves hand horizontally left to right), and then that way (waves hand vertically top to bottom)"

this means, starting from a one-column table (word), create a two column table: (sorted_word, word), then sort it on the first column.

now to find anagrams of a word, first compute sorted word and do a binary search for its first occurrence in the first column of the table, and read off the second column values while the first column is the same.

input (does not need to be sorted):

mate
tame
mote
team
tome

sorted "this way" (horizontally):

aemt, mate
aemt, tame
emot, mote
aemt, team
emot, tome

sorted "that way" (vertically):

aemt, mate
aemt, tame
aemt, team
emot, mote
emot, tome

lookup "team" -> "aemt"

aemt, mate
aemt, tame
aemt, team

As far as hashtables/tries they only come into the picture if you want a slightly speedier lookup. Using hash tables you can partition the 2-column vertically sorted table into k-partitions based on the hash of the first column. this will give you a constant factor speedup because you have to do a binary search only within one partition. tries are a different way of optimizing by helping you avoid doing too many string comparisons, you hang off the index of the first row for the appropriate section of the table for each terminal in the trie.

thanks for the explanation but I am not understanding how you got from the Input section to the sorted "this way" section? How did you algorithmically find the second column? My question is to find all the valid words that i can find using the letters in the word: Team. — Allen, Oct 26 '13 at 00:17
@Allen `elephant -> aeehlnpt`, `xerox -> eorxx`, `thanks -> ahknst`, `for -> for`, `the` -> `eht`, `explanation -> aaeilnnoptx` .. get it? — necromancer, Oct 26 '13 at 00:23

score 4 · Answer 2 · answered Oct 27 '13 at 07:49

Hash tables are not the best solution, so I doubt you would be required to use them.

The simplest approach to finding anagram pairs (that I know of) is as follows:

Map characters as follows:

a -> 2 b -> 3 c -> 5 d -> 7

and so on, such that letters a..z are mapped to the first 26 primes.

Multiply the character values for each character in the word, lets call it the "anagram number". Its pretty easy to see TEAM and TAME will produce the same number. Indeed the anagram values of two different words will be the same if and only if they are anagrams.

Thus the problem of finding anagrams between the two lists reduces to finding anagram values that appear on both lists. This easily done by sorting each list by anagram number and stepping through to find common values, in nlog(n) times.

jmj · Answer 3 · 2013-10-25T23:11:17.963

0

String to char[]
sort it char[]
generate String from sorted char[]
use it as key to HashMap<String, List<String>>
insert current original String to list of values associated

for example for

car, acr, rca, abc it would have

acr: car, acr, rca
abc: abc

edited Oct 25 '13 at 23:11

answered Oct 25 '13 at 22:40

jmj

237,923
42
401
438

This what I have been searching for but I am unclear on your steps. So for example, using the word Team and your steps: 1. 'T' 'E' 'A' 'M' 2. 'A' 'E' 'M' 'T' 3. AEMT 4. There is only 1 item in the Map, AEMT – Allen Oct 25 '13 at 22:48
Sorry I pressed entered before I finished typing. So for example, using the word Team and your steps: 1) 'T' 'E' 'A' 'M' 2) 'A' 'E' 'M' 'T' 3) "AEMT" 4. There is only 1 item in the Map, AEMT Bullet 4: what is in your List of strings? Based on your steps, it seems that the Map only contain 1 item with a key of "AEMT" – Allen Oct 25 '13 at 23:00
based on my step there would be `{AEMT: {TEAM,AEMT}}` – jmj Oct 25 '13 at 23:02

Anagram Algorithm using a hashtable and/or tries

3 Answers3

Linked