And here is my novel solution.
Jon Bentley’s book Programming Pearls contains a problem about finding anagrams of words.
The statement:
Given a dictionary of english words, find all sets of anagrams. For
instance, “pots”, “stop” and “tops” are all anagrams of one another
because each can be formed by permuting the letters of the others.
I thought a bit and it came to me that the solution would be to obtain the signature of the word you’re searching and comparing it with all the words in the dictionary. All anagrams of a word should have the same signature. But how to achieve this? My idea was to use the Fundamental Theorem of Arithmetic:
The fundamental theorem of arithmetic states that
every positive integer (except the number 1) can be represented in
exactly one way apart from rearrangement as a product of one or more
primes
So the idea is to use an array of the first 26 prime numbers. Then for each letter in the word we get the corresponding prime number A = 2, B = 3, C = 5, D = 7 … and then we calculate the product of our input word. Next we do this for each word in the dictionary and if a word matches our input word, then we add it to the resulting list.
The performance is more or less acceptable. For a dictionary of 479828 words, it takes 160 ms to get all anagrams. This is roughly 0.0003 ms / word, or 0.3 microsecond / word. Algorithm’s complexity seems to be O(mn) or ~O(m) where m is the size of the dictionary and n is the length of the input word.
Here’s the code:
package com.vvirlan;
import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Date;
import java.util.List;
import java.util.Scanner;
public class Words {
private int[] PRIMES = new int[] { 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73,
79, 83, 89, 97, 101, 103, 107, 109, 113 };
public static void main(String[] args) {
Scanner s = new Scanner(System.in);
String word = "hello";
System.out.println("Please type a word:");
if (s.hasNext()) {
word = s.next();
}
Words w = new Words();
w.start(word);
}
private void start(String word) {
measureTime();
char[] letters = word.toUpperCase().toCharArray();
long searchProduct = calculateProduct(letters);
System.out.println(searchProduct);
try {
findByProduct(searchProduct);
} catch (Exception e) {
e.printStackTrace();
}
measureTime();
System.out.println(matchingWords);
System.out.println("Total time: " + time);
}
private List<String> matchingWords = new ArrayList<>();
private void findByProduct(long searchProduct) throws IOException {
File f = new File("/usr/share/dict/words");
FileReader fr = new FileReader(f);
BufferedReader br = new BufferedReader(fr);
String line = null;
while ((line = br.readLine()) != null) {
char[] letters = line.toUpperCase().toCharArray();
long p = calculateProduct(letters);
if (p == -1) {
continue;
}
if (p == searchProduct) {
matchingWords.add(line);
}
}
br.close();
}
private long calculateProduct(char[] letters) {
long result = 1L;
for (char c : letters) {
if (c < 65) {
return -1;
}
int pos = c - 65;
result *= PRIMES[pos];
}
return result;
}
private long time = 0L;
private void measureTime() {
long t = new Date().getTime();
if (time == 0L) {
time = t;
} else {
time = t - time;
}
}
}