0

I need to optimise a search engine. What is does is to find all possible 2 to n-letter words, by making all possible combinations like this

(for 2 letter words) w = any letter can be on 1-st letter spot + any letter left (but the 1-st) for the second spot ; checkIfIsWord(w)

(for n letter words) n1 + n2 + n3 + n4 + ... n ; checkIfIsWord(w)

That is working, but is quite time consuming. Please help me with idea how to make it faster!

Here is the code:

String w = "";
    for (int i = 0; i < letters.length; i++)
    {
        for (int j = 0; j < letters.length; j++)
        {
            if (i == j) continue;
            w = "" + (char) letters[i] + (char) letters[j];
            checkIfIsWord(w);
            for (int k = 0; k < letters.length; k++)
            {
                if (i == k || j == k) continue;
                w = "" + (char) letters[i] + (char) letters[j] + (char) letters[k];
                checkIfIsWord(w);
                for (int m = 0; m < letters.length; m++)
                {
                    if (i == m || j == m || j == m || k == m) continue;
                    w = "" + (char) letters[i] + (char) letters[j] + (char) letters[k] + (char) letters[m];
                    checkIfIsWord(w);
                    ...
                }
            }
        }
    }

Method checkIfIsWord

void checkIfIsWord(String w) 
{ 
    if (w.length() > 2 
        && words.contains(w.toLowerCase()) // (1) 
        && !allWords.contains(w)) 
    { 
       allWords.add(w); 
        runOnUiThread(updateMaxWords); 
    } 
}
Thrakbad
  • 2,728
  • 3
  • 23
  • 31
  • what does your method `checkIfIsWord` do? – Thrakbad Mar 07 '13 at 14:37
  • It seems you really need to make a recursive function. http://danzig.jct.ac.il/java_class/recursion.html. Also `if (i == j) continue;` means you disallow combination of "aa", "bb", "cc" ? – Timmetje Mar 07 '13 at 14:39
  • Recursion won't really speed up the process, it would just make the code a lot easier to maintain and read – Thrakbad Mar 07 '13 at 14:42
  • method checkIfIsWord just compare the w string to pre-defined string – user2144587 Mar 07 '13 at 14:45
  • If he wouldn't make it recursive he implements the maximum possible letters. Each letter 1 loop. If the biggest word possible is 40 letters, he would need to type 40 loops, doing basically the same. A recursive function would just continue until the word ends. So in my opinion recursion is a must. – Timmetje Mar 07 '13 at 14:45
  • "aa", "bb", "cc" are not my target I should check just "ab", "ac", "ad", ... "abs" – user2144587 Mar 07 '13 at 14:47
  • http://stackoverflow.com/questions/3695019/algorithm-to-generate-all-letter-combinations-using-recursion http://stackoverflow.com/questions/9521729/get-all-possible-combinations-of-characters-in-an-array – Timmetje Mar 07 '13 at 14:48

1 Answers1

1

If you have a list of predefined Strings, as I gathered from your comment, you should just check it the other way around. Iterate over all words in the list and store those that match your criteria. This would only have linear complexity.

In your method checkIfIsWord:

void checkIfIsWord(String w) 
{ 
    if (w.length() > 2 
        && words.contains(w.toLowerCase()) // (1) 
        && !allWords.contains(w)) 
    { 
        allWords.add(w); 
        runOnUiThread(updateMaxWords); 
    } 
}

The line marked with (1) checks your current word w agains all entries currently in words. That's what .contains() does internally. This means in your result-list allWords you can only have a sub-set of the values stored in words. The faster implementation would definitely be the following:

for(String word : words)
{
    if(word.length() > 2
       && word.length() < n)
    {
        allWords.add(word);
        runOnUiThread(updateMaxWords);
    }
}    

Now if you say that a String array with 16k entries would consume a lot of memory, this is correct. But you have the same problem with your original solution, because the line marked with (1) will only allow words that are already in your list words to be part of the resulting set. If you want to tackle that problem, I suggest moving the words to a database instead of keeping them all in the RAM.

Thrakbad
  • 2,728
  • 3
  • 23
  • 31
  • That will not make it faster. For 7 letter words for example there are about 16000 words With my method it will take 7 * 6 * 5 * 4 * 3 * 2 = 5040 checks – user2144587 Mar 07 '13 at 14:54
  • @user2144587 currently it will take 5040 checks against each of your 16000 predefined words, so about 80 million comparisons. If you just check the pre-defined words you will only have to check each once – Thrakbad Mar 07 '13 at 14:59
  • void checkIfIsWord(String w) { if (w.length() > 2 && words.contains(w.toLowerCase()) && !allWords.contains(w)) { allWords.add(w); runOnUiThread(updateMaxWords); } } If I split the String "words" into array with 16000 members this will take all the phone's memory – user2144587 Mar 07 '13 at 15:09
  • I updated my answer (and your question with your provided code) – Thrakbad Mar 07 '13 at 15:21
  • I am so sorry for the confusion "words" is a String, not a list or array (it is a array of chars ;) – user2144587 Mar 07 '13 at 15:23
  • ah, then I totally misunderstood your setup. And what does words actually contain? – Thrakbad Mar 07 '13 at 15:25
  • example for the String words = "abandon abase abash abashed abate ... zoo zoology zoom" – user2144587 Mar 07 '13 at 15:28
  • Okay, actually splitting the `String` wouldn't take up more memory than using the original `String`. It would probably even save memory, because you don't need to save the spaces. You should consider using a list or an array instead of the `String`, the memory consumption is not lower, just because you use one gigantic `String`. Also `ArrayList.contains()` will run A LOT faster than `String.contains()` – Thrakbad Mar 07 '13 at 15:33
  • Thank you, I will try that and I will come back with the results! Thank you very much! – user2144587 Mar 07 '13 at 15:44
  • Changing it from a string to Array did make it faster - with 16-17 % :) And the RAM was not affected, not more, but not less also. Thank you all! – user2144587 Mar 07 '13 at 16:27
  • Have you tried it the way I suggested in my answer or just replaced the string with an array in your original method? – Thrakbad Mar 07 '13 at 16:33