1

I've been trying to code an anagram finder in Java so that in the terminal after compiling all I would have to do is

  1. Type in java Anagramfind list.txt
  2. When prompted type in a word, say treasure
  3. The program prints an anagram, like austerer
  4. Another prompt comes up asking if I would like another one (yes/no)

The list.txt file has most, if not all, of the words in the English language.

Here's what I have so far...

import java.util.*;
import java.io.*;

class ProjectAnagram {
    public static void main (String[] args throws IOException) {
        //THis here declares an array of strings
        Scanner dictionary = new Scanner new (fileInputStream(args[0]));
        String[] entireArray = new String[173528]; //name of array + 173258
        System.out.println ("Put something in please");
        Scanner keyboard = new Scanner(System.in);
        System.out.println ("Inserted");

        String word = keyboard;
    }

I still need to add the rest.

I've mostly been having trouble with the use of arrays, which I referenced here:

I also am having trouble using a Stringbuffer which to check if words have the same characters or not.

The program checks if the input string and if the string in the text file have the same length first so as to rule out obvious non-anagrams. If it doesn't, then it moves on to the next word in the list, probably with i++ in some loop.

Community
  • 1
  • 1
J. Doe
  • 31
  • 6
  • Your code is extremely uncompilable. Please fix your code in your post. There are plenty syntactical errors. – Vampire Apr 30 '16 at 23:07
  • It's actually pseudocode, lol – J. Doe Apr 30 '16 at 23:19
  • If you ask a Java question, please post Java code. This "pseudo-code" is for sure not valid in any question as there are unbalanced parantheses, unbalanced curly braces and so on. Please fix the code in your question. – Vampire Apr 30 '16 at 23:34
  • 1
    If you're planning to go over every element (word in the english language) you should really consider my answer. Storing the dictionary in an array then using the other algorithms below is fairly inefficient especially if the dictionary is alphabetized. – ChiefTwoPencils May 01 '16 at 01:56

4 Answers4

1

Similar to finding if two strings are permutations of each other, you can sort the characters of the given string and make it the key to a list of anagrams. This way no matter the string you will find only strings of the same length comprised of the same characters.

Something like:

Map<String, List<String>> map ...
map.get(getKey(string)).get(i); // i = the ith request for an anagram
ChiefTwoPencils
  • 13,548
  • 8
  • 49
  • 75
1
Arrays.equals('Testing'.chars().sorted().toArray(), 'ingsetT'.chars().sorted().toArray())
Vampire
  • 35,631
  • 4
  • 76
  • 102
  • Am I mistakenly misunderstanding the difference between a permutation and an anagram; don't they need to be valid words? – ChiefTwoPencils Apr 30 '16 at 23:28
  • Granted, `ingsetT` is not a valid anagram, but each anagram is just a permutation that is a new word and OP works with a wordlist. – Vampire Apr 30 '16 at 23:30
  • So in place of `Testing` and `ingset` I could replace it with variables `word` and `entirearray` from above, respectively? – J. Doe Apr 30 '16 at 23:51
  • I'm not sure, becuase you still didn't fix your code above so it is unclear what is what. But if `word` is the word input by the user and `entirearray` is one of the words of the wordlist, then yes. But form the naming I guess `entirearray` is the whole wordlist, so you would loop over `entirearray` and check each entry with that line, while of course you should not do the `...toArray()` stuff for the input word for each wordlist word, but do it once and reuse. – Vampire Apr 30 '16 at 23:58
  • And if you often check through the whole wordlist and have enough memory it could be worth calculating the sorted arrays of the wordlist words once after reading the wordlist with a mapping to the original word and then only do the searching repeatedly. You could then even sort the sorted list and use a binary search algorithm or you could pimp your wordlist file with the sorted character to word mapping so that you don't have to do it on each program launch. There are plenty ways to optimize this. – Vampire Apr 30 '16 at 23:58
  • I removed the unintelligible parts of my code and added that I needed to add missing material :). So say I wanted to take one word from `entireArray`, how would I reference that in the methods? – J. Doe May 01 '16 at 00:07
  • I'm not sure what you are asking. If you want to take the 5th word of `entireArray`, that would be `entireArray[4]`. But if that was your question, please go and read a book about Java basics. ;-) – Vampire May 01 '16 at 00:18
  • Sorry, poorly worded question. But you still managed to answer it, so thanks! I though if I was to put in a method afterwards via a `.` I would need parentheses, not brackets. – J. Doe May 01 '16 at 00:46
  • Yes, you need parens for a method call, but array element access is no method. As I said, read a book about Java basics. – Vampire May 01 '16 at 11:42
1

Try this.

package stackoverflow;

import java.io.*;
import java.util.*;

public class ProjectAnagram {

    static String sort(String s) {
        char[] c = s.toCharArray();
        Arrays.sort(c);
        return String.valueOf(c);
    }

    public static void main(String[] args) throws FileNotFoundException {
        Map<String, List<String>> words = new HashMap<>();
        try (Scanner in = new Scanner(new File(args[0]))) {
            while (in.hasNext()) {
                String word = in.next();
                String sorted = sort(word);
                List<String> list = words.get(sorted);
                if (list == null)
                    words.put(sorted, list = new ArrayList<>());
                list.add(word);
            }
        }
        Scanner in = new Scanner(System.in);
        while (true) {
            System.out.print("Enter word (or press ENTER to quit): ");
            if (!in.hasNextLine()) break;
            String s = in.nextLine();
            if (s.length() == 0) break;
            System.out.println(words.get(sort(s)));
        }
    }
}
0

EDIT: This answer is an efficient way to do it!

Hashmaps are O(1) look up time.

We would need 3 iterations. First to add character count in first string to hashmap, second to remove character count in second string from hashmap, third to iterate through hashmap and see if all the values were 0.

So, this would give us an O(n) algorithm.

Community
  • 1
  • 1
Nishant Roy
  • 1,043
  • 4
  • 16
  • 35
  • 2
    So `OO` is an anagram of `NP`? The hashing algorithm is very weak. It could at most serve as indicator, but you would still have to compare the characters. – Vampire Apr 30 '16 at 23:14
  • @BjörnKautler what if we change it to: `hash += Math.pow(str.charAt(i),2);` – Nishant Roy Apr 30 '16 at 23:17
  • Alternatively, [this answer](http://stackoverflow.com/a/15045892/4509355) seems to work and is efficient – Nishant Roy Apr 30 '16 at 23:25
  • Well, the hashing is better, but I'm relatively certain it also has many conflicts. You can this make even better using CRC or MD5 or SHA-1, but the safer you make, the more time it needs. You can also just use my one-liner, though I didn't test its performance :-) – Vampire Apr 30 '16 at 23:32
  • Your one-liner had sorting which will be O(n log n). The answer I linked to would store each char from the first string in a hashmap, then remove each char in second string from the hashmap, then it would iterate over the hashmap and see if any values were 0> This would be O(n). – Nishant Roy Apr 30 '16 at 23:35
  • As I said, i don't know about its performance, but it will work and it is in just one line. ;-) – Vampire Apr 30 '16 at 23:36
  • @NishantRoy, but all that does is determine if it's a permutation. They need is a way to get one of n anagrams. They also want to be able to get more upon request. – ChiefTwoPencils May 01 '16 at 01:29