0

I need to read the user input and compare this to a dictionary.txt. The user may input any number of characters and the program must return all the words in the English language that can be made from these characters. The letters can be used in any order and may only be used once. For example:

User Input: "odg"

Output: "dog" , "god" ... and any others

After quite a substantial amount of research, I have come up with the following partial solution:

  1. Read user input
  2. Convert to an array of characters
  3. Loop through the document depending on array length
  4. Using indexOf to compare each character in this array to each line, then printing the word/s which do not return -1

How do I compare a set of characters inputted by the user to those found in a text file (dictionary) ? The characters do not have to be in any order to match .(as seen in the example used above)

Bear with me here, I know this must be one of the most inefficient ways to do such a task! Any further ideas on how to implement my original idea would be appreciated, while I am also open to any new and more efficient methods to perform this operation.

Below is what I have come up with thus far:

  public static void main(String[] args) throws FileNotFoundException {
    BufferedReader reader1 = new BufferedReader(new FileReader(FILENAME));
    Scanner sc = new Scanner(System.in);
    String line;
    ArrayList<String> match = new ArrayList<>();

    System.out.println("Enter characters to see which english words match: ");
    String userInput = sc.next();

    char arr[]  = userInput.toCharArray();
    int i;

        try {

            while ((line = reader1.readLine()) != null) {

                for (i=0; i < arr.length; i++)
                {
                   if ((line.indexOf(userInput.charAt(i)) != -1) && (line.length() == arr.length)) {
                       match.add(line);
                    }
                    else {
                //        System.out.println("no matches");
                    }
                }

            }
            System.out.println(match);
        }

    catch (IOException e) {

        e.printStackTrace();

    }

**Current results: **

Words in text file:

cab
dog
god
back
dogs
quick

User input: "odg"

Program output:

[god, god, god, dog, dog, dog]

The program should return all words in the dictionary that can be made out of the string entered by the user I am managing to return both instances in this case, however, each are displayed for three times (arr.length).

  • What is your question? What is the *specific* problem with your code? *Questions seeking debugging help ("why isn't this code working?") must include the desired behavior, a specific problem or error and the shortest code necessary to reproduce it in the question itself. Questions without a clear problem statement are not useful to other readers.* – tnw Apr 05 '17 at 19:04
  • The exact question states: A user inputs a number of characters. • The program then lists all the words in the English language that can be made from those characters. • Each letter can only be used once. • The letters can be used in any order. This is not homework of any sort, I am just trying to figure out this problem at hand. I am not asking why my code isn't working but simply suggestions on how I can compare each character entered by the user to those found in a dictionary text file. –  Apr 05 '17 at 19:07
  • We didn't ask for *code requirements* but *description of problem* which *you* are facing while writing it. Do you get any error, exception, incorrect results? – Pshemo Apr 05 '17 at 19:10
  • Sorry for my lack of information given. I updated the question slightly. No errors only incorrect results. I will update the question further. Thanks –  Apr 05 '17 at 19:12
  • @Rizzo Your question is too broad. That's the entire assignment. What is the specific problem and question **regarding the code you've written**. – tnw Apr 05 '17 at 19:14
  • Does the edit help ? Can I provide any more useful information? I am unable to match **all** characters to a single string in the dictionary. The output is returning even single string where a character matches. –  Apr 05 '17 at 19:16
  • 1
    Is your question "how to check if word can be created from characters in other word"? If yes then take a look at [Check if letters that a word consists of are present in another string](http://stackoverflow.com/questions/22431351/check-if-letters-that-a-word-consists-of-are-present-in-another-string.) – Pshemo Apr 05 '17 at 19:25
  • 2
    Anyway you should try separating specific tasks in your code. For instance you can create method `boolean test(characters, word)` where you will do your validation. If result will be true then you can print word. Currently you are printing it for each letter validation, not for entire process. – Pshemo Apr 05 '17 at 19:27
  • Question: Will you allow the user to perform multiple inputs against the same dictionary, or will there only ever be 1 input per run of program? – domsson Apr 05 '17 at 19:28
  • Answer: There will only be 1 input per run of program. The program then returns all words in the dictionary that can be made out of the string entered by the user. –  Apr 05 '17 at 19:29
  • 1
    I agree with @Pshemo - you should use *divide & conquer*. That is, create a method that checks if two strings' characters match (according to your conditions). This also has the advantage that you could basically reduce the code shown to only that method (although the code above is short enough as is IMHO). One more question: in point 3, you said you wanted to compare strings based on array length. I don't see that in your code? – domsson Apr 05 '17 at 19:36
  • You are absolutely right that is not present in my code. That actually provided me with an idea and I proceeded to write: `if ((line.indexOf(userInput.charAt(i)) != -1) && (line.length() == arr.length))` so I am testing that the word lengths actually match. I have narrowed down my error margin. I will update question. –  Apr 05 '17 at 19:42
  • 1
    Possible duplicate of [How to check if two words are anagrams](http://stackoverflow.com/questions/15045640/how-to-check-if-two-words-are-anagrams) – domsson Apr 05 '17 at 20:21

3 Answers3

3

First of all, interesting question. I implemented my solution and Ole V.V's solution. Here are the codes based on your post. I test the only test case you provided, not sure whether this is what you want. Let me know if it is not working as you expected.

Solution One: counting O(nk)

public static void main(String[] args) throws IOException {
    BufferedReader reader1 = new BufferedReader(new FileReader(FILENAME));
    Scanner sc = new Scanner(System.in);

    System.out.println("Enter characters to see which english words match: ");
    String userInput = sc.next();

    Map<Character, Integer> counter = count(userInput);
    String line;
    while ((line = reader1.readLine()) != null) {
        Map<Character, Integer> lineCounter = count(line);
        if(lineCounter.equals(counter)) {
            System.out.println(line);
        }
    }
}

public static Map<Character, Integer> count(String input) {
    Map<Character, Integer> result = new HashMap<Character, Integer>();
    for (char c: input.toCharArray()) {
        result.putIfAbsent(c, 0);
        result.put(c, result.get(c) + 1);
    }

    return result;
}

Solution Two: sorting O(nk)

public static void main(String[] args) throws IOException {
    BufferedReader reader = new BufferedReader(new FileReader(FILENAME));
    Scanner sc = new Scanner(System.in);

    System.out.println("Enter characters to see which english words match: ");
    String userInput = sc.next();
    userInput = sort(userInput);

    String line;
    while ((line = reader.readLine()) != null) {
        String sortedLine = sort(line);
        if(sortedLine.equals(userInput)) {
            System.out.println(new String(line));
        }
    }
}

// counting sort
public static String sort(String input) {
    char c[] = input.toCharArray();
    int length = c.length;
    char output[] = new char[length];

    int count[] = new int[256];
    for (int i = 0; i < length; i++) {
        count[c[i]] = count[c[i]] + 1;
    }

    for (int i = 1; i <= 255; i++) {
        count[i] += count[i - 1];
    }

    for (int i = 0; i < length; i++) {
        output[count[c[i]] - 1] = c[i];
        count[c[i]] = count[c[i]] - 1;
    }

    return new String(output);
}
Junbang Huang
  • 1,927
  • 19
  • 26
  • the time complexity of this one is O(nk) where n is the size of the dict file and k is the maximum length of a word. – Junbang Huang Apr 05 '17 at 19:45
  • First of all, thanks for your contribution. I have updated my question and I am now comparing the character length of the string entered by the user directly to the strings present in the dictionary. I will test your solution further. Thanks! –  Apr 05 '17 at 19:46
1

The standard solution to this kind of problem is: sort the characters of the user input. So odg will become dgo and back will become abck. For each word in the dictionary, do the same sorting. So cab will become abc and dog will be dgo — hey, that’s the same as the first user input, so now we know that this word should be output.

The strong point with this solution is you make sure every letter is used exactly once. It even takes duplicate letters into account: if the same letter comes twice in the user input, it will only find words that also contain that letter exactly twice.

If you like, you can prepare your word list in advance by building a map where the keys are the alphabetically sorted words and the values are lists of words that contain those same letters. So key dgo will map to a list of [dog, god]. Then you just have to sort the input and make a lookup.

Ole V.V.
  • 81,772
  • 15
  • 137
  • 161
  • Hi, thanks for your input on this matter! Would you suggestion still be valid for a dictionary containing say, 100k words ? Thanks –  Apr 05 '17 at 19:47
  • Yes it would. If it were me I would pretty soon want to measure the time it takes to check the user input against each of 100k words to see if the preparation into a map is necessary. Such a preparation should cut response time rather drastically. – Ole V.V. Apr 05 '17 at 19:50
0

I'll show you a solution that is easy to understand and implement but not the fastest available:

Possible solution: Array sorting

Treat input string and dictionary word as array of chars, sort them, then compare them:

public static boolean stringsMatchSort(String a, String b) {
    // Different length? Definitely no match!
    if (a.length() != b.length()) {
        return false;
    }

    // Turn both Strings to char arrays
    char[] charsA = a.toCharArray();
    char[] charsB = b.toCharArray();

    // Sort both arrays
    Arrays.sort(charsA);
    Arrays.sort(charsB);

    // Compare them, if equal: match!
    return Arrays.equals(charsA, charsB);
}

Note how I made the meat of your program / problem into a method. You can then easily use that method in a loop that iterates over all words of your dictionary. The method doesn't care where the words come from: a file, a collection, additional user input, the network, etc.

It also helps to simplify your program by dividing it into smaller parts, each with a smaller responsibility. This is commonly known as divide & conquer and is one of the most valuable strategies for both, new and old programmers alike, when it comes to tackling complicated problems.

Other solutions: Prime numbers, HashMaps, ...

There are other (including faster and more elegant) solutions available. Take a look at these related questions, which yours is pretty much a duplicate of:

Additional notes

Depending on your application, it might be a good idea to first read the dictionary into a suitable collection. This would be especially helpful if you perform multiple "queries" against the same dictionary. Or, if the dictionary is really huge, you could already strip out duplicates during the creation of the collection.

Community
  • 1
  • 1
domsson
  • 4,553
  • 2
  • 22
  • 40
  • Thanks for your help! I think that the first solution is neater and easier to understand. I will try to implement this. Thanks again. –  Apr 05 '17 at 19:59
  • 1
    The iteration will accept `pops` and `oops` as matching; it may not be what is desired. – Ole V.V. Apr 05 '17 at 20:09
  • @OleV.V. you are completely right, very good observation! Thanks for pointing it out. I shall edit shortly. – domsson Apr 05 '17 at 20:14
  • I ended up deleting my second solution, it would've become too convoluted compared to the first. Instead, I marked the question as duplicate and referred to it in my answer - there, a very elegant solution is available. – domsson Apr 05 '17 at 20:27