1

I'm having some problems in my application.

Basically, I need to read some string from the user (they'll type an amount of letters mixed), and my job is to find a word that I can make using some letters they gave me.

Here's an example:

User types "emmfosor". Thinking fast, I can make 2 words out of it: 'some' and 'from', but the letters of these words are mixed.

However, to make my job easier, I have a database of words. It's on a txt file like this:

apple
strawberry
boring
buildings
book
superior
bathroom

Every solution that I make is wrong, or the words just don't appear. Can somebody help me?

Mark Rotteveel
  • 100,966
  • 191
  • 140
  • 197
  • Maybe you could take a look to the Hamming distance to try to recover a wrong typed word by comparing it to your dictionnary, but i don't know if the accuracy will be enough for you – thibsc Apr 02 '21 at 01:02
  • Is the number of words in the input fixed? – ETO Apr 02 '21 at 01:05
  • 1
    Why not go word by word in your txt file and go through each word and see if each letter is in the given Strinng? If at least one letter is not in the given String, then that word cannot be made from the given String. – LuckyBandit74 Apr 02 '21 at 01:07
  • Related: [Anagram algorithm in java](https://stackoverflow.com/questions/13692221/anagram-algorithm-in-java) - this may give you some ideas which you can adapt to your situation. – andrewJames Apr 02 '21 at 01:12
  • Convert both input and dictionary to sorted char arrays `dog` -> `dgo` `black` -> `abckl` (then merge to one list `abcdgklo`) User input `dogblack` -> `abcdgklo` == match. Maybe you can streanline this by only choosing word combinations that match in length e.g. input length is 8 so choose word 1&7, 2&6 3&5 etc. – Scary Wombat Apr 02 '21 at 01:16
  • @ScaryWombat Using a `HashMap` would make things much easier. – ETO Apr 02 '21 at 01:21
  • 1
    @ETO Thanks, I have purposely not suggested any code as the OP has shown no code. – Scary Wombat Apr 02 '21 at 01:24
  • Take the user input and compare it to each word on the file. If all the letters of the are on the user input, you can say it is a match. – m0skit0 Apr 02 '21 at 01:24
  • Sort all words, and store them in a `Map`, where the value is the original word. Sort the input string, then generate combinations. Because the input has been sorted and combination "words" should be sorted, you can simply iterate `bitmask = 1 ... 2^N` and use `bitmask` to represent the "word": each bit 1/0 indicates an input letter is/isn't in the sorted "word". Throw out all "words" that are too short/long ([count the 1 bits](https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/lang/Integer.html#bitCount(int))), and for the rest, `map.get(word)` is either your answer or `null`. – AndrewF Apr 02 '21 at 01:45

3 Answers3

1

This is not the efficient way to do this. For huge word database it's not suitable. But we could try something like this.
For each character in input, remove the character from all words in the words list. At the end the word list will have 0 length strings which are match.
Following is my basic implementation of this. You can use StringBuilder, Maps, removing duplicate chars from input etc. to improve performance.

public class Test {

   public static void main(String[] args) throws Exception {
       String[] data = {"some", "zoo", "from"};
       String input = "emmfosor";
       System.out.println("input: " + input);
       System.out.println("possible words: "+getMatched(input, data));
    }
   
   public static List<String> getMatched(String input, String[] data){
       List<String> result = new ArrayList<>();
       //make a copy to work on
       String[] copy = Arrays.copyOf(data, data.length);
       char[] chars = input.toCharArray();
       for(char c : chars) {
           for (int i = 0; i < copy.length; i++) {
            String word = copy[i];
            if(!word.isBlank())
                copy[i] = word.replaceAll(Character.toString(c), "");
        }
       }
       for (int i = 0; i < copy.length; i++)
         if(copy[i].isBlank())     //find empty strings
             result.add(data[i]);
           return result;
   }
}

Output:

input: emmfosor
possible words: [some, from]

Modify the code as per your needs.

the Hutt
  • 16,980
  • 2
  • 14
  • 44
0

If you are sure that the input contains a single word, then there is an easy linear trick.

  1. Define a method for sorting letters inside a string. The String is immutable, so the method will return a new object, obviously:
public static String sortString(String inputString){
    char tempArray[] = inputString.toCharArray();
    Arrays.sort(tempArray);
    return new String(tempArray);
}
  1. Sort the letters of all the strings from your file and put them into a map:
Map<String, String> map = Files.lines(Paths.get("filename"))
                               .collect(Collectors.toMap(s -> sortString(s), s -> s));
  1. Now, once you have everything prepared, you can accept the input:
String inputString = // read it somehow

String outputString = map.get(sortString(inputString));
ETO
  • 6,970
  • 1
  • 20
  • 37
  • 1
    *User types "emmfosor". Thinking fast, I can make 2 words out of it:* The OP seems aware that the input will not contain one word. – Scary Wombat Apr 02 '21 at 01:21
0

Well, you first have to find a way to get the contents of that .txt file, and put it in some kind of array. Here is my best method to make that:

class ParseText extends File {
  private Scanner readSelf;
  private String content;

  public ParseText(String filename) {
    super(filename);
    readSelf = new Scanner(this);
    while(readSelf.hasNextLine()) {
      content += readSelf.nextLine();
    }
    readSelf.close();
  }

  public String[] getEntries() {
    ArrayList<String> entries;
    while(true) {
      int firstBreak = content.indexOf('\n');
      boolean notLast = (firstBreak != -1);
      // ternary operator
      entries.add(content.substring(0, notLast ? (firstBreak - 1) : content.length() - 1));

      //more ternary operators
      content = notLast ? (content.substring(firstBreak + 1)) : "";
      if(content.length() == 0) {break};
    }
    return entries.toArray(new String[0]); 
  }

  public static HashMap<char, int> getCharacterHashMap(String string) {
    HashMap<char, int> charHash = new HashMap<char, int>();

    for(char c : string.toCharArray()) {
      if(charHash.containsKey(c)) {
        int old = charHash.get(c);
        charHash.replace(c, old, old + 1);
      }
      else {
        charHash.put(c, 1);
      }
    }

    return charHash;
  }
}

Next, an algorithm. We should start to use hashmaps here, since we want to make an un-scrambler for Strings. Each HashMap object should have characters as keys, and integers as values.

class Unscramble {
  private static String[] entries = new ParseText("entries.txt").getEntries();
  
  public static match(String someGibberish) {
    Object gibberishHash = ParseText.getCharacterHashMap(someGibberish);
    ArrayList<int> passes = new ArrayList<int>();
    for(int i = 0; i < entries.length; i++) {
      
      HashMap<char, int> currentHash = ParseText.getCharacterHashMap(entries[i]);
      char[] keyArray = currentHash.keySet().toArray();
      boolean[] fits = new boolean[keyArray.length];
      for(int j = 0; j < keyArray.length; i++) {
        if(gibberishHash.get(keyArray[j]) != null && currentHash.get(keyArray[j]) <= gibberishHash.get) {
          fits[j] = true
        }
        else {
          //breaking to speed up running
          break;
        }
      }

      boolean passesEntry = true;
      for(int k = 0; k < fits.length; i++) {
        if(!(fits[k])) {
          passesEntry = false;
          break;
        }
      }

      if(passesEntry) {
        passes.add(i);
      }
    }
  }
}

I haven't tested this for any errors, so just comment to inform me of any errors. You might also be able to speed this up more with threads, but this should be enough.

TheSmolCodar
  • 47
  • 1
  • 7
  • 1
    You can get `List` of words, from a file easily like this: `List database = Files.readAllLines(Path.of("words.txt"));` – the Hutt Apr 02 '21 at 02:51
  • thank you @onkarruikar, then I can just remove the ParseText class, and just add a property to the Unscramble class. – TheSmolCodar Apr 02 '21 at 02:54