2

I'm looking for an algorithm that can be used to compare two sentences and provide a matching score. For example:

INPUT

Sentence Input 1: "A quick brown fox jumps over the lazy dog"

Sentence Input 2: "Alpha bravo dog quick beta gamma fox over the lazy dug"

OUTPUT

Output: "A quick brown fox jumps over the lazy dog"

Score: 5 out of 9 (words found in correct order)

Can someone please point me in the right direction?

Mustafa
  • Something in difflib? – Daniel Hao Sep 10 '21 at 13:10
  • Consider the string edit distance (with atomic words). https://en.wikipedia.org/wiki/Levenshtein_distance – Sep 10 '21 at 13:19
  • Related: [Where can I find the histogram diff algorithm?](https://stackoverflow.com/questions/63628162/where-can-i-find-the-histogram-diff-algorithm-on-internet/) – Stef Sep 10 '21 at 13:38
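
The word-level edit distance suggested in the comments could be sketched roughly as follows. This is a minimal sketch, not code from any commenter: the class name `WordEditDistance` and its `distance` method are hypothetical, and it assumes that words are separated by whitespace and compared case-insensitively. A lower distance means the sentences are closer; it does not directly produce the "5 out of 9" score from the question.

```java
// Hypothetical helper: Levenshtein distance where the atomic unit is a word,
// not a character. Assumes whitespace-separated, case-insensitive words.
final class WordEditDistance {

    static int distance(String s1, String s2) {
        String[] a = words(s1);
        String[] b = words(s2);

        // prev[j] = distance between the first i-1 words of a and the first j words of b
        int[] prev = new int[b.length + 1];
        int[] curr = new int[b.length + 1];
        for (int j = 0; j <= b.length; j++) {
            prev[j] = j; // turning an empty sentence into j words takes j insertions
        }

        for (int i = 1; i <= a.length; i++) {
            curr[0] = i; // removing i words takes i deletions
            for (int j = 1; j <= b.length; j++) {
                int substitution = prev[j - 1] + (a[i - 1].equals(b[j - 1]) ? 0 : 1);
                int deletion = prev[j] + 1;
                int insertion = curr[j - 1] + 1;
                curr[j] = Math.min(substitution, Math.min(deletion, insertion));
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length];
    }

    // Splits on whitespace; an empty or blank sentence yields no words.
    private static String[] words(String s) {
        String t = s.trim().toLowerCase();
        return t.isEmpty() ? new String[0] : t.split("\\s+");
    }

    public static void main(String[] args) {
        // One word ("lazy") has to be deleted, so this prints 1
        System.out.println(distance("the lazy dog", "the dog"));
    }
}
```

Only two rows of the table are kept at a time, so memory grows with the length of the second sentence rather than with the product of both lengths.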

3 Answers

0

You could try this C++ approach. It counts the words of the second sentence that also occur in the first; it does not check word order.

#include <sstream>
#include <string>
#include <unordered_set>

using namespace std;

// Counts the words of s2 that also occur in s1 (word order is ignored).
int MatchingScore(const string& s1, const string& s2) {
    int count = 0;
    unordered_set<string> words1;
    istringstream ss1(s1);
    istringstream ss2(s2);
    string word;

    // Collect the whitespace-separated words of s1 in a hash set
    while (ss1 >> word) {
        words1.insert(word);
    }
    // While iterating through s2, count each word that is present in the set
    while (ss2 >> word) {
        if (words1.count(word) > 0) {
            count++;
        }
    }
    return count;
}
0

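You can try this Java code, which counts how many words of the search phrase occur in the sentence (word order is not checked):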

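    String sentence = "A quick brown fox jumps over the lazy dog";
    String searchPhrase = "Alpha bravo dog quick beta gamma fox over the lazy dug";

    // Whole words of the sentence, lower-cased for case-insensitive lookup
    java.util.Set<String> sentenceWords = new java.util.HashSet<>(
            java.util.Arrays.asList(sentence.toLowerCase().trim().split("\\s+")));
    String[] searchWords = searchPhrase.trim().split("\\s+");
    int score = 0;

    // Count search words that occur as whole words in the sentence,
    // so that a word like "a" does not match inside "lazy"
    for (String searchWord : searchWords) {
        if (sentenceWords.contains(searchWord.toLowerCase())) {
            score += 1;
        }
    }

    System.out.println("Score: " + score + " out of " + searchWords.length);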
0

A working solution, tested with variations. It still needs to be optimized with regex, etc.

Output:

Score: 5 out of 9
A *quick* brown *fox* jumps *over* *the* *lazy* dog

The Java code below was written as a JUnit test in Android Studio:

    String sentence = "A quick brown fox jumps over the lazy dog";
    String searchPhrase = "Alpha bravo dog quick beta gamma fox over the lazy dug";

    String[] sentenceWords = sentence.split(" ");
    int score = 0;

    String outputString = sentence;

    for (String word : sentenceWords) {
        // Pad with spaces so that only whole words match; compare case-insensitively
        String spacedPhrase = (" " + searchPhrase + " ").toLowerCase();
        String spacedWord = " " + word.toLowerCase() + " ";
        int index = spacedPhrase.indexOf(spacedWord);
        if (index != -1) {
            score += 1;
            // Keep only the text after the match, so later words must appear in order
            searchPhrase = searchPhrase.substring(index + 1);
            // Mark the matched word; quote it so regex metacharacters are taken literally
            outputString = (" " + outputString + " ").replaceFirst(
                    java.util.regex.Pattern.quote(" " + word + " "),
                    java.util.regex.Matcher.quoteReplacement(" *" + word + "* ")).trim();
        }
    }

    System.out.println("Score: " + score + " out of " + sentenceWords.length);
    System.out.println(outputString);
    assertEquals("A *quick* brown *fox* jumps *over* *the* *lazy* dog", outputString);
Qamar
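
One way to get the "words found in correct order" score from the question is the longest common subsequence (LCS) of the two word lists. The sketch below is an illustration under assumptions, not part of any answer above: `WordLcs` and `commonWordsInOrder` are hypothetical names, words are split on whitespace, and the comparison is case-insensitive. Unlike a greedy left-to-right scan, the LCS always finds the largest set of in-order matches.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: scores two sentences by the longest common
// subsequence (LCS) of their lower-cased, whitespace-separated words.
final class WordLcs {

    // Returns the words of `sentence` that also occur, in the same order, in `phrase`.
    static List<String> commonWordsInOrder(String sentence, String phrase) {
        String[] a = words(sentence);
        String[] b = words(phrase);

        // len[i][j] = LCS length of a[i..] and b[j..]
        int[][] len = new int[a.length + 1][b.length + 1];
        for (int i = a.length - 1; i >= 0; i--) {
            for (int j = b.length - 1; j >= 0; j--) {
                len[i][j] = a[i].equals(b[j])
                        ? len[i + 1][j + 1] + 1
                        : Math.max(len[i + 1][j], len[i][j + 1]);
            }
        }

        // Walk the table from the start to recover one LCS
        List<String> common = new ArrayList<>();
        int i = 0;
        int j = 0;
        while (i < a.length && j < b.length) {
            if (a[i].equals(b[j])) {
                common.add(a[i]);
                i++;
                j++;
            } else if (len[i + 1][j] >= len[i][j + 1]) {
                i++;
            } else {
                j++;
            }
        }
        return common;
    }

    // Splits on whitespace; an empty or blank sentence yields no words.
    private static String[] words(String s) {
        String t = s.trim().toLowerCase();
        return t.isEmpty() ? new String[0] : t.split("\\s+");
    }

    public static void main(String[] args) {
        String sentence = "A quick brown fox jumps over the lazy dog";
        String phrase = "Alpha bravo dog quick beta gamma fox over the lazy dug";
        List<String> common = commonWordsInOrder(sentence, phrase);
        // prints "Score: 5 out of 9"
        System.out.println("Score: " + common.size() + " out of " + words(sentence).length);
        // prints "[quick, fox, over, the, lazy]"
        System.out.println(common);
    }
}
```

The table has one cell per pair of words, so time and memory grow with the product of the two sentence lengths, which is fine for sentence-sized inputs.
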