0

Given two List sentence and List queries

I have two find the occurrence of queries in sentences.

Example,

  1. "Pulkit likes StackOverflow and coding"

  2. "Pulkit does not like Reddit"

  3. "Pulkit like ice cream"

Queries

  1. Pulkit coding

  2. like

  3. does

The function should return for the queries

  1. sentence[0]

  2. sentence[1], sentence[2]

  3. sentence[1]

I solved this question already using HashMap but it is quadratic and I am wondering how to do it in linear time.

Solution

     public static void findMatch(List<String> sentences, List<String> queries) {
        // Write your code here
        // Split the sentences into terms and map them by index
        Map<Integer, Set<String>> sentencesSplit = new HashMap<>();
        for (int j = 0; j < sentences.size(); j++) {
            String[] splitSentence = sentences.get(j).split(" ");
            Set<String> sentenceSet = new HashSet<>();
            sentencesSplit.put(j, sentenceSet);
            for (int i = 0; i < splitSentence.length; i++) {
                sentenceSet.add(splitSentence[i]);
            }
        }

        // Split the query into terms and map them by index
        Map<Integer, String[]> queriesSplit = new HashMap<>();
        for (int i = 0; i < queries.size(); i++) {
            queriesSplit.put(i, queries.get(i).split(" "));
        }

        for (int i = 0; i < queries.size(); i++) {
            String found = null;
            for (int j = 0; j < sentences.size(); j++) {
                String[] splitQuery = queriesSplit.get(i);
                Set<String> sentenceStringList = sentencesSplit.get(j);
                boolean notFound = false;
                for (int k = 0; k < splitQuery.length; k++) {
                    if (!sentenceStringList.contains(splitQuery[k])) {
                        notFound = true;
                        break;
                    }
                }
                if (!notFound) {
                    if (found == null) {
                        found = "" + j;
                    } else {
                        found += " " + j;
                    }
                }
            }
            if (found == null) {
                found = "-1";
            }
            System.out.println(found);
        }
    }
JunJie Wang
  • 460
  • 3
  • 10
Pulkit
  • 65
  • 1
  • 3
  • 9
  • You're basically asking how to implement full text search, or something along those lines. You would need to index the source text, e.g. by putting into a B-tree. Then, querying with a given keyword would only be a `lg(N)` operation, so the entire search would only take `M*lg(N)`, where `M` is the number of keywords to search. – Tim Biegeleisen Oct 25 '18 at 02:38
  • @TimBiegeleisen can you provide examples or references where I can understand how to do this, please? – Pulkit Oct 25 '18 at 02:41
  • 2
    You're asking a lot here; probably too much for a single Stack Overflow question. Just do some research on indexing text by keyword. That should turn up a lot, and you can add `Java` to the search for something specific to Java. – Tim Biegeleisen Oct 25 '18 at 02:42
  • https://stackoverflow.com/questions/955110/similarity-string-comparison-in-java – JunJie Wang Oct 25 '18 at 02:45
  • @ScaryWombat you are more than welcome to not reply. Please only reply if you can motivate or provide some help! no need of hate or demotivation here. I am reporting you – Pulkit Oct 25 '18 at 02:46
  • 1
    @Pulkit ha ha ha - the comedy site is somewhere else. BTW if you check your previous questions you will actually see that I have positively inputted to them – Scary Wombat Oct 25 '18 at 02:49
  • wait my answer, i think it can be resolved by using Pattern. – JunJie Wang Oct 25 '18 at 02:52

1 Answers1

1

My code is similar with human's thinking.

\b allows you to perform a "whole words only" search using a regular expression in the form of \bword\b.

Hope my code will help you.

public class MainClass {

    public static void main(String [] args)
    {
        List<String> sentences = new ArrayList<String>();
        sentences.add("Pulkit likes StackOverflow and coding");
        sentences.add("Pulkit does not like Reddit");
        sentences.add("Pulkit like ice cream");

        List<String> queries = new ArrayList<String>();
        queries.add("Pulkit coding");
        queries.add("like");
        queries.add("does");

        findMatch(sentences, queries);
    }

    public static void findMatch(List<String> sentences, List<String> queries) {
        for(String query : queries) {
            System.out.print("]");

            String match = ".*\\b" + query.replace(" ", "\\b.*") + "\\b.*";             
            for (int iSentence = 0; iSentence < sentences.size(); iSentence++) {
                if(sentences.get(iSentence).matches(match)) {
                    System.out.print(" " + iSentence);
                }
            }

            System.out.println("");
        }
    }
}

Console output:

] 0
] 1 2
] 1
JunJie Wang
  • 460
  • 3
  • 10