
Prompt: Given a sentence and a set of known abbreviations, figure out an efficient way to shorten the sentence.

 Abbreviations:
 be right back -> BRB
 be right there -> BRT
 be back later -> BBL
 be back soon -> B back soon
 faster than light -> FTL
 be -> B
 later -> L8R


 Conversions:
 I will be right there -> I will BRT
 I will be there -> I will B there
 I will go right there -> I will go right there
 I will be there later -> I will B there L8R
 I am faster than you -> I am faster than you
 Never faster than light -> Never FTL
 Faster than light today -> FTL today

Here's my code for this problem. However, I only get one abbreviation in my final answer.

import java.util.*;
class Solution{
    public static void main(String[] args) {
        Map<String, String> dict = new HashMap<>();
        dict.put("be right back", "BRB");
        dict.put("be right there", "BRT");
        dict.put("be right later", "BRL");
        dict.put("be", "B");
        dict.put("later", "L8R");

        String s = "I will be right there later";
        System.out.println(convert(s, dict));
    }

    public static String convert(String s, Map<String, String> dict) {
        String[] words = s.split(" ");
        List<String> converted = new ArrayList<>();

        List<String> toCheck = new ArrayList<>();
        for (int i = 0; i < words.length; i++){
            for (int j = i; j < words.length; j++){
                String[] substring = Arrays.copyOfRange(words, i, j+1);
                String combined = "";
                for (String str : substring){
                    combined += str + " ";
                }
                combined = combined.strip();
                toCheck.add(combined);
            }
        }

        String ans = "";
        String target = "";
        for (String str : toCheck){
            if (dict.containsKey(str)){
                int index = s.indexOf(str);
                ans = s.substring(0, index) + dict.get(str) + s.substring(index + str.length());
            }
        }

        return ans;

    }

}

I think there is a recursive way to perform the conversion, but I am not quite sure how. Can anyone help me with that or direct me to a problem similar to this one? Thanks in advance!

bonus
  • Possible duplicate of [Java Replacing multiple different substring in a string at once (or in the most efficient way)](https://stackoverflow.com/q/1326682/5221149) – Andreas Feb 05 '20 at 04:06
  • How about calling your method until the string does not change anymore? – MrSmith42 Feb 05 '20 at 07:10

2 Answers

ans = s.substring(0, index) + dict.get(str) + s.substring(index + str.length());

This line builds ans from the original s on every iteration, not from the previous value of ans. Each match therefore discards the earlier replacements, and only the last matched string in the map ends up abbreviated in ans.
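To see the effect in isolation, here is a minimal sketch. The class OverwriteDemo and its helper methods are hypothetical names made up for illustration; only the substring arithmetic is taken from the question.

```java
class OverwriteDemo {
    // Replaces the first occurrence of phrase in text, using the same
    // substring arithmetic as the line quoted from the question.
    static String replaceOnce(String text, String phrase, String abbr) {
        int index = text.indexOf(phrase);
        if (index < 0) {
            return text;
        }
        return text.substring(0, index) + abbr + text.substring(index + phrase.length());
    }

    // Mirrors the question: every replacement restarts from the original s,
    // so only the last pair has any visible effect.
    static String restartFromOriginal(String s, String[][] pairs) {
        String ans = "";
        for (String[] pair : pairs) {
            ans = replaceOnce(s, pair[0], pair[1]);
        }
        return ans;
    }

    // Carries the intermediate result forward so earlier replacements survive.
    static String carryForward(String s, String[][] pairs) {
        String current = s;
        for (String[] pair : pairs) {
            current = replaceOnce(current, pair[0], pair[1]);
        }
        return current;
    }

    public static void main(String[] args) {
        String s = "I will be there later";
        String[][] pairs = { {"be", "B"}, {"later", "L8R"} };
        System.out.println(restartFromOriginal(s, pairs)); // I will be there L8R
        System.out.println(carryForward(s, pairs));        // I will B there L8R
    }
}
```

Carrying the result forward fixes the lost replacements, but it does not yet decide which phrase wins when several of them overlap.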

Your code also does not handle overlapping entries, for example "faster now" and "I will be faster now". In such cases, you probably want to match the longer phrase, "I will be faster now", for a correct abbreviation.

Below is how I solved it. You could use regular expressions, but they seem to be slower on reasonably long strings, because the regex has to be compiled before matching.

Snippet:

import java.util.*;
class Solution{
    public static void main(String[] args) {
        Map<String, String> dict = new HashMap<>();
        dict.put("be right back", "BRB");
        dict.put("be right there", "BRT");
        dict.put("be right later", "BRL");
        dict.put("be", "B");
        dict.put("be back soon","B back soon");
        dict.put("faster than light","FTL");
        dict.put("later", "L8R");

        String[] tests = {
            "I will be right there later",
            "I will be right there",
            "I will be there",
            "I will go right there",
            "I will be there later",
            "I am faster than you",
            "Never faster than light",
            "Faster than light today"
        };

        for(String test_case : tests){
            System.out.println(test_case + " => " + convert(test_case, dict));   
        }
    }

    public static String convert(String s, Map<String, String> dict) {

        List<String> dict_words = new ArrayList<>(dict.keySet());
        Map<Integer,String[]> replacement_index = new HashMap<>();

        Collections.sort(dict_words,new Comparator<String>(){
            public int compare(String s1,String s2){
                if(s1.length() == s2.length()) return 0; // order doesn't seem to matter for same length strings
                return s2.length() - s1.length(); // return bigger length string first
            }
        });

        String temp = s.toLowerCase(); // to perform case insensitive match
        for(String dict_str : dict_words){
            String dict_str_lower = dict_str.toLowerCase(); // to perform case insensitive match
            int index = 0;
            do{
                index = temp.indexOf(dict_str_lower,index);
                if(index != -1){
                    replacement_index.putIfAbsent(index,new String[]{dict.get(dict_str),dict_str});
                    index++;// to get the next match index of the same word in the string.
                }
            }while(index != -1 && index < temp.length());
        }

        StringBuilder res = new StringBuilder("");

        for(int i = 0;i < s.length(); ++i){
            if(replacement_index.containsKey(i)){
                res.append(replacement_index.get(i)[0]);
                i += replacement_index.get(i)[1].length() - 1;
            }else{
                res.append(s.charAt(i));
            }
        }

        return res.toString();
    }

}

Demo: https://ideone.com/pIj5dI

Algorithm:

  • In the above code, we first copy all map keys into a list and sort them in descending order of length.

  • We do this so that longer phrases are matched first and the shorter phrases contained in them are handled afterwards, which avoids the overlap issue explained above.

  • Second, we find every index at which a key matches and store it in another map (replacement_index) that is used to build the final result.

  • Third, we loop over the original string. Whenever the current index is in replacement_index, we append the replacement value and advance the pointer past the matched phrase.

Note: There is a catch. I am assuming that overlapping phrases means the smaller one is fully contained inside the larger one. For phrases like "be right back" and "right back into" in a sentence such as "I will be right back into this", the expected abbreviation is undefined in your post. I presume those situations don't arise in your use case.
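As a hedged aside, the longest-match-first idea can also be sketched on whole words instead of character indexes. This is a different technique from the snippet above, not a rewrite of it: it starts from the sentence and looks phrases up in the map, so no sort is needed, and a key such as "be" is not matched inside a longer word like "before". The class name WordAbbreviator is hypothetical, and the sketch assumes words are separated by single spaces.

```java
import java.util.*;

class WordAbbreviator {
    // Greedy sketch: at each word, try the longest candidate phrase first.
    static String convert(String s, Map<String, String> dict) {
        // One pass over the dictionary: lower-case the keys and find the
        // longest phrase (in words) to bound the lookahead. This could be
        // done once up front if many sentences are converted.
        Map<String, String> lower = new HashMap<>();
        int maxWords = 1;
        for (Map.Entry<String, String> e : dict.entrySet()) {
            String key = e.getKey().toLowerCase();
            lower.put(key, e.getValue());
            maxWords = Math.max(maxWords, key.split(" ").length);
        }

        String[] words = s.split(" ");
        List<String> out = new ArrayList<>();
        int i = 0;
        while (i < words.length) {
            boolean matched = false;
            int longest = Math.min(maxWords, words.length - i);
            for (int len = longest; len >= 1; len--) {
                String phrase = String.join(" ", Arrays.copyOfRange(words, i, i + len)).toLowerCase();
                String abbr = lower.get(phrase);
                if (abbr != null) {
                    out.add(abbr);
                    i += len; // skip every word covered by the phrase
                    matched = true;
                    break;
                }
            }
            if (!matched) {
                out.add(words[i]); // no phrase starts here, keep the word
                i++;
            }
        }
        return String.join(" ", out);
    }

    public static void main(String[] args) {
        Map<String, String> dict = new HashMap<>();
        dict.put("be right there", "BRT");
        dict.put("faster than light", "FTL");
        dict.put("be", "B");
        dict.put("later", "L8R");
        System.out.println(convert("I will be right there later", dict)); // I will BRT L8R
        System.out.println(convert("Faster than light today", dict));     // FTL today
    }
}
```

Like the snippet above, this leaves partially overlapping phrases unresolved: the phrase that starts earlier in the sentence simply wins.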

nice_dev
  • Thanks for the response! One question is that this solution needs to go through the entire dict_words and sort it. It would cost a lot of time if the dict gets really big. Is it possible to start with the given string and then go to the dict to find the matched abbreviations? – bonus Feb 05 '20 at 19:21
  • @bonus how large can the dict words and string itself get? – nice_dev Feb 05 '20 at 19:30
  • @bonus For large strings, you can refer to this answer: https://stackoverflow.com/a/40836618/4964822 Also, you didn't mention in your post that the strings could be that long. – nice_dev Feb 05 '20 at 20:00

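Your problem is here: you are only keeping the last replacement. See my embedded comment below.


        for (String str : toCheck) {
            if (dict.containsKey(str)) {
                // Replace in s itself so that earlier replacements are kept,
                // instead of rebuilding ans from the original string each time.
                s = s.replace(str, dict.get(str));
                System.out.println(s);
            }
        }

        return s;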
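One caveat with sequential replace calls is that the result depends on the order in which the phrases are applied. The sketch below illustrates this under stated assumptions: the class ReplaceOrderDemo and the method applyInOrder are hypothetical, and a LinkedHashMap is used only to make the order explicit.

```java
import java.util.*;

class ReplaceOrderDemo {
    // Applies every replacement in the map's iteration order.
    static String applyInOrder(String s, Map<String, String> ordered) {
        for (Map.Entry<String, String> e : ordered.entrySet()) {
            s = s.replace(e.getKey(), e.getValue());
        }
        return s;
    }

    public static void main(String[] args) {
        String s = "I will be right there";

        // Short phrase first: "be" is consumed before the longer phrase is tried.
        Map<String, String> shortFirst = new LinkedHashMap<>();
        shortFirst.put("be", "B");
        shortFirst.put("be right there", "BRT");
        System.out.println(applyInOrder(s, shortFirst)); // I will B right there

        // Long phrase first: the longer phrase gets its chance to match.
        Map<String, String> longFirst = new LinkedHashMap<>();
        longFirst.put("be right there", "BRT");
        longFirst.put("be", "B");
        System.out.println(applyInOrder(s, longFirst)); // I will BRT
    }
}
```

Note that String.replace works on raw characters, so a key like "be" would also be replaced inside a longer word such as "before".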
WJS
  • I am actually aware that I'm replacing ans with the new answer. I just don't know how to replace all the matching substrings with abbreviations. – bonus Feb 05 '20 at 04:38
  • You can't do it because it will only replace them in the encounter order (order in the map). For example, once you use up **be** none of the phrases that start with be can be matched. Check my updated answer for a partial solution. – WJS Feb 05 '20 at 05:30