Replacing abbreviations/slangs with their fullforms

Question

I am using a HashMap to store the full forms for abbreviations.

public class Test {
    public static void main(String[] args) {
        Map<String, String> slangs = new HashMap<String, String>();
        slangs.put("lol", "laugh out loud");
        slangs.put("r", " are ");
        slangs.put("n", " and ");
        slangs.put("idk", " I don't know ");
        slangs.put("u", " you ");
        Set set = slangs.entrySet();
        Iterator i = set.iterator();

        String sentence = "lol how are you";
        StringBuilder sb = new StringBuilder();

        for (String word : sentence.split(" ")) {
            while(i.hasNext()) {
                Map.Entry<String, String> me = (Map.Entry)i.next();
                if (word.equalsIgnoreCase(me.getKey())) {
                    sb.append(me.getValue());
                    continue;
                }
                sb.append(word);
            }
        }
        System.out.println(sb.toString());
    }
}

The Output is:

lollollollaugh out loudlol

What is wrong here and how do I solve it?

Idk why you make it quite complicated. In my first thought, just split sentence into words, then check if `slangs` contain it or not. — Manh Le, Jan 16 '17 at 10:32
Try to print all results, I am not sure why are you using that while(i.hasNext()) in the first place, your sentence won't work, because you are getting word by word, that means for example in the case of lol, in your sentence you have the slang and not the phrase itself. I believe that you should check some solutions on internet and get some base upon that. — Galeixo, Jan 16 '17 at 10:36

Nicolas Filotto · Accepted Answer · 2017-01-16T11:02:47.040

You are not supposed to iterate over the entries to find a match, you are supposed to use get(Object key) or getOrDefault(Object key, V defaultValue) to get the full form of a given abbreviation, otherwise instead of getting your full form with a time complexity of O(1), you will get it with a O(n) which is of course not good in term of performances, you would lose the real benefit of having your key/value pairs in a Map. If you did it because of the case, simply put your keys only in lower case in your map and call get or getOrDefault with the word in lower case as below:

So your loop should be something like:

for (String word : sentence.split(" ")) {
    // Get the full form of the value of word in lower case otherwise use
    // the word itself
    sb.append(slangs.getOrDefault(word.toLowerCase(), String.format(" %s", word)));
}

Output:

laugh out loud how are you

Using the Stream API, it could simply be:

String result = Pattern.compile(" ")
    .splitAsStream(sentence)
    .map(word -> slangs.getOrDefault(word.toLowerCase(), word))
    .collect(Collectors.joining(" "));

tobias_k · Answer 2 · 2017-01-16T10:52:30.230

3

Don't loop over the keys in the dictionary. Instead, just check whether the key is in the map and get the corresponding value. Also, don't forget to add the spaces back into the combined sentence.

for (String word : sentence.split(" ")) {
    if (slangs.containsKey(word.toLowerCase())) {
        sb.append(slangs.get(word.toLowerCase()));
    } else {
        sb.append(word);
    }
    sb.append(" ");
}

If you are using Java 8, you can also use String.join, Map.getOrDefault and Streams:

String s = String.join(" ", Stream.of(sentence.split(" "))
        .map(word -> slangs.getOrDefault(word.toLowerCase(), word))
        .toArray(n -> new String[n]));

This latter approach also has the benefit of not adding a space before the first or after the last word in the sentence.

edited Jan 16 '17 at 10:52

answered Jan 16 '17 at 10:46

tobias_k

81,265
12
120
179

You should be using `word.toLowerCase()` for lookup. Also, you can use `String[]::new` instead of `n -> new String[n]`. – shmosel Jan 16 '17 at 10:48
@shmosel Agreed in both points. Strangely, `String[]::new` gave me weird compilation problems in Eclipse, though, so can't test and only changed the first one. – tobias_k Jan 16 '17 at 10:53

Manh Le · Answer 3 · 2017-01-16T15:30:59.330

0

Simply, I think you just need to check if slangs contain this keyword or not. Please check my code.

 public class Test {
    public static void main(String[] args) {

      Map<String, String> slangs = new HashMap<String, String>();
      slangs.put("lol", "laugh out loud");
      slangs.put("r", " are ");
      slangs.put("n", " and ");
      slangs.put("idk", " I don't know ");
      slangs.put("u", " you ");

      String sentence = "lol how are you";
      String[] words = sentence.split(" ");

      for (String word : words) {
        String normalizeWord = word.trim().toLowerCase();
        if(slangs.containsKey(normalizeWord)) {
            sentence = sentence.replace(word, slangs.get(normalizeWord));
        }
    }
    System.out.println(sentence);
  }
}

edited Jan 16 '17 at 15:30

answered Jan 16 '17 at 10:42

Manh Le

1,630
16
26

1

`containsKey()` is redundant if you're going to call `get()`. And `replaceAll()` may fail on regex special characters. And it seems a little backwards to iterate over the words instead of the entries if you're going to use `replaceAll()`. – shmosel Jan 16 '17 at 10:45
@shmosel - I actually find the use of containsKey + get slightly easier to read than a straight get-to-var, test var for null, and then use. Yes, slightly less efficient, but should not be noticeable. I agree on iteration order; and that is a *much* more noticeable inefficiency – tucuxi Jan 16 '17 at 10:54
@shmosel thank you for your comment. Agree that because run loop through sentence, we should use `replace`. Will update my answer. I think if don't use `containsKey()`, we need to use some temp variable like StringBuilder to get correct answer. – Manh Le Jan 16 '17 at 15:30

Replacing abbreviations/slangs with their fullforms

3 Answers3