0

I have the following program that I would like to replace all occurrences of a string where a word exists as a key in a map with its corresponding value.

I have implemented 4 methods. Each of them perform roughly the same function but in a different way. The output for the first 3 are incorrect as the next replacement overrides the result of the previous. The fourth works, but only because I am replacing single characters in the whole string. This is very inefficient anyways, because I am only checking a substring of the entire string.

Is there a way to safely replace all occurrences without overwriting the previous replacements?

I noticed that Apache has a StringUtils.replaceEach() method, but I would prefer to use a map.

Output:

Apple BApplenApplenApple CApplentApplelope DApplete Apple BApplenApplenApple CApplentApplelope DApplete
Apple BApplenApplenApple CApplentApplelope DApplete Apple BApplenApplenApple CApplentApplelope DApplete
Apple BApplenApplenApple CApplentApplelope DApplete Apple BApplenApplenApple CApplentApplelope DApplete
Apple Banana Cantalope Date Apple Banana Cantalope Date

ReplaceMap.java

import java.util.Collection;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ReplaceMap {
    private static Map<String, String> replacements;

    static {
        replacements = new HashMap<String, String>();
        replacements.put("a", "Apple");
        replacements.put("b", "Banana");
        replacements.put("c", "Cantalope");
        replacements.put("d", "Date");
    }

    public ReplaceMap() {
        String phrase = "a b c d a b c d";

        System.out.println(mapReplaceAll1(phrase, replacements));
        System.out.println(mapReplaceAll2(phrase, replacements));
        System.out.println(mapReplaceAll3(phrase, replacements));
        System.out.println(mapReplaceAll4(phrase, replacements));
    }

    public String mapReplaceAll1(String str, Map<String, String> replacements) {
        for (Map.Entry<String, String> entry : replacements.entrySet()) {
            str = str.replaceAll(entry.getKey(), entry.getValue());
        }

        return str;
    }

    public String mapReplaceAll2(String str, Map<String, String> replacements) {
        for (String key : replacements.keySet()) {
            str = str.replaceAll(Pattern.quote(key),
                    Matcher.quoteReplacement(replacements.get(key)));
        }

        return str;
    }

    public String mapReplaceAll3(String str, Map<String, String> replacements) {        
        String regex = new StringBuilder("(")
            .append(join(replacements.keySet(), "|")).append(")").toString();
        Matcher matcher = Pattern.compile(regex).matcher(str);

        while (matcher.find()) {
            str = str.replaceAll(Pattern.quote(matcher.group(1)),
                    Matcher.quoteReplacement(replacements.get(matcher.group(1))));
        }

        return str;
    }

    public String mapReplaceAll4(String str, Map<String, String> replacements) {        
        StringBuilder buffer = new StringBuilder();
        String regex = new StringBuilder("(")
            .append(join(replacements.keySet(), "|")).append(")").toString();
        Pattern pattern = Pattern.compile(regex);

        for (int i = 0, j = 1; i < str.length(); i++, j++) {
            String s = str.substring(i, j);
            Matcher matcher = pattern.matcher(s);


            if (matcher.find()) {
                buffer.append(s.replaceAll(Pattern.quote(matcher.group(1)),
                            Matcher.quoteReplacement(replacements.get(matcher.group(1)))));
            } else {
                buffer.append(s);
            }
        }


        return buffer.toString();
    }

    public static String join(Collection<String> s, String delimiter) {
        StringBuilder buffer = new StringBuilder();
        Iterator<String> iter = s.iterator();
        while (iter.hasNext()) {
            buffer.append(iter.next());
            if (iter.hasNext()) {
                buffer.append(delimiter);
            }
        }
        return buffer.toString();
    }

    public static void main(String[] args) {
        new ReplaceMap();
    }
}
nhahtdh
  • 55,989
  • 15
  • 126
  • 162
Mr. Polywhirl
  • 42,981
  • 12
  • 84
  • 132
  • I've got some really usefull answers here : http://stackoverflow.com/questions/26735276/alternative-to-successive-string-replace – TheByeByeMan Jan 14 '15 at 11:42
  • Does this answer your question? [Java Replacing multiple different substring in a string at once (or in the most efficient way)](https://stackoverflow.com/questions/1326682/java-replacing-multiple-different-substring-in-a-string-at-once-or-in-the-most) – Dave Jarvis Jan 06 '21 at 04:31

3 Answers3

1

My approach would be the following. There are probably faster solutions, but you can take it a step further if you like the idea.

public String mapReplaceAll5(String str, Map<String, String> replacements) {
    Map<String, String> origToMarker = new HashMap<String, String>();
    Map<String, String> markerToRepl = new HashMap<String, String>();
    char c = 32000;
    for(Entry<String, String> e : replacements.entrySet()) {
        origToMarker.put(e.getKey(), String.valueOf(c));
        markerToRepl.put(String.valueOf(c--), e.getValue());
    }
    for (Map.Entry<String, String> entry : origToMarker.entrySet()) {
        str = str.replaceAll(entry.getKey(), entry.getValue());
    }
    for (Map.Entry<String, String> entry : markerToRepl.entrySet()) {
        str = str.replaceAll(entry.getKey(), entry.getValue());
    }

    return str;
}
Laszlo B
  • 455
  • 3
  • 14
1

I'd do it thus:

replace(str, map)
    if we have the empty string, the result is the empty string.
    if the string starts with one of the keys from the map:
        the result is the replacement associated with that key + replace(str', map)
             where str' is the substring of str after the key
    otherwise the result is the first character of str + replace(str', map)
             where str' is the substring of str without the first character

Note that, though formulated recursively, it can (and should, due to Javas notorious small stack space) be implemented as a loop and write the first part of the result (i.e. the replacement string or the first character) to a stringbuilder.

If you have a key in the map that is a prefix of some other key (i.e. "key", "keys") you may need to try the keys in decreasing length.

Note further, that a faster algorithm could be devised that uses Tries instead of HasMaps. This would also be a remedy for the ambiguous key problem.

Here is an outline (not tested):

public static String replace(String it, Map<String, String> map) {
    StringBuilder sb = new StringBuilder();
    List<String> keys = map.keySet();      // TODO: sort by decreasing length!!
    next: while (it.length() > 0) {
        for (String k : keys) {
            if (it.startsWith(k)) {
                // we have a match!
                sb.append(map.get(k));
                it = it.substring(k.length(), it.length());
                continue next;
            }
        }
        // no match, advance one character
        sb.append(it.charAt(0));
        it = it.substring(1, it.length());
    }
    return sb.toString();
}
Ingo
  • 36,037
  • 5
  • 53
  • 100
  • So I guess, that I would get the index of the match and set the substring equal to the `foundIndex + matchLength` to the end of the string? – Mr. Polywhirl Dec 03 '13 at 10:15
  • @Mr.Polywhirl If you have "key" assiciated with "clef" and the string is "keys of the house", then the result would be "clef" + replace("s of the house", map). The `foundIndex` is always 0 (use `String#startsWith`) – Ingo Dec 03 '13 at 10:19
0

You can use StringUtils.replaceEach with your map, at the expense of copying the data into a pair of arrays.

public String replaceEach(String s, Map<String, String> replacements)
{
    int size = replacements.size();
    String[] keys = replacements.keySet().toArray(new String[size]);
    String[] values = replacements.values().toArray(new String[size]);
    return StringUtils.replaceEach(s, keys, values);
}

Recommend using LinkedHashMap, so that the iteration order is well-defined, but I suspect this will work just fine with a HashMap.

MikeFHay
  • 8,562
  • 4
  • 31
  • 52