9

Given a string like so:

 Hello {FIRST_NAME}, this is a personalized message for you.

where FIRST_NAME is an arbitrary token (a key in a map passed to the method), I want to write a routine that turns that string into:

Hello Jim, this is a personalized message for you.

given a map with an entry FIRST_NAME -> Jim.

It would seem that StringTokenizer is the most straightforward approach, but the Javadocs say you should prefer the regex approach. How would you do that in a regex-based solution?

Yishai

10 Answers

11

Thanks everyone for the answers!

Gizmo's answer was definitely out of the box, and a great solution, but unfortunately not appropriate as the format can't be limited to what the Formatter class does in this case.

Adam Paynter really got to the heart of the matter, with the right pattern.

Peter Nix and Sean Bright had a great workaround to avoid all of the complexities of the regex, but I needed to raise some errors if there were bad tokens, which that didn't do.

But in terms of both doing a regex and a reasonable replace loop, this is the answer I came up with (with a little help from Google and the existing answer, including Sean Bright's comment about how to use group(1) vs group()):

private static Pattern tokenPattern = Pattern.compile("\\{([^}]*)\\}");

public static String process(String template, Map<String, Object> params) {
    StringBuffer sb = new StringBuffer();
    Matcher myMatcher = tokenPattern.matcher(template);
    while (myMatcher.find()) {
        String field = myMatcher.group(1);
        myMatcher.appendReplacement(sb, "");
        sb.append(doParameter(field, params));
    }
    myMatcher.appendTail(sb);
    return sb.toString();
}

Where doParameter gets the value out of the map and converts it to a string and throws an exception if it isn't there.

Note also I changed the pattern to find empty braces (i.e. {}), as that is an error condition explicitly checked for.

EDIT: Note that appendReplacement is not agnostic about the content of the replacement string. Per the Javadocs, it treats $ and backslash as special characters, so I added some escaping to the sample above to handle that. It was not done in the most performance-conscious way, but in my case it isn't a big enough deal to be worth micro-optimizing the string creations.
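To illustrate the special-character behaviour described above, here is a minimal sketch. The class and method names are made up for this example; only Matcher.appendReplacement and Matcher.quoteReplacement come from the JDK. It compares passing a value straight to appendReplacement with escaping it first.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original answer.
final class ReplacementEscapingDemo {

    private static final Pattern TOKEN = Pattern.compile("\\{([^}]*)\\}");

    // Replaces every token with the given value, escaping $ and \ first
    // so that the value is inserted literally.
    static String replaceQuoted(String template, String value) {
        StringBuffer sb = new StringBuffer();
        Matcher m = TOKEN.matcher(template);
        while (m.find()) {
            m.appendReplacement(sb, Matcher.quoteReplacement(value));
        }
        m.appendTail(sb);
        return sb.toString();
    }

    // Same loop without escaping: "$1" in the value is read as a
    // reference to capture group 1 (the key between the braces).
    static String replaceUnquoted(String template, String value) {
        StringBuffer sb = new StringBuffer();
        Matcher m = TOKEN.matcher(template);
        while (m.find()) {
            m.appendReplacement(sb, value);
        }
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(replaceQuoted("Pay {AMOUNT} now", "$1"));
        System.out.println(replaceUnquoted("Pay {AMOUNT} now", "$1"));
    }
}
```

With the value "$1", the quoted version inserts the two characters as they are, while the unquoted version substitutes the captured key instead, which is almost never what you want for user-supplied values.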

Thanks to the comment from Alan M, this can be made even simpler to avoid the special character issues of appendReplacement.
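For reference, a self-contained sketch of the approach in this answer might look like the following. The doParameter body is an assumption on my part: the post only says that it looks up the value, converts it to a string, and throws if the key is missing, so the exception type and messages here are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical wrapper class for the process method shown in the answer.
final class TemplateProcessor {

    // Matches {KEY}; the key may be empty so that {} can be reported as an error.
    private static final Pattern TOKEN_PATTERN = Pattern.compile("\\{([^}]*)\\}");

    static String process(String template, Map<String, Object> params) {
        StringBuffer sb = new StringBuffer();
        Matcher matcher = TOKEN_PATTERN.matcher(template);
        while (matcher.find()) {
            String field = matcher.group(1);
            // Copy the text before the token and drop the token itself...
            matcher.appendReplacement(sb, "");
            // ...then append the value directly, so $ and \ need no escaping.
            sb.append(doParameter(field, params));
        }
        matcher.appendTail(sb);
        return sb.toString();
    }

    // Assumed implementation: the original post describes the behaviour only.
    static String doParameter(String field, Map<String, Object> params) {
        if (field.isEmpty()) {
            throw new IllegalArgumentException("Empty token {} in template");
        }
        if (!params.containsKey(field)) {
            throw new IllegalArgumentException("Unknown token: {" + field + "}");
        }
        return String.valueOf(params.get(field));
    }

    public static void main(String[] args) {
        Map<String, Object> params = new HashMap<>();
        params.put("FIRST_NAME", "Jim");
        System.out.println(
            process("Hello {FIRST_NAME}, this is a personalized message for you.", params));
    }
}
```

Because the value is appended to the buffer directly rather than passed to appendReplacement, values containing $ or a backslash come through unchanged.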

Yishai
8

Well, I would rather use String.format(), or better MessageFormat.
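As a rough illustration of this suggestion, both calls below produce the greeting from the question. The class and method names are invented for the example. Note that both APIs use positional arguments rather than named tokens such as {FIRST_NAME}, which is the limitation the asker mentions.

```java
import java.text.MessageFormat;

// Hypothetical demo class showing the two JDK formatting APIs.
final class FormatDemo {

    // String.format uses printf-style placeholders such as %s.
    static String withStringFormat(String name) {
        return String.format("Hello %s, this is a personalized message for you.", name);
    }

    // MessageFormat uses indexed placeholders such as {0}.
    static String withMessageFormat(String name) {
        return MessageFormat.format("Hello {0}, this is a personalized message for you.", name);
    }

    public static void main(String[] args) {
        System.out.println(withStringFormat("Jim"));
        System.out.println(withMessageFormat("Jim"));
    }
}
```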

gizmo
6
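// replaceAll is an instance method and takes a regex, so the braces must be escaped
String result = template.replaceAll("\\{FIRST_NAME\\}", actualName);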

Check out the javadocs for it here.

jjnguy
  • The performance of that will be o(n*k), where n is the size of the input string, and k the number of keys. – Daniel C. Sobral Jul 16 '09 at 17:11
  • @Daniel Did you read the source code to come to that conclusion? Java does some pretty intelligent things with strings. I'd expect that there is a very good chance it will outperform any other solution you could come up with. – Bill K Jul 16 '09 at 21:32
  • @BillK I think he might have meant that you'd have to call `replaceAll` repeatedly if you have more than one key to replace in the string, hence `*k`. – Svish Dec 10 '13 at 10:57
  • I guess I'm saying K is probably not as big as you'd think. I've yet to see a case where this performance makes a difference--I've worked on embedded Java systems with almost nothing and even tried optimizing ALL string manipulations out because the Java lore is that it makes a huge difference--it doesn't. The kind of string optimizations you'd have to make to actually improve over the JVM are a lot more in-depth. Besides, readability before performance: this says EXACTLY what you want done, and you'd have to demonstrate that it fixes a noticeable problem before replacing something that readable. – Bill K Dec 12 '13 at 01:05
4

Try this:

Note: The author's final solution builds upon this sample and is much more concise.

public class TokenReplacer {

    private Pattern tokenPattern;

    public TokenReplacer() {
        tokenPattern = Pattern.compile("\\{([^}]+)\\}");
    }

    public String replaceTokens(String text, Map<String, String> valuesByKey) {
        StringBuilder output = new StringBuilder();
        Matcher tokenMatcher = tokenPattern.matcher(text);

        int cursor = 0;
        while (tokenMatcher.find()) {
            // A token is defined as a sequence of the format "{...}".
            // A key is defined as the content between the brackets.
            int tokenStart = tokenMatcher.start();
            int tokenEnd = tokenMatcher.end();
            int keyStart = tokenMatcher.start(1);
            int keyEnd = tokenMatcher.end(1);

            output.append(text.substring(cursor, tokenStart));

            String token = text.substring(tokenStart, tokenEnd);
            String key = text.substring(keyStart, keyEnd);

            if (valuesByKey.containsKey(key)) {
                String value = valuesByKey.get(key);
                output.append(value);
            } else {
                output.append(token);
            }

            cursor = tokenEnd;
        }
        output.append(text.substring(cursor));

        return output.toString();
    }

}
Adam Paynter
  • That will recompile the pattern for each line. I prefer my patterns as pre-compiled as possible! :-) Also, you'd better check for the existence of the token. – Daniel C. Sobral Jul 16 '09 at 17:08
  • I mean, check that the token exists in the map. – Daniel C. Sobral Jul 16 '09 at 17:11
  • You can just make the `tokenPattern` an instance variable of whatever class will contain this method to avoid compiling it each time. The code will automatically accommodate the situation whereby no token is detected (`output.append(text.substring(cursor))`). – Adam Paynter Jul 16 '09 at 17:11
  • The latest change accommodates checking for the existence of the key. – Adam Paynter Jul 16 '09 at 17:14
  • This is a fine answer and deserves the rep bump of being the accepted answer. See my answer for a cleaner way to do the appending once you found the token. – Yishai Jul 16 '09 at 18:11
  • @Yishai I would really recommend accepting your own answer in two days. Although it is nice of you to recognize his answer as being good, it may be more helpful to future readers of your question to see the actual way that you solved your problem. – jjnguy Jul 16 '09 at 22:02
3

With import java.util.regex.*:

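Pattern p = Pattern.compile("\\{([^{}]*)\\}");  // literal braces must be escaped
Matcher m = p.matcher(line);  // line being "Hello, {FIRST_NAME}..."
StringBuffer result = new StringBuffer();
while (m.find()) {
  String key = m.group(1);
  // Unknown keys are left as they are in the output
  String value = map.containsKey(key) ? map.get(key) : m.group();
  // quoteReplacement keeps $ and \ in the value from being interpreted
  m.appendReplacement(result, Matcher.quoteReplacement(value));
}
m.appendTail(result);  // result.toString() holds the substituted line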

So, the regex is recommended because it can easily identify the places that require substitution in the string, as well as extracting the name of the key for substitution. It's much more efficient than breaking the whole string.

You'll probably want to loop with the Matcher line inside and the Pattern line outside, so you can replace all lines. The pattern never needs to be recompiled, and it's more efficient to avoid doing so unnecessarily.
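The loop structure described above could be sketched as follows: the Pattern is compiled once as a constant, and a fresh Matcher is created for each line. The class and method names are assumptions for the example, as is the choice to leave unknown tokens untouched.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original answer.
final class LineReplacer {

    // Compiled once and shared by every call.
    private static final Pattern TOKEN = Pattern.compile("\\{([^{}]*)\\}");

    static List<String> replaceAllLines(List<String> lines, Map<String, String> map) {
        List<String> out = new ArrayList<>();
        for (String line : lines) {
            // Only the Matcher is created per line.
            Matcher m = TOKEN.matcher(line);
            StringBuffer sb = new StringBuffer();
            while (m.find()) {
                String key = m.group(1);
                // Unknown keys keep their original {KEY} text.
                String value = map.containsKey(key) ? map.get(key) : m.group();
                m.appendReplacement(sb, Matcher.quoteReplacement(value));
            }
            m.appendTail(sb);
            out.add(sb.toString());
        }
        return out;
    }
}
```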

Daniel C. Sobral
2

The most straightforward approach would seem to be something along the lines of this:

public static void main(String[] args) {
    String tokenString = "Hello {FIRST_NAME}, this is a personalized message for you.";
    Map<String, String> tokenMap = new HashMap<String, String>();
    tokenMap.put("{FIRST_NAME}", "Jim");
    String transformedString = tokenString;
    for (String token : tokenMap.keySet()) {
        transformedString = transformedString.replace(token, tokenMap.get(token));
    }
    System.out.println("New String: " + transformedString);
}

It loops through all your tokens and replaces every token with what you need, and uses the standard String method for replacement, thus skipping the whole RegEx frustrations.

Peter
  • 2
    That would mean reading the whole string for each token. If you have k tokens and n bytes to process, then the algorithm will have order o(n*k). Very inefficient. – Daniel C. Sobral Jul 16 '09 at 17:05
  • 1
    Theoretically, it is o(n*k) as stated, but your statement feels like a premature optimization to me. Without knowing more about how often this algorithm is called, how many tokens are present in the string, how long the string is, and how critical time saving is, it's impossible to say how big an impact the inefficiency has. If this is only called once with a total run time of 10 ms even though it could run in 1 ms (for example), certainly it's an order of magnitude slower than it could be, but is the performance penalty really that substantial in the grand scheme of things? – Peter Jul 16 '09 at 17:39
2

Depending on how ridiculously complex your string is, you could try using a more serious string templating language, like Velocity. In Velocity's case, you'd do something like this:

Velocity.init();
VelocityContext context = new VelocityContext();
context.put( "name", "Bob" );
StringWriter output = new StringWriter();
Velocity.evaluate( context, output, "", 
      "Hello, $name, this is a personalized message for you.");
System.out.println(output.toString());

But that is likely overkill if you only want to replace one or two values.

Brandon Yarbrough
1
import java.util.HashMap;

public class ReplaceTest {

  public static void main(String[] args) {
    HashMap<String, String> map = new HashMap<String, String>();

    map.put("FIRST_NAME", "Jim");
    map.put("LAST_NAME",  "Johnson");
    map.put("PHONE",      "410-555-1212");

    String s = "Hello {FIRST_NAME} {LAST_NAME}, this is a personalized message for you.";

    for (String key : map.keySet()) {
      s = s.replaceAll("\\{" + key + "\\}", map.get(key));
    }

    System.out.println(s);
  }

}
Sean Bright
0

The docs mean that you should prefer writing a regex-based tokenizer, IIRC. What might work better for you is a standard regex search-replace.
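One way to read "a standard regex search-replace" is the sketch below. It assumes Java 9 or later, where Matcher.replaceAll accepts a function from each match to its replacement; that overload did not exist when this question was asked. The class name and the choice to throw on unknown keys are assumptions for illustration.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; requires Java 9+ for replaceAll(Function).
final class FunctionalReplacer {

    private static final Pattern TOKEN = Pattern.compile("\\{([^}]+)\\}");

    static String replaceTokens(String template, Map<String, String> values) {
        return TOKEN.matcher(template).replaceAll(match -> {
            String value = values.get(match.group(1));
            if (value == null) {
                throw new IllegalArgumentException("Unknown token: " + match.group());
            }
            // The returned string is still scanned for $ and \, so quote it.
            return Matcher.quoteReplacement(value);
        });
    }
}
```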

Draemon
0

Generally we'd use MessageFormat in a case like this, coupled with loading the actual message text from a ResourceBundle. This gives you the added benefit of being G11N (globalization) friendly.
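A minimal sketch of that combination follows. To keep it self-contained, the bundle is an in-memory ListResourceBundle; in practice the text would more likely live in a .properties file loaded with ResourceBundle.getBundle. The class names and the "greeting" key are made up for the example.

```java
import java.text.MessageFormat;
import java.util.ListResourceBundle;
import java.util.ResourceBundle;

// Hypothetical demo class combining ResourceBundle and MessageFormat.
final class BundleDemo {

    // Stand-in for a per-locale properties file.
    static final class Messages extends ListResourceBundle {
        @Override
        protected Object[][] getContents() {
            return new Object[][] {
                {"greeting", "Hello {0}, this is a personalized message for you."}
            };
        }
    }

    // Looks up the pattern by key, then fills in the positional argument.
    static String greet(ResourceBundle bundle, String firstName) {
        return MessageFormat.format(bundle.getString("greeting"), firstName);
    }

    public static void main(String[] args) {
        System.out.println(greet(new Messages(), "Jim"));
    }
}
```

Swapping in a bundle for another locale changes the message text without touching the calling code.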