
I am attempting to write a little class to escape characters in an XML document. I am using XPath to get the nodes of the XML document and passing each node to my class. However, it is not working. I want to change:

"I would like a burger & fries."

to

"I would like a burger &amp; fries."

Here is the code for my class:

import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

public class MyReplace{
    private static final HashMap<String,String> xmlCharactersToBeEscaped;
    private Iterator iterator;
    private String newNode;
    private String mapKey;
    private String mapValue;

    static {
        xmlCharactersToBeEscaped = new HashMap<String,String>();
        xmlCharactersToBeEscaped.put("\"","&quot;");
        xmlCharactersToBeEscaped.put("'","&apos;");
        xmlCharactersToBeEscaped.put("<","&lt;");
        xmlCharactersToBeEscaped.put(">","&gt;");
        xmlCharactersToBeEscaped.put("&","&amp;");
    }

    public String replaceSpecialChar(String node){
        if(node != null){
            newNode = node;
            iterator = xmlCharactersToBeEscaped.entrySet().iterator();
            while(iterator.hasNext()){
                Map.Entry mapEntry = (Map.Entry) iterator.next();
                mapKey = mapEntry.getKey().toString();
                mapValue = mapEntry.getValue().toString();

                if(newNode.contains(mapKey)){
                    newNode = newNode.replace(mapKey,mapValue);
                }
            }
            return newNode;
        } else {
            return node;
        }
    }
}

What is happening is that it is replacing the first special character in the map, the quote, and skipping everything else.

Brian
  • Geez, that's the most needlessly complicated way of making string replacements I've ever seen in my life. Surely you've heard of [`String.replace`](http://docs.oracle.com/javase/7/docs/api/java/lang/String.html#replace(char,%20char))? Or just use a built in XML library. They do all this for you anyway. http://stackoverflow.com/questions/439298/best-way-to-encode-text-data-for-xml-in-java – tnw May 05 '15 at 18:49
  • @tnw OP performs replacements on String instances, but, I agree, access is overcomplicated. – Alex Salauyou May 05 '15 at 18:51
  • That is what I am using. What I want to do is loop through a set of 5 different "strings" and replace them all. Plus, I would like this in its own class. – Brian May 05 '15 at 18:52
  • Can you show your input test file and output? – Sendi_t May 05 '15 at 18:53
  • The point they were making is that you don't need to check whether the string contains the thing to replace; `replace` simply returns the original string if it's not there. Also, the `toString()` in `mapKey = mapEntry.getKey().toString();` is redundant, since `toString()` on a string returns the string itself; `mapKey = mapEntry.getKey();` will work. – Captain Man May 05 '15 at 18:55
  • You shouldn't declare local variables (like `newNode` etc.) as a field. Keep them inside the method. – Bubletan May 05 '15 at 18:56

2 Answers


Your solution is overcomplicated.

Use StringEscapeUtils (part of the Commons Lang library). It has built-in methods to escape and unescape XML, HTML and much more. Commons Lang is very easy to import, and the following examples are from the latest stable release (3.4). Previous versions use different methods, so look up the Javadoc for your version. It's very flexible, so you can do a lot more with it than just simple escapes and unescapes.

String convertedString = StringEscapeUtils.escapeXml11(inputString);

If you're using XML 1.0, they also offer the following:

String convertedString10 = StringEscapeUtils.escapeXml10(inputString);

Get it here: https://commons.apache.org/proper/commons-lang/

Javadoc here (3.4): https://commons.apache.org/proper/commons-lang/javadocs/api-3.4/org/apache/commons/lang3/StringEscapeUtils.html

Daniel Tung
  • I didn't know about StringEscapeUtils ... very cool, thank you. – Brian May 05 '15 at 18:59
  • That's good! I also prefer Apache Commons for such stuff. Now my upvote is yours – Alex Salauyou May 05 '15 at 18:59
  • If you like this answer and it works out please mark it as a solution. Best of luck Brian. Common libraries from Apache are pretty darn useful! If you want to write the same yourself it's all open source and you can see how they did it here: https://commons.apache.org/proper/commons-lang/javadocs/api-3.4/src-html/org/apache/commons/lang3/StringEscapeUtils.html#line.41 – Daniel Tung May 05 '15 at 19:02

Make it simpler (and see comment below):

xmlCharactersToBeEscaped = new HashMap<String,String>();
xmlCharactersToBeEscaped.put("\"","&quot;");
xmlCharactersToBeEscaped.put("'","&apos;");
xmlCharactersToBeEscaped.put("<","&lt;");
xmlCharactersToBeEscaped.put(">","&gt;");
/* xmlCharactersToBeEscaped.put("&","&amp;"); <-- don't add this to the map */

//...
public String replaceSpecialChars(String node) {
    if (node != null) {
        // Escape '&' first so the '&' in the entities added below is not escaped again.
        String newNode = node.replace("&", "&amp;");
        for (Map.Entry<String, String> e : xmlCharactersToBeEscaped.entrySet()) {
            newNode = newNode.replace(e.getKey(), e.getValue());
        }
        return newNode;
    } else {
        return null;
    }
}

Or, better, use StringEscapeUtils for such stuff.
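
If adding a library is not an option, one way to avoid the ordering question altogether is to walk the string once and escape each character as it is seen, so nothing that has already been escaped is looked at again. This is only a minimal sketch: the class name `XmlEscaper` and the method `escape` are made up for illustration, and it handles just the five predefined XML entities (it does not filter characters that are illegal in XML).

```java
// Hypothetical helper, not part of any library.
final class XmlEscaper {
    private XmlEscaper() {}

    // Escapes the five predefined XML entities in a single pass.
    static String escape(String text) {
        if (text == null) {
            return null;
        }
        StringBuilder out = new StringBuilder(text.length() + 16);
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&apos;"); break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // prints: I would like a burger &amp; fries.
        System.out.println(escape("I would like a burger & fries."));
    }
}
```

Because every input character is visited exactly once, the `&` that starts an emitted entity is never escaped a second time, whatever order the cases are listed in.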

Alex Salauyou
  • Note that if `&` is not the first replacement, when it comes time to replace `&` all previous `&xxx` will have their `&` replaced. – copeg May 05 '15 at 19:02
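
The effect described in that comment is easy to reproduce with two chained `String.replace` calls. It matters for the original class because `HashMap` makes no guarantee about iteration order, so `&` may be replaced after the other characters. A small sketch; the class name `EscapeOrderDemo` and its method names are hypothetical:

```java
// Hypothetical demo class showing why '&' has to be escaped first.
final class EscapeOrderDemo {
    private EscapeOrderDemo() {}

    // Correct order: '&' is escaped before any entity is introduced.
    static String ampersandFirst(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;");
    }

    // Wrong order: the '&' of the freshly added "&lt;" gets escaped again.
    static String ampersandLast(String s) {
        return s.replace("<", "&lt;").replace("&", "&amp;");
    }

    public static void main(String[] args) {
        String input = "a < b & c";
        System.out.println(ampersandFirst(input)); // a &lt; b &amp; c
        System.out.println(ampersandLast(input));  // a &amp;lt; b &amp; c
    }
}
```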