I am using XStream to save a user object to a file.

private void store() {
    XStream xStream = new XStream(new DomDriver("UTF-8"));
    xStream.setMode(XStream.XPATH_ABSOLUTE_REFERENCES);

    xStream.alias("configuration", Configuration.class);
    xStream.alias("user", User.class);

    synchronized (ConfigurationDAOImpl.class) {
        try (OutputStream out = new FileOutputStream(filename.getFile())) {
            // try-with-resources closes the stream even if serialization fails
            xStream.toXML(configuration, out);
        } catch (IOException e) {
            throw new RuntimeException("Failed to write to " + filename, e);
        }
    }
}

When I try to read the file back with the following code, I get an exception: com.thoughtworks.xstream.io.StreamException: : An invalid XML character (Unicode: 0x1a) was found in the element content of the document.

private void lazyLoad() {
    synchronized (ConfigurationDAOImpl.class) {
        // Has the configuration been loaded
        if (configuration == null) {
            if (filename.exists()) {
                try (InputStream in = filename.getInputStream()) {
                    XStream xStream = new XStream(new DomDriver("UTF-8"));
                    xStream.setMode(XStream.XPATH_ABSOLUTE_REFERENCES);

                    xStream.alias("configuration", Configuration.class);
                    xStream.alias("user", User.class);

                    configuration = (Configuration) xStream.fromXML(in);

                    LOGGER.debug("Loaded configuration from {}.", filename);
                } catch (Exception e) {
                    LOGGER.error("Failed to load configuration.", e);
                }
            } else {
                LOGGER.debug("{} does not exist.", filename);
                LOGGER.debug("Creating blank configuration.");

                configuration = new Configuration();
                configuration.setUsers(new ArrayList<User>());

                // and store it
                store();
            }
        }
    }
}

Any ideas?

Noushin Khaki

3 Answers

0x1a is not a valid XML character. There is no way to represent it in an XML 1.0 document, not even as a character reference such as &#x1a;.

Quoted from http://en.wikipedia.org/wiki/XML#Valid_characters:

Unicode code points in the following ranges are valid in XML 1.0 documents:
  • U+0009, U+000A, U+000D: these are the only C0 controls accepted in XML 1.0;
  • U+0020–U+D7FF, U+E000–U+FFFD: this excludes some (not all) non-characters in the BMP (all surrogates, U+FFFE and U+FFFF are forbidden);
  • U+10000–U+10FFFF: this includes all code points in supplementary planes, including non-characters.
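
To find out which value actually contains the offending character before it is written, the ranges quoted above can be turned into a small check. This is only a sketch: the class name XmlCharScanner and its methods are hypothetical and not part of XStream or the JDK.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of XStream: locates characters that
// XML 1.0 does not allow.
class XmlCharScanner {

    /** True if the code point falls in one of the XML 1.0 valid ranges. */
    static boolean isValidXml10(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
                || (cp >= 0x20 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    /** Returns the char indexes at which an invalid code point starts. */
    static List<Integer> findInvalid(String s) {
        List<Integer> positions = new ArrayList<>();
        // Step by code point so that surrogate pairs are read as one unit.
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (!isValidXml10(cp)) {
                positions.add(i);
            }
            i += Character.charCount(cp);
        }
        return positions;
    }

    public static void main(String[] args) {
        String name = "John" + (char) 0x1a + "Doe";
        System.out.println(findInvalid(name)); // prints [4]
    }
}
```

Running such a check over each string field before calling toXML would show where the 0x1a comes from.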

jontro

I replaced 0x1a with a dash character ('-') using the following method:

/**
 * Replaces every 0x1A (SUB) character, which is not allowed in XML 1.0,
 * with a dash.
 * @param in the string that may contain the invalid character
 * @return the string with each 0x1A replaced by '-'; null or empty input
 *         is returned unchanged
 */
private String stripNonValidXMLCharacters(String in) {
    if (in == null || in.isEmpty()) return in; // nothing to replace
    StringBuilder out = new StringBuilder(in);
    for (int i = 0; i < out.length(); i++) {
        if (out.charAt(i) == 0x1a) {
            out.setCharAt(i, '-');
        }
    }
    return out.toString();
}
Noushin Khaki

As already pointed out, XML 1.0 accepts only a restricted set of characters (see the ranges quoted above).

Here is a helpful Java method that makes a string XML 1.0 conformant. It replaces every invalid character (all of them, not just 0x1a) with a given replacement.

public static String replaceInvalidXMLCharacters(String input, String replacement) {
    if (input == null || input.isEmpty()) {
        return "";
    }
    StringBuilder result = new StringBuilder(input.length());
    // Iterate by code point, not by char: a char never exceeds 0xFFFF, so the
    // supplementary range would be unreachable and every surrogate pair
    // would be replaced half by half.
    for (int i = 0; i < input.length(); ) {
        int codePoint = input.codePointAt(i);
        if (codePoint == 0x9 || codePoint == 0xA || codePoint == 0xD
                || (codePoint >= 0x20 && codePoint <= 0xD7FF)
                || (codePoint >= 0xE000 && codePoint <= 0xFFFD)
                || (codePoint >= 0x10000 && codePoint <= 0x10FFFF)) {
            result.appendCodePoint(codePoint);
        } else {
            result.append(replacement);
        }
        i += Character.charCount(codePoint);
    }
    return result.toString();
}
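
The same replacement could probably also be written with a regular expression, since java.util.regex matches by code point. The following is a sketch of that alternative, not the method above; the class name XmlSanitizer and the method sanitize are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical alternative to the loop-based method, using a negated
// character class that lists the XML 1.0 valid ranges.
class XmlSanitizer {

    // Matches any single code point outside the XML 1.0 ranges.
    private static final Pattern INVALID_XML10 = Pattern.compile(
            "[^\\x09\\x0A\\x0D\\x20-\\uD7FF\\uE000-\\uFFFD\\x{10000}-\\x{10FFFF}]");

    static String sanitize(String input, String replacement) {
        if (input == null) {
            return "";
        }
        // quoteReplacement keeps '$' and '\' in the replacement literal.
        return INVALID_XML10.matcher(input)
                .replaceAll(Matcher.quoteReplacement(replacement));
    }

    public static void main(String[] args) {
        String dirty = "a" + (char) 0x1a + "b";
        System.out.println(sanitize(dirty, "-")); // prints a-b
    }
}
```

Compiling the pattern once and reusing it avoids recompiling the expression on every call, which matters if many fields are sanitized before each save.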
Bakri Bitar