My code uses XMLwriter and XMLObjectOutputStream to process a string.

The problem is that the string may contain the null character (\0). If I call string.replace("\0", ""), no error is reported. If I do not call string.replace("\0", ""), it reports "character zero is not allowed in output".

However, I need to keep the original string so that I can later write it to a txt file. In this case, how can I handle the string so that it keeps the \0 and still works with XMLwriter?

The string looks like:

 NULNULSTX &The story of Florida 

(Notepad++ shows each null character as NUL; a plain text editor shows a space in its place.)

Here is the error info:

ERROR (SelectorManager.run):  java.lang.IllegalStateException: character zero is not allowed in output
at org.xmlpull.mxp1_serializer.MXSerializer.writeAttributeValue(MXSerializer.java:849)
at org.xmlpull.mxp1_serializer.MXSerializer.attribute(MXSerializer.java:624)

Also, if the null character in the string cannot be escaped, is there another representation (e.g. a char array) that would let me output the original data?

Andrew
  • Do you have the same problem with `<` and `>` in the text? Then your problem is escaping. You must escape strings to be compatible with xml. – CoronA Jan 09 '19 at 03:58
  • I do not have the problem with < or >, only with the null character, which confuses me. It seems I need to find a way to keep the null character in the string, or use another format to store the content? – Andrew Jan 09 '19 at 04:25
  • Maybe this explains why: https://stackoverflow.com/questions/730133/invalid-characters-in-xml. Have you tried putting it into a CDATA section? Please post a simple code example that fails. It is easier to check the validity of answers if one can test them directly on the code you use. – CoronA Jan 09 '19 at 05:21
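
To illustrate the point behind the linked question: XML 1.0 only allows the code points listed in its "Char" production, and U+0000 is not among them. The following is a minimal sketch of a check for that range; the class and method names (`XmlCharCheck`, `isXmlChar`, `firstIllegalIndex`) are made up for this example and are not part of XMLwriter or MXSerializer.

```java
final class XmlCharCheck {
    // XML 1.0 "Char" production:
    // #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
    static boolean isXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
            || (cp >= 0x20 && cp <= 0xD7FF)
            || (cp >= 0xE000 && cp <= 0xFFFD)
            || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    // Returns the index of the first code point XML 1.0 forbids, or -1 if there is none.
    static int firstIllegalIndex(String s) {
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (!isXmlChar(cp)) {
                return i;
            }
            i += Character.charCount(cp);
        }
        return -1;
    }
}
```

A string that starts with NUL NUL STX, like the one in the question, fails this check at index 0, so it has to be encoded in some reversible way before it reaches the serializer.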

2 Answers

Rather than replacing the "\0" with an empty string, try escaping it with an escape character:

string.replace("\0", "\\0")
Josh Lambert
  • After writing the string to a txt file, it shows \0 in the string, which differs from the original (null). The key is that the string needs to stay the same as before... – Andrew Jan 09 '19 at 03:31
  • As @CoronA commented, it would be easier to help with an example of what string is failing and what you're expecting the output to be for that string. – Josh Lambert Jan 09 '19 at 15:29
  • I edited the original text, please see the details. Thanks – Andrew Jan 09 '19 at 16:14
  • If you just need it to be human readable why not replace the '\0' with 'NUL'? You lose the actual character but the text will be the same across any editor. – Josh Lambert Jan 09 '19 at 23:13
  • The key point is that the string needs to be kept exactly as before, because it will be used for further operations afterwards... so I cannot just replace \0 with NUL... – Andrew Jan 11 '19 at 18:11

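I found a solution: we can use StringEscapeUtils from Apache Commons Lang (org.apache.commons.lang3).

StringEscapeUtils.escapeJava(javaString)
StringEscapeUtils.unescapeJava(StringEscapeUtils.escapeJava(javaString))

escapeJava replaces the null character with a printable escape sequence, so the escaped string can be written without violating the XML requirements, and unescapeJava restores the original string afterwards.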
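
For readers who cannot add the Commons Lang dependency, the same escape-then-unescape round trip can be sketched with the standard library only. This is an illustrative stand-in, not the StringEscapeUtils implementation: the class name `ControlCharEscaper` and its methods are hypothetical, and it only handles backslashes and control characters below U+0020, which is an assumption about what the data contains.

```java
final class ControlCharEscaper {
    // Replaces each backslash with two backslashes and each control character
    // below U+0020 with a backslash, the letter u, and four hex digits.
    static String escape(String s) {
        StringBuilder out = new StringBuilder(s.length() + 16);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '\\') {
                out.append("\\\\");
            } else if (c < 0x20) {
                out.append(String.format("\\u%04X", (int) c));
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    // Reverses escape(). Input is assumed to have been produced by escape().
    static String unescape(String s) {
        StringBuilder out = new StringBuilder(s.length());
        int i = 0;
        while (i < s.length()) {
            char c = s.charAt(i);
            if (c == '\\' && i + 1 < s.length() && s.charAt(i + 1) == '\\') {
                out.append('\\');
                i += 2;
            } else if (c == '\\' && i + 5 < s.length() && s.charAt(i + 1) == 'u') {
                // Four hex digits follow the letter u.
                out.append((char) Integer.parseInt(s.substring(i + 2, i + 6), 16));
                i += 6;
            } else {
                out.append(c);
                i++;
            }
        }
        return out.toString();
    }
}
```

The idea would be to pass `ControlCharEscaper.escape(original)` to the XML writer and keep `original` itself (or `unescape` of the value read back) for the txt output, so the null characters are never lost. Because literal backslashes are doubled first, a backslash that was already in the data cannot be mistaken for the start of an escape.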

Andrew