-1

I have a String in Java that holds the content of an XML file (which I generate in another process). I have an encoding problem: the XML header declares UTF-8, but when I try to parse the file I get an encoding-related error, specifically:

Invalid byte 2 of 4-byte UTF-8 sequence

So I opened the file in Notepad++, and it reports the encoding as ANSI. I was thinking of converting the String to UTF-8 before saving it to the file, which I did with:

byte[] encoded = content.getBytes(StandardCharsets.UTF_8);

But then how do I save it to the file? I want the user to be able to open the XML file in any text editor, but now I only have bytes. How do I save them?

Sredny M Casanova

2 Answers

0

The following should do:

// Ensure that the stated encoding in the XML is UTF-8:
//                              $1______________________ $2_____ $3_
content = content.replaceFirst("(<\\?xml[^>]+encoding=\")([^\"]*)(\")",
        "$1UTF-8$3");

byte[] encoded = content.getBytes(StandardCharsets.UTF_8);

Files.write(Paths.get("... .xml"), encoded);

To edit the file under Windows, one needs a UTF-8-capable editor such as JEdit or Notepad++.

Notepad++ should recognize the file's encoding; if it does not, you can reload the file with the right encoding.
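Put together as a small program, the steps above might look like the following. This is only a sketch: the class name Utf8XmlWriter, the helper names declareUtf8 and save, the sample XML, and the file name output.xml are made up for illustration. Note that the regex only handles an encoding attribute written with double quotes.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper class, not part of any library.
final class Utf8XmlWriter {

    // Rewrite the encoding stated in the XML declaration to UTF-8.
    // Only a double-quoted encoding="..." attribute is handled.
    static String declareUtf8(String xml) {
        return xml.replaceFirst("(<\\?xml[^>]+encoding=\")([^\"]*)(\")", "$1UTF-8$3");
    }

    // Encode the String as UTF-8 and write the raw bytes to the target file.
    static void save(Path target, String xml) throws IOException {
        byte[] encoded = declareUtf8(xml).getBytes(StandardCharsets.UTF_8);
        Files.write(target, encoded);
    }

    public static void main(String[] args) throws IOException {
        // Sample content with a non-ASCII character (assumed for the demo).
        String content = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?><name>Jos\u00e9</name>";
        Path target = Paths.get("output.xml"); // assumed file name
        save(target, content);

        // Read the file back as UTF-8 to check the round trip.
        String readBack = new String(Files.readAllBytes(target), StandardCharsets.UTF_8);
        System.out.println(readBack);
    }
}
```

The bytes written to disk are still ordinary text; any editor that reads the file as UTF-8 will show it correctly.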

Joop Eggen

0

Try Files.write(Paths.get("output.xml"), encoded);.
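If you would rather not handle the byte array yourself, an alternative (a different technique from the line above) is to let a writer do the encoding via Files.newBufferedWriter with an explicit charset. A minimal sketch, where the class name Utf8TextWriter, the method writeUtf8, and the sample content are assumptions for illustration:

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper class, not part of any library.
final class Utf8TextWriter {

    // Write the String to the file, encoding it as UTF-8 on the way out.
    static void writeUtf8(Path target, String content) throws IOException {
        try (BufferedWriter writer = Files.newBufferedWriter(target, StandardCharsets.UTF_8)) {
            writer.write(content);
        }
    }

    public static void main(String[] args) throws IOException {
        // Sample content (assumed for the demo).
        String content = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><name>Jos\u00e9</name>";
        writeUtf8(Paths.get("output.xml"), content);
    }
}
```

Either way, the point is to name the charset explicitly instead of relying on the platform default, which is the likely reason the file ended up as ANSI.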

F Lopez