
I am parsing an XML document that contains a Base64-encoded image. I would like to extract the image and parse the remainder of the XML. The code I have written to extract the image is below:

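import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import org.apache.commons.codec.binary.Base64;
import org.apache.commons.io.IOUtils; // assumed: closeQuietly comes from Commons IO

private void saveFormImage(String imageText) throws IOException {
    FileOutputStream fos = null;
    try {
        // decodeBase64(String) avoids the platform-default charset that getBytes() would use
        byte[] decoded = Base64.decodeBase64(imageText);
        File file = new File(<file loc>);
        fos = new FileOutputStream(file);
        fos.write(decoded);
    } finally {
        // Close the stream even if decoding or writing fails
        IOUtils.closeQuietly(fos);
    }
}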

I use JDOM to parse the XML, obtain the imageText as a String, and pass that string to this method. I then use the Apache Commons Codec library to decode the Base64 data and store it in a file.

Is this the best way to do this? It is not particularly fast: it finishes in about 2 s. Is there a faster and more memory-efficient way of doing this?

As noted in a comment below: is there a way to pipe the data from the XML directly to an OutputStream, decoding only one buffer at a time in memory? Would that be more memory efficient? Or does this not matter when the XML is at most 2.5 MB?

sethu
  • Are you asking if the code is as efficient as can be? If the Apache codec library is the best option to Base64 Decode or if the entire solution of serializing an image in Base64 encoding within an XML is a good solution? Which of these are you able to actually change? – RonK Mar 11 '12 at 12:17
  • The Apache codec library. I don't have an option regarding the latter. The XML already contains the image and I just need to get it. What I was really wondering is whether there is a way to pipe the data from the XML to an OutputStream without having to hold the entire data as a String on the heap. The XML size is about 1.5 MB. Is this a size I should be worrying about? – sethu Mar 11 '12 at 12:42
  • http://stackoverflow.com/a/1687218/14419 – Mads Hansen Mar 11 '12 at 12:53
  • I don't think XSLT is the solution here. I need data from other elements as well. – sethu Mar 11 '12 at 13:55
  • What was the solution to the problem? Did you end up finding an answer? Which import did you use for the Base64 class? – Whitecat Jan 31 '13 at 00:44
  • Hey Whitecat, I went with the code pasted above. It turns out the spike in memory is momentary: since everything is held in local variables, it gets garbage collected swiftly. The import is org.apache.commons.codec.binary.Base64 – sethu Feb 01 '13 at 07:48

1 Answer


What about the rest of the XML document? Do you want to discard it?

If yes, then have a look at StAX (Streaming API for XML):

http://docs.oracle.com/javase/7/docs/api/javax/xml/stream/package-summary.html

It has been part of Java SE since version 6.
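To illustrate the streaming idea, here is a minimal sketch rather than a definitive implementation. It assumes the image sits as plain text inside an element called `image` (a hypothetical name, the question does not give one), and it swaps in `java.util.Base64` from Java 8+ instead of Apache Commons Codec so that it needs no extra libraries. The parser may hand the text over in several chunks, whose size is implementation dependent, so the code decodes only complete groups of four Base64 characters and carries the remainder over to the next chunk.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.Reader;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class StaxImageExtractor {

    /**
     * Decodes the Base64 text of the first element named imageElement and
     * writes the bytes to out. Assumes the element contains only text.
     */
    static void extractImage(Reader xml, String imageElement, OutputStream out)
            throws XMLStreamException, IOException {
        XMLStreamReader reader = XMLInputFactory.newInstance().createXMLStreamReader(xml);
        Base64.Decoder decoder = Base64.getDecoder();
        StringBuilder pending = new StringBuilder(); // chars not yet decoded
        boolean inImage = false;
        try {
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT
                        && reader.getLocalName().equals(imageElement)) {
                    inImage = true;
                } else if (inImage && (event == XMLStreamConstants.CHARACTERS
                        || event == XMLStreamConstants.CDATA)) {
                    // Copy this chunk, dropping line breaks and other whitespace.
                    char[] buf = reader.getTextCharacters();
                    int start = reader.getTextStart();
                    int end = start + reader.getTextLength();
                    for (int i = start; i < end; i++) {
                        if (!Character.isWhitespace(buf[i])) {
                            pending.append(buf[i]);
                        }
                    }
                    // Decode only whole 4-character groups; keep the rest for later.
                    int usable = pending.length() - pending.length() % 4;
                    if (usable > 0) {
                        out.write(decoder.decode(pending.substring(0, usable)));
                        pending.delete(0, usable);
                    }
                } else if (inImage && event == XMLStreamConstants.END_ELEMENT) {
                    // Flush whatever is left (an unpadded final group, if any).
                    if (pending.length() > 0) {
                        out.write(decoder.decode(pending.toString()));
                    }
                    return;
                }
            }
        } finally {
            reader.close();
        }
    }

    public static void main(String[] args) throws Exception {
        // Tiny in-memory document; a real caller would pass a FileReader and FileOutputStream.
        String xml = "<form><name>demo</name><image>aGVsbG8g\nd29ybGQ=</image></form>";
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        extractImage(new StringReader(xml), "image", out);
        System.out.println(new String(out.toByteArray(), StandardCharsets.UTF_8));
    }
}
```

With this approach the full Base64 string is never held on the heap, only the current chunk plus at most three leftover characters. For a document of 1.5 to 2.5 MB the saving is modest, so it is worth measuring before replacing code that already works.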

If you want to parse the rest of the document as well, consider JAXB with custom bindings.

Puce