2

I am currently using a ByteArrayOutputStream to convert a BufferedImage to a byte[], then the open source Base64Coder class to convert the byte[] to a char[], which I append to a String. This is one part of a multi-step process for encoding sets of video frames and putting them into an XML-friendly format. Don't ask why I am doing this; that is just what needs to be done.

I am seeing that the Base64 encoding takes ~75% of the CPU time of the entire process, and since I just grabbed this class off a Google search, I'm certain there is something more efficient out there for encoding the images. What are my options, guys?

Marty
  • More efficient than what? Give the algorithm or we can't tell you what you can improve. – Thomas Jungblut Nov 30 '11 at 18:40
  • Look here: http://stackoverflow.com/questions/469695/decode-base64-data-in-java Somewhat similar... – Matjaz Muhic Nov 30 '11 at 18:43
  • I already told you the class: Base64Coder. It is the first result in a Google search... – Marty Nov 30 '11 at 18:48
  • I am finding that all of the classes I've tested are on par with each other in terms of speed. Running on a slow VM, the Apache Commons Base64 class, the Base64Coder class, and the MiGBase64 class each converted 300 captured JPEG frames of ~15 KB to Base64 Strings and output them to XML in 69 seconds. Guess that's as good as it gets. Thanks all! – Marty Nov 30 '11 at 19:50

3 Answers

3

This is quite an old question, but it still turned up in Google as one of the top hits…

This has been answered comprehensively here: http://java-performance.info/base64-encoding-and-decoding-performance/

Taking the summary from there:

Let's summarize the codec properties in one table. This table is sorted by the relative performance of all these codecs (faster on top).

Name        Max encoding len    Max decoding len    How much we can encode with -Xmx8G    Supports byte[] -> byte[]
Java 8      1.62 G              2 G                 1.16 G                                Yes
javax.xml   1.62 G              2 G                 1.07 G                                No
MiGBase64   1.62 G              0.36 G              1.07 G                                Yes
IHarder     1.62 G              0.72 G              1.23 G                                Yes
Apache      0.81 G              0.72 G              0.8 G                                 Yes
Guava       1.62 G              2 G                 1.07 G                                No
Sun.misc    0.79 G              1.05 G              0.78 G                                No

If you are looking for a fast and reliable Base64 codec, do not look outside the JDK. There is a new codec in Java 8, java.util.Base64, and there is also one hidden from many eyes (since Java 6): javax.xml.bind.DatatypeConverter. Both are fast, reliable and do not suffer from integer overflows.

Two of the four third-party codecs described here are very fast: MiGBase64 and IHarder. Unfortunately, if you need to process hundreds of megabytes at a time, only Google Guava will allow you to decode 2 GB of data at once (360 MB in the case of MiGBase64, 720 MB in the case of IHarder and Apache Commons). Unfortunately, Guava does not support byte[] -> byte[] encoding.

Do not try to call String.getBytes(Charset) on huge strings if your charset is a multibyte one: you may get the whole gamut of integer-overflow-related exceptions.
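As a rough illustration of the Java 8 codec recommended in that summary, here is a minimal sketch. It assumes each frame has already been serialized to a byte[] (for example via the ByteArrayOutputStream from the question); the class and method names are made up for the example.

```java
import java.util.Base64;

// Hypothetical helper, not part of any library: wraps java.util.Base64 (Java 8+).
final class Jdk8Base64Sketch {
    private Jdk8Base64Sketch() {}

    // Encodes the raw frame bytes to a Base64 String using the basic (RFC 4648) alphabet.
    static String encodeFrame(byte[] frameBytes) {
        return Base64.getEncoder().encodeToString(frameBytes);
    }

    // Decodes a Base64 String produced by encodeFrame back to the raw bytes.
    static byte[] decodeFrame(String encoded) {
        return Base64.getDecoder().decode(encoded);
    }
}
```

Base64.getEncoder() returns a shared, thread-safe encoder, so there is no need to cache it yourself. Note that encodeToString still allocates a new array and String per call.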

Paul Wagland
0

The problem with @PaulWagland's solution is that almost all of the encoders allocate the encoded byte array (or a variant of it) for you. That is, they are not garbage free.

I don't really recommend the following unless you know what you are doing.

Ideally, what you want to do is have a massive byte[] sized to the maximum you expect and then reuse it, either via a ThreadLocal or some sort of pooling.

Unfortunately Base64.java has the method you want hidden:

private int decode0(byte[] src, int sp, int sl, byte[] dst) {
...
}

(I'm not going to paste the code from the JDK but I'm sure you can easily find it).

Thus if you really want to go fast you would use that method on cached byte[] arrays.

Ideally though you would want to rewrite it to use ByteBuffers.
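For what it's worth, java.util.Base64.Encoder also has a public encode(byte[] src, byte[] dst) overload (and Decoder a matching decode(byte[] src, byte[] dst)) that writes into a caller-supplied array and returns the number of bytes written. It takes no offset or length for the source, so it only removes the output allocation, not the input one. Below is a minimal sketch of the cached-array idea using that overload; the class name, the MAX_FRAME_BYTES limit and the appendEncoded helper are assumptions made for the example.

```java
import java.util.Base64;

// Hypothetical helper: reuses one output buffer per thread instead of allocating per frame.
final class ReusableBase64Buffer {
    // Assumed upper bound on a single frame; pick a value that fits your data.
    private static final int MAX_FRAME_BYTES = 64 * 1024;
    // Base64 emits 4 output bytes for every 3 input bytes, rounded up (with padding).
    private static final int MAX_ENCODED_BYTES = 4 * ((MAX_FRAME_BYTES + 2) / 3);

    private static final ThreadLocal<byte[]> OUTPUT =
            ThreadLocal.withInitial(() -> new byte[MAX_ENCODED_BYTES]);
    private static final Base64.Encoder ENCODER = Base64.getEncoder();

    private ReusableBase64Buffer() {}

    // Encodes frameBytes into the cached buffer and appends the result to xml.
    static void appendEncoded(byte[] frameBytes, StringBuilder xml) {
        if (frameBytes.length > MAX_FRAME_BYTES) {
            throw new IllegalArgumentException("frame larger than the cached buffer allows");
        }
        byte[] out = OUTPUT.get();
        int written = ENCODER.encode(frameBytes, out);
        // Base64 output is pure ASCII, so each byte maps directly to one char.
        for (int i = 0; i < written; i++) {
            xml.append((char) out[i]);
        }
    }
}
```

Only the first `written` bytes of the buffer are valid after each call; the rest may hold leftovers from an earlier, longer frame.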

Alternatively, as a stopgap you could use Base64.Decoder#wrap, but that method creates a wrapping InputStream, which is probably better than allocating new arrays but is still not garbage free. You will also need to wrap your ByteBuffer or byte[] in its own InputStream.

IMO it's a flaw that the Base64 encoder/decoder doesn't have what CharsetEncoder has, which is:
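A minimal sketch of the wrap approach, shown for both directions: Base64.Encoder#wrap(OutputStream) for encoding (the question's use case) and Base64.Decoder#wrap(InputStream) for decoding. The class and method names are hypothetical, and the 8 KB chunk size is an arbitrary choice.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.Base64;

// Hypothetical helper showing the stream-wrapping API of java.util.Base64.
final class StreamingBase64Sketch {
    private StreamingBase64Sketch() {}

    // Copies raw bytes into sink as Base64 text.
    // Closing the wrapper writes the final padding and also closes sink.
    static void encodeTo(InputStream raw, OutputStream sink) {
        try (OutputStream base64Out = Base64.getEncoder().wrap(sink)) {
            copy(raw, base64Out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads Base64 text from the stream and returns the decoded bytes.
    static byte[] decodeFrom(InputStream base64Text) {
        ByteArrayOutputStream decoded = new ByteArrayOutputStream();
        try (InputStream base64In = Base64.getDecoder().wrap(base64Text)) {
            copy(base64In, decoded);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return decoded.toByteArray();
    }

    // Plain chunked copy; the chunk array is the only per-call buffer allocated here.
    private static void copy(InputStream in, OutputStream out) throws IOException {
        byte[] chunk = new byte[8192];
        int read;
        while ((read = in.read(chunk)) != -1) {
            out.write(chunk, 0, read);
        }
    }
}
```

The wrapper objects themselves are still allocated per call, which is the "not garbage free" caveat mentioned above.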

CharsetEncoder.encode(CharBuffer in, ByteBuffer out, boolean endOfInput)

Adam Gent
0

Try the commons-codec library at http://commons.apache.org/codec/ and definitely let us know the results. It is a standard, widely used library.

The class you are looking for is org.apache.commons.codec.binary.Base64 http://commons.apache.org/codec/apidocs/org/apache/commons/codec/binary/Base64.html

Shaun
Hurda
  • Same processing time as what I'm already using, though the Commons library definitely has more functionality. Guess that's as good as it gets then, oh well. – Marty Nov 30 '11 at 19:45