13

While working with sockets and serializing objects over them, I noticed that there are third-party libraries for faster object serialization in Java, such as Kryo and FST. Until now, I had assumed that Java's built-in serialization was optimized and the fastest option, because it is part of the language and works at a low level. However, these libraries claim to be faster than Java's built-in serialization.

Can someone explain why Java does not provide the fastest serialization solution? What does it gain in exchange for giving up better performance?

Thanks in advance.

ovunccetin
  • For one thing, neither of those supports versioning out of the box, so you're not comparing like with like. (Kryo supports it with extra code.) – user207421 Oct 18 '13 at 11:05
  • Related: http://stackoverflow.com/questions/239280/which-is-the-best-alternative-for-java-serialization – Christophe Roussy Oct 18 '13 at 13:46
  • The JDK does not support versioning; it just manages to throw an exception, which is not very helpful. – R.Moeller Oct 18 '13 at 18:45
  • @R.Moeller Rubbish. There is a whole chapter on it in the Object Versioning Specification. – user207421 Oct 18 '13 at 20:47
  • OK, that was a somewhat exaggerated comment on my part. However, if you look at the list of "incompatible changes", it's pretty long ;-) – R.Moeller Oct 19 '13 at 11:10
  • @R.Moeller If you look at the list of compatible changes, it is longer than what either of the other two systems mentioned do out of the box, i.e. zero, which is the only point at issue. – user207421 Oct 19 '13 at 11:30
  • @EJP Well, I'd like to point out that the cost/reward ratio of the versioning implementation is pretty bad. In real-world practice, one needs to handle versioning manually 99% of the time. It's not too hard to imagine versioning support schemes that do not affect performance at all. – R.Moeller Oct 19 '13 at 11:43
  • FST 2.x supports versioning (adding fields) – R.Moeller Apr 24 '15 at 01:10

2 Answers

20

There are several reasons (I am the author of http://code.google.com/p/fast-serialization/).

Reasons:

  • It walks up the class hierarchy for each object, making several calls to readObject/writeObject per object where classes define them.
  • Partially poor coding (improved in 1.7).
  • Some frequently used classes rely on old, slow, and outdated serialization features such as PutField/GetField.
  • Too much temporary object allocation.
  • A lot of validation (versioning, implemented interfaces).
  • Slow Java input/output streams.
  • Reflection to set and get field values.
  • Use of JDK collections, which require boxed "big numbers" such as Integer or Long instead of primitives.
  • The implementation lacks certain algorithmic optimizations :-)
  • On x86, primitives are reordered into network byte order in Java code rather than natively.
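To make the first point concrete, here is a minimal sketch, assuming a two-level class hierarchy. The names HierarchyWalkDemo, Base, Derived and hookCalls are made up for this illustration. It only counts how often the JDK invokes the private writeObject hooks for a single object; it is not a benchmark.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

public class HierarchyWalkDemo {
    // Hypothetical counter for this demo: incremented by every writeObject hook.
    static int hookCalls = 0;

    static class Base implements Serializable {
        private static final long serialVersionUID = 1L;
        int baseField = 1;

        // Hook looked up and invoked by ObjectOutputStream for the Base level.
        private void writeObject(ObjectOutputStream out) throws IOException {
            hookCalls++;
            out.defaultWriteObject();
        }
    }

    static class Derived extends Base {
        private static final long serialVersionUID = 1L;
        int derivedField = 2;

        // Separate hook, invoked again for the Derived level of the same object.
        private void writeObject(ObjectOutputStream out) throws IOException {
            hookCalls++;
            out.defaultWriteObject();
        }
    }

    // Serializes one Derived instance and reports how many hooks ran.
    static int countHookCalls() {
        hookCalls = 0;
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(new Derived());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return hookCalls;
    }

    public static void main(String[] args) {
        System.out.println("writeObject hooks invoked: " + countHookCalls());
    }
}
```

One object results in one hook call per serializable class in its hierarchy, so deep hierarchies multiply the per-object work.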

To offer better performance, the JDK would have to drop support for old versioning schemes (for example, the way readObject/writeObject currently works is suboptimal) and either make features such as versioning support optional or implement them in a more performance-sensitive way, which would be possible. Additionally, HotSpot could add intrinsics to improve the low-level handling of primitives. An API has to be designed with performance in mind, which was probably not the case with JDK serialization.
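As a rough illustration of the byte-order point, the sketch below writes the same int three ways. DataOutputStream is used here as a stand-in, on the assumption that it shows the same high-byte-first layout that ObjectOutputStream uses for primitives (both implement DataOutput). The class and method names are made up for this example.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

public class ByteOrderDemo {
    // Writes an int the way DataOutput specifies it: high byte first.
    static byte[] writeWithDataOutput(int value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Writes an int into a buffer using an explicitly chosen byte order.
    static byte[] writeWithOrder(int value, ByteOrder order) {
        return ByteBuffer.allocate(Integer.BYTES).order(order).putInt(value).array();
    }

    public static void main(String[] args) {
        int value = 0x01020304;
        System.out.println("DataOutput:    " + Arrays.toString(writeWithDataOutput(value)));
        System.out.println("big-endian:    " + Arrays.toString(writeWithOrder(value, ByteOrder.BIG_ENDIAN)));
        System.out.println("little-endian: " + Arrays.toString(writeWithOrder(value, ByteOrder.LITTLE_ENDIAN)));
        // On x86 this reports LITTLE_ENDIAN, so every primitive has to be swapped.
        System.out.println("native order:  " + ByteOrder.nativeOrder());
    }
}
```

The stream layout matches the big-endian buffer, which differs from the native order on x86. That mismatch is the swap the answer refers to.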

R.Moeller
3

Java serialization is slow because it uses reflection. It also does a lot of backward-compatibility checking and strict type checking. In return, Java serialization gives you back a fully equivalent object after deserialization in most cases.
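A small sketch of that trade-off, assuming a made-up Point class with two int fields: the JDK stream carries the class description it needs for its type and version checks, while a hand-written DataOutputStream encoding carries only the 8 bytes of payload. All names here are hypothetical, and the sketch compares sizes only, not speed.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

public class OverheadDemo {
    // Hypothetical value class used only for this comparison.
    static class Point implements Serializable {
        private static final long serialVersionUID = 1L;
        final int x;
        final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }
    }

    // Standard JDK serialization: stream header, class descriptor, then field data.
    static byte[] jdkSerialize(Point p) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(p);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Reads the object back; the JDK checks the class and serialVersionUID on the way.
    static Point jdkDeserialize(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (Point) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    // Hand-written encoding: just the two ints, no type or version information.
    static byte[] rawSerialize(Point p) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(p.x);
            out.writeInt(p.y);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        Point p = new Point(3, 4);
        byte[] jdk = jdkSerialize(p);
        Point copy = jdkDeserialize(jdk);
        System.out.println("JDK bytes: " + jdk.length + ", raw bytes: " + rawSerialize(p).length);
        System.out.println("round trip preserved fields: " + (copy.x == p.x && copy.y == p.y));
    }
}
```

The extra bytes in the JDK stream are what allow the receiver to validate the type and version without any out-of-band agreement. The raw encoding only works if both sides already agree on the layout.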