
I have a simple Java class:

import java.io.Serializable;

public class SimpleClass implements Serializable {    
    private static final long serialVersionUID = -9062339996046009959L;
    public byte Byte;
    public int id;

    public SimpleClass(byte Byte, int id) {
        this.Byte = Byte;
        this.id = id;
    }
}

Since an int is 4 bytes, and a byte is, well, a byte, shouldn't this class require 5 bytes?

But when I serialize it and check the length of the resulting byte array, I get 59.

Why does this occur?

Also I am using this to convert my object into a byte array.
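The linked conversion code is not reproduced here, so the following is only a minimal sketch of the usual approach, assuming an `ObjectOutputStream` wrapped around a `ByteArrayOutputStream`. The helper name `toBytes` and the demo class are hypothetical, and `SimpleClass` is made package-private so that everything fits in one file.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Same fields as the class above; package-private so the sketch fits in one file.
class SimpleClass implements Serializable {
    private static final long serialVersionUID = -9062339996046009959L;
    public byte Byte;
    public int id;

    SimpleClass(byte Byte, int id) {
        this.Byte = Byte;
        this.id = id;
    }
}

class SerializedSizeDemo {
    // Hypothetical helper: serializes any Serializable object into a byte array.
    static byte[] toBytes(Object obj) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(obj);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) {
        byte[] bytes = toBytes(new SimpleClass((byte) 1, 42));
        // The exact length depends on the fully qualified class name,
        // so it changes with the package the class lives in.
        System.out.println(bytes.length);
    }
}
```

The printed length will be well above 5 and will differ from 59 unless the class sits in a package whose name has the same length as mine.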

GhostCat
  • Which byte array? – azro Jul 25 '17 at 09:28
  • Because serializing encodes more than just the data present in the class. And that's a key point here: this is a class, not a C structure. For more information, you can look at the [protocol specification](https://docs.oracle.com/javase/8/docs/platform/serialization/spec/protocol.html). There's other information available here: https://docs.oracle.com/javase/8/docs/technotes/guides/serialization/index.html – John Szakmeister Jul 25 '17 at 09:31

3 Answers

Simple: Java serializes much more than just the field values. See the corresponding grammar specification.

Notice that serialVersionUID, for example? It has to go into the binary data as well. So does the complete class name (fully qualified, including the package).

Keep in mind: the idea is that you can serialize arbitrary (serializable) objects into a stream of bytes. When you deserialize, you don't specify the type of the objects in that byte stream!

In other words: these bytes must contain all the information required to resurrect the contained object instance(s). Thus you need the full class name, plus the serialVersionUID (declared or computed) for consistency checks.
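To make that concrete, here is a small sketch, assuming the usual `ObjectOutputStream` / `ObjectInputStream` round trip. The reader only calls `readObject()` and never names the class, yet it gets a `SimpleClass` back, and the class name can be found as plain text in the bytes. The helper names (`toBytes`, `fromBytes`, `containsClassName`) are made up for illustration, and `SimpleClass` is package-private only so the sketch fits in one file.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class SimpleClass implements Serializable {
    private static final long serialVersionUID = -9062339996046009959L;
    public byte Byte;
    public int id;

    SimpleClass(byte Byte, int id) {
        this.Byte = Byte;
        this.id = id;
    }
}

class ResurrectDemo {
    static byte[] toBytes(Object obj) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(obj);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Note that no type is passed in: the stream itself says which class to create.
    static Object fromBytes(byte[] bytes) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    // Looks for the class name as text inside the raw bytes (fine for ASCII names).
    static boolean containsClassName(byte[] bytes, Class<?> type) {
        String raw = new String(bytes, StandardCharsets.ISO_8859_1);
        return raw.contains(type.getName());
    }

    public static void main(String[] args) {
        byte[] bytes = toBytes(new SimpleClass((byte) 1, 42));
        Object restored = fromBytes(bytes);
        System.out.println(restored.getClass().getName());
        System.out.println(containsClassName(bytes, SimpleClass.class));
    }
}
```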

GhostCat

Because serialization supports schema evolution. That is, an object can be deserialized even if the definition of its class has since been changed. This is necessary to allow long-term storage of serialized objects.

To enable this, the serialization stream encodes metadata about the structure of serialized objects. This info is written only once per stream, i.e. if the stream contains many objects of the same class, the class metadata is written only once.
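A quick way to see the "once per stream" behaviour is to write one, two and three objects into the same stream and compare the sizes. This is only a sketch; `streamSize` is a hypothetical helper, and `SimpleClass` is package-private here so it fits in one file.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

class SimpleClass implements Serializable {
    private static final long serialVersionUID = -9062339996046009959L;
    public byte Byte;
    public int id;

    SimpleClass(byte Byte, int id) {
        this.Byte = Byte;
        this.id = id;
    }
}

class MetadataOnceDemo {
    // Hypothetical helper: writes `count` distinct objects into ONE stream
    // and returns the total number of bytes produced.
    static int streamSize(int count) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            for (int i = 0; i < count; i++) {
                out.writeObject(new SimpleClass((byte) i, i));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.size();
    }

    public static void main(String[] args) {
        int one = streamSize(1);
        int two = streamSize(2);
        int three = streamSize(3);
        System.out.println("first object:  " + one + " bytes");
        System.out.println("second object: " + (two - one) + " more bytes");
        System.out.println("third object:  " + (three - two) + " more bytes");
    }
}
```

The second and third objects should each add far fewer bytes than the first, because the class descriptor is replaced by a short back reference.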

For serialization to support polymorphism (you may send an object of any subtype the receiver expects), this metadata also includes the fully qualified name of the class.

To alert you to incompatible changes in the structure of your data, the metadata also contains a checksum for the class definition (the serialVersionUID).

Note that schema evolution is a pretty standard feature of any serialization protocol. For instance, JSON and XML write field names for every object being marshalled:

{"Byte":0,"id":42}

(18 bytes)

<SimpleClass><Byte>0</Byte><id>42</id></SimpleClass>

(52 bytes)
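For contrast, if you give up self-description entirely you do get the 5 bytes the question expects. The sketch below uses `DataOutputStream` to write just the two field values; the `encode` helper is hypothetical. The price is that reader and writer must agree on the layout out of band, and nothing in the bytes tells you which class they belong to.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class RawEncodingDemo {
    // Hypothetical helper: writes the field values only, with no metadata at all.
    static byte[] encode(byte b, int id) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeByte(b); // 1 byte
            out.writeInt(id); // 4 bytes, big-endian
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) {
        System.out.println(encode((byte) 0, 42).length); // prints 5
    }
}
```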

meriton

Because the data that is serialized for a class consists of more than just the field values. Specifically:

  1. A type word.
  2. Either a class descriptor or a backwards reference to one.
  3. Serialization information for all the Serializable base classes.
  4. The name of every serialized field, together with its type code.
  5. Its value (an object-typed value starts with another type word of its own).

At least.
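You can check most of the items above by decoding the start of the stream by hand. The following is a rough sketch, not a full parser: it assumes a single object whose class has only primitive fields (like `SimpleClass`), and the helper names `toBytes` and `describe` are invented for this example. The constants come from `java.io.ObjectStreamConstants`.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamConstants;
import java.io.Serializable;
import java.io.UncheckedIOException;

class SimpleClass implements Serializable {
    private static final long serialVersionUID = -9062339996046009959L;
    public byte Byte;
    public int id;

    SimpleClass(byte Byte, int id) {
        this.Byte = Byte;
        this.id = id;
    }
}

class StreamWalkDemo {
    static byte[] toBytes(Object obj) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(obj);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Decodes the stream header and class descriptor of the first object.
    // Only handles classes whose fields are all primitives.
    static String describe(byte[] bytes) {
        StringBuilder text = new StringBuilder();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            if (in.readShort() != ObjectStreamConstants.STREAM_MAGIC
                    || in.readShort() != ObjectStreamConstants.STREAM_VERSION) {
                throw new IllegalArgumentException("not a serialization stream");
            }
            // Type word for "new object", then type word for "new class descriptor".
            if (in.readByte() != ObjectStreamConstants.TC_OBJECT
                    || in.readByte() != ObjectStreamConstants.TC_CLASSDESC) {
                throw new IllegalArgumentException("expected an object with a class descriptor");
            }
            text.append("class ").append(in.readUTF()).append('\n');
            text.append("serialVersionUID ").append(in.readLong()).append('\n');
            in.readByte(); // class descriptor flags, ignored in this sketch
            int fieldCount = in.readUnsignedShort();
            for (int i = 0; i < fieldCount; i++) {
                char typeCode = (char) in.readUnsignedByte(); // 'B' = byte, 'I' = int
                text.append("field ").append(typeCode).append(' ')
                    .append(in.readUTF()).append('\n');
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return text.toString();
    }

    public static void main(String[] args) {
        System.out.print(describe(toBytes(new SimpleClass((byte) 1, 42))));
    }
}
```

Everything this prints is metadata; the 5 bytes of actual field data only come after the class descriptor ends.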

user207421