With encapsulation, we intend that nothing is revealed about the internal representation of an object, and we interact with our components only through their public interfaces. This is a desirable attribute that we usually exploit later, when we want to change the internal representation of data in a component without breaking any code written by its users.
Conversely, serialization implies exposing the internal state of an object by transforming the object’s state into some other format that can be stored and resurrected later. This means that, once serialized, the internal structure of an object cannot be changed without risking the success of this resurrection process.
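To make this concrete, here is a minimal sketch using default Java serialization. The `Account` class and the helper names are hypothetical and exist only for this illustration. It suggests that the names and values of private fields end up in the serialized bytes, even though the class never exposes them through its public interface in that form.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.charset.StandardCharsets;

// Hypothetical class, used only for this illustration.
class Account implements Serializable {
    private static final long serialVersionUID = 1L;

    private final String owner;     // private, yet written to the stream
    private final int balanceCents; // private, yet written to the stream

    Account(String owner, int balanceCents) {
        this.owner = owner;
        this.balanceCents = balanceCents;
    }

    String getOwner() { return owner; }
    int getBalanceCents() { return balanceCents; }
}

class ExposedStateDemo {
    // Serializes an object to a byte array with default Java serialization.
    static byte[] serialize(Object value) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return buffer.toByteArray();
    }

    // Resurrects an object from bytes produced by serialize().
    static Object deserialize(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    // True if the raw stream contains the given ASCII text.
    static boolean streamContains(byte[] data, String text) {
        return new String(data, StandardCharsets.ISO_8859_1).contains(text);
    }

    public static void main(String[] args) {
        byte[] data = serialize(new Account("alice", 1250));
        System.out.println("field name in stream: " + streamContains(data, "balanceCents"));
        System.out.println("field value in stream: " + streamContains(data, "alice"));
        Account copy = (Account) deserialize(data);
        System.out.println("resurrected: " + copy.getOwner() + " " + copy.getBalanceCents());
    }
}
```

Anyone who stores these bytes now depends on the private field names and types, which is exactly the coupling that encapsulation was supposed to prevent.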
The problems with serialization can appear not only in open systems but also in distributed systems that rely on it. For example, if we stop our application server, it may choose to serialize the objects in the current session and resurrect them later, when the server is restarted. But if we redeploy our application using new versions of our serializable objects, will they still be compatible when the server attempts to resurrect them? In a distributed system it is also common to use code mobility: sets of classes are kept in a central repository so that clients and servers can share common code. In this approach, since objects are serialized to be shared between clients and servers, do we run the risk of breaking anything if we update the serializable classes in this common repository?
Consider for example that we had a class Person as follows:
    public class Person {
        private String firstName;
        private String lastName;
        private boolean isMale;
        private int age;

        public boolean isMale() {
            return this.isMale;
        }

        public int getAge() {
            return this.age;
        }

        // more getters and setters
    }
Let’s say that we released the first version of our API with this abstraction of a Person. For the second version, though, we would like to introduce two changes. First, we discovered that it would be better to store the date of birth of a person instead of the age as an integer. Second, our class Person may have been defined when Java did not have enumerations, but now we would like to use them to represent the gender of a person.
Evidently, since the fields are properly encapsulated, we could change the inner workings of the class without affecting the public interface. Something like this:
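    import java.util.Calendar;
    import java.util.Date;

    public class Person {
        private String firstName;
        private String lastName;
        private Gender gender;
        private Date dateOfBirth;

        public boolean isMale() {
            return this.gender == Gender.MALE;
        }

        public int getAge() {
            Calendar today = Calendar.getInstance();
            Calendar birth = Calendar.getInstance();
            birth.setTime(this.dateOfBirth);
            int age = today.get(Calendar.YEAR) - birth.get(Calendar.YEAR);
            // Subtract one year if this year's birthday has not happened yet.
            int monthDiff = today.get(Calendar.MONTH) - birth.get(Calendar.MONTH);
            if (monthDiff < 0 || (monthDiff == 0
                    && today.get(Calendar.DAY_OF_MONTH) < birth.get(Calendar.DAY_OF_MONTH))) {
                age--;
            }
            return age;
        }

        // the rest of getters and setters
    }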
By making these changes as shown above, we can make sure preexisting clients will not break, because even though we changed the internal representation of the state of the object, we kept the public interface unchanged.
However, consider what would happen if the class Person were serializable by default. If our system is an open system, there could be thousands of lines of code out there relying on being able to resurrect serialized objects based on the original class, or maybe even clients who serialized extended classes that use the original version of the class as their parent. Some of these objects may have been serialized to binary form, or some other format, by the users of our API, who would now like to move to the second version of our code.
If we then made changes like those in our second example, we would immediately break some of those clients. Anyone who stored serialized objects based on the original version of the class holds data containing a field called age of type int, with the age of a person, and a field named isMale of type boolean, with information about the gender. Those objects are likely to fail during deserialization, because the new class definition uses new fields and new data types.
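The failure can be simulated in a single program. This is only a sketch under some assumptions: the two versions of the class are modeled as two differently named classes, `PersonV1` and `PersonV2`, the `Gender` enum is given an assumed shape, and a custom `resolveClass` stands in for the redeployment by making the old bytes resolve to the new class. In this simulation the rejection comes from the differing default `serialVersionUID` values. In a real redeployment the class name stays the same, but changing the non-transient fields also changes the default `serialVersionUID`, so a similar rejection is to be expected.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InvalidClassException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamClass;
import java.io.Serializable;
import java.util.Date;

enum Gender { MALE, FEMALE } // assumed shape of the enum

// Stand-in for the first released Person (no explicit serialVersionUID).
class PersonV1 implements Serializable {
    private String firstName = "Ada";
    private String lastName = "Lovelace";
    private boolean isMale = false;
    private int age = 36;
}

// Stand-in for the evolved Person with the new fields.
class PersonV2 implements Serializable {
    private String firstName;
    private String lastName;
    private Gender gender;
    private Date dateOfBirth;
}

class RedeployDemo {
    // Reads a stream while pretending that PersonV1 now resolves to PersonV2.
    static final class RedeployedInputStream extends ObjectInputStream {
        RedeployedInputStream(InputStream in) throws IOException {
            super(in);
        }

        @Override
        protected Class<?> resolveClass(ObjectStreamClass desc)
                throws IOException, ClassNotFoundException {
            if (desc.getName().equals(PersonV1.class.getName())) {
                return PersonV2.class;
            }
            return super.resolveClass(desc);
        }
    }

    // Produces bytes as a client of the first version would have stored them.
    static byte[] serializeOldPerson() {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(new PersonV1());
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return buffer.toByteArray();
    }

    // Returns true if reading the old bytes with the new class is rejected.
    static boolean oldBytesRejected(byte[] data) {
        try (ObjectInputStream in = new RedeployedInputStream(new ByteArrayInputStream(data))) {
            in.readObject();
            return false;
        } catch (InvalidClassException e) {
            return true;
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("old bytes rejected: " + oldBytesRejected(serializeOldPerson()));
    }
}
```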
Clearly, our problem here is that serialization has exposed sensitive information about our objects, and now we cannot simply change anything, not even what we thought was encapsulated, because through serialization everything has been exposed publicly.
Now, consider a scenario in which every single class in the JDK API was serializable by default. The designers of Java simply could not evolve the Java APIs without risking breaking many applications. They would be forced to assume that somebody out there may have a serialized version of any of the classes in the JDK.
There are ways to deal with the evolution of serializable classes, but the important point here is that, when it comes to encapsulation, we would like to keep our serializable classes as contained as possible. For those classes that we do need to serialize, we have to think through the implications of any scenario in which we may attempt to resurrect an object using an evolved version of its class.
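One such way, sketched here under simplifying assumptions, is to pin the serialized form with an explicit `serialVersionUID` and a `serialPersistentFields` declaration, and then translate between that form and the new internal representation in `writeObject` and `readObject`. The `EvolvedPerson` class and the shape of the `Gender` enum are hypothetical, and only the gender change is shown. Mapping an `age` back to a date of birth would be lossy, so it is left out.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamField;
import java.io.Serializable;
import java.nio.charset.StandardCharsets;

enum Gender { MALE, FEMALE } // assumed shape of the enum

// Hypothetical evolved class that keeps the first version's serialized form.
class EvolvedPerson implements Serializable {
    private static final long serialVersionUID = 1L;

    // The serialized form still consists of firstName and isMale.
    private static final ObjectStreamField[] serialPersistentFields = {
        new ObjectStreamField("firstName", String.class),
        new ObjectStreamField("isMale", boolean.class)
    };

    private String firstName;
    private Gender gender; // new internal representation

    EvolvedPerson(String firstName, Gender gender) {
        this.firstName = firstName;
        this.gender = gender;
    }

    String getFirstName() { return firstName; }
    boolean isMale() { return gender == Gender.MALE; }

    // Writes the old-style fields instead of the current internal ones.
    private void writeObject(ObjectOutputStream out) throws IOException {
        ObjectOutputStream.PutField fields = out.putFields();
        fields.put("firstName", firstName);
        fields.put("isMale", gender == Gender.MALE);
        out.writeFields();
    }

    // Rebuilds the current internal state from the old-style fields.
    private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
        ObjectInputStream.GetField fields = in.readFields();
        firstName = (String) fields.get("firstName", null);
        gender = fields.get("isMale", false) ? Gender.MALE : Gender.FEMALE;
    }
}

class EvolutionDemo {
    static byte[] serialize(Object value) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return buffer.toByteArray();
    }

    static Object deserialize(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    // True if the raw stream contains the given ASCII text.
    static boolean streamContains(byte[] data, String text) {
        return new String(data, StandardCharsets.ISO_8859_1).contains(text);
    }

    public static void main(String[] args) {
        byte[] data = serialize(new EvolvedPerson("Alan", Gender.MALE));
        System.out.println("stream has isMale: " + streamContains(data, "isMale"));
        System.out.println("stream has gender: " + streamContains(data, "gender"));
        System.out.println("still male: " + ((EvolvedPerson) deserialize(data)).isMale());
    }
}
```

The price of this approach is that the old field layout becomes a second public contract that has to be maintained by hand, which is the cost the paragraph above warns about.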
Besides all of this, serialization has security implications as well, since important, sensitive information about our objects could be easily exposed.
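As a small illustration of the exposure risk, the sketch below uses a hypothetical `Credentials` class. It shows one standard mitigation, the `transient` keyword, which keeps a field out of the default serialized form. The class and helper names are assumptions made for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.charset.StandardCharsets;

// Hypothetical class, used only for this illustration.
class Credentials implements Serializable {
    private static final long serialVersionUID = 1L;

    private final String username;
    private transient String password; // skipped by default serialization

    Credentials(String username, String password) {
        this.username = username;
        this.password = password;
    }

    String getUsername() { return username; }
    String getPassword() { return password; }
}

class TransientDemo {
    static byte[] serialize(Object value) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return buffer.toByteArray();
    }

    static Object deserialize(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    // True if the raw stream contains the given ASCII text.
    static boolean streamContains(byte[] data, String text) {
        return new String(data, StandardCharsets.ISO_8859_1).contains(text);
    }

    public static void main(String[] args) {
        byte[] data = serialize(new Credentials("alice", "s3cret"));
        System.out.println("username in stream: " + streamContains(data, "alice"));
        System.out.println("password in stream: " + streamContains(data, "s3cret"));
        Credentials copy = (Credentials) deserialize(data);
        System.out.println("password after resurrection: " + copy.getPassword());
    }
}
```

Without the `transient` modifier, the password would be written to the stream in readable form, just like the username.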
Therefore, having the serializable classes explicitly marked makes it easier for the designers of APIs to deal with them.
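The marking itself is the `java.io.Serializable` interface. The following sketch, with hypothetical class names, shows that `ObjectOutputStream` refuses to write an object whose class has not opted in, so only classes that were deliberately marked become part of this extra contract.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical class that has not opted in to serialization.
class PlainPoint {
    int x = 1;
    int y = 2;
}

// Hypothetical class that has explicitly opted in.
class MarkedPoint implements Serializable {
    private static final long serialVersionUID = 1L;
    int x = 1;
    int y = 2;
}

class MarkerDemo {
    // Returns true if default serialization accepts the object.
    static boolean canSerialize(Object value) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(value);
            return true;
        } catch (NotSerializableException e) {
            return false; // the class was never marked as Serializable
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(canSerialize(new PlainPoint()));  // prints false
        System.out.println(canSerialize(new MarkedPoint())); // prints true
    }
}
```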