
Although it is possible to serialize a lambda in Java 8, it is strongly discouraged; even serializing inner classes is discouraged. The reason given is that lambdas may not deserialize properly on another JRE. However, doesn't this mean that there is a way to safely serialize a lambda?

For example, say I define a class to be something like this:

import java.util.function.Predicate;

public class MyClass {
    private String value;
    private Predicate<String> validateValue;

    public MyClass(String value, Predicate<String> validate) {
        this.value = value;
        this.validateValue = validate;
    }

    public void setValue(String value) {
        // Predicate is a functional interface; call its test method
        if (!validateValue.test(value)) throw new IllegalArgumentException();
        this.value = value;
    }

    public void setValidation(Predicate<String> validate) {
        this.validateValue = validate;
    }
}

If I declared an instance of the class like this, I should not serialize it:

MyClass obj = new MyClass("some value", (s) -> !s.isEmpty());

But what if I made an instance of the class like this:

// Could even be a static nested class
public class IsNonEmpty implements Predicate<String>, Serializable {
    @Override
    public boolean test(String s) {
        return !s.isEmpty();
    }
}
MyClass isThisSafeToSerialize = new MyClass("some string", new IsNonEmpty());

Would this now be safe to serialize? My instinct says that yes, it should be safe, since there's no reason that interfaces in java.util.function should be treated any differently from any other random interface. But I'm still wary.

Justin
  • Interfaces are completely irrelevant for Serialization, so yes, implementing `Predicate` has the same impact as implementing any other interface, *none*. But your assumption that lambdas may not deserialize properly on another JRE is wrong. They have a [well-defined persistent representation](https://docs.oracle.com/javase/8/docs/api/?java/lang/invoke/SerializedLambda.html). – Holger Jun 24 '16 at 17:02
  • @Holger Then why do the [Oracle docs](https://docs.oracle.com/javase/tutorial/java/javaOO/lambdaexpressions.html#serialization) seem to suggest that they don't? – Justin Jun 24 '16 at 17:18
  • Well, it contains a reference to the [inner class related problems with serialization](https://docs.oracle.com/javase/tutorial/java/javaOO/nested.html#serialization). In short, it’s potentially creating compiler dependencies, not JRE-specific issues. Granted, the text there is a bit misleading. And mind the danger of accidentally serializing captured values of the surrounding context, including `this`… – Holger Jun 24 '16 at 17:30

1 Answer


It depends on which kind of safety you want. It’s not the case that serialized lambdas cannot be shared between different JREs. They have a well-defined persistent representation, the SerializedLambda. When you study how it works, you’ll find that it relies on the presence of the defining class, which will have a special method that reconstructs the lambda.
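To make this concrete, here is a minimal sketch of a serializable lambda surviving a round trip through a byte array within one JVM. The class name `LambdaRoundTrip` and the helpers `roundTrip` and `nonEmpty` are made up for illustration; the intersection cast `(Predicate<String> & Serializable)` is what makes the lambda serializable in the first place.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.function.Predicate;

public class LambdaRoundTrip {

    // Hypothetical helper: writes an object to a byte array and reads it back.
    static Object roundTrip(Object obj) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(obj);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    // The intersection cast turns the lambda into a serializable one.
    static Predicate<String> nonEmpty() {
        return (Predicate<String> & Serializable) s -> !s.isEmpty();
    }

    public static void main(String[] args) {
        @SuppressWarnings("unchecked")
        Predicate<String> copy = (Predicate<String>) roundTrip(nonEmpty());
        System.out.println(copy.test("abc")); // true
        System.out.println(copy.test(""));    // false
    }
}
```

Deserialization only works here because `LambdaRoundTrip`, the defining class, is available when the stream is read back.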

What makes it unreliable is the dependency on compiler-specific artifacts, e.g. the synthetic target method, which has a generated name. Simple changes like inserting another lambda expression or recompiling the class with a different compiler can therefore break compatibility with existing serialized lambda expressions.
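If you want to see the generated name yourself, the following sketch pulls the `SerializedLambda` out of a serializable lambda by reflectively calling the `writeReplace` method that the runtime adds to the lambda's class. `SyntheticNameDemo`, `nonEmpty` and `describe` are hypothetical names, and the reflective access is a diagnostic trick rather than something to rely on in production code.

```java
import java.io.Serializable;
import java.lang.invoke.SerializedLambda;
import java.lang.reflect.Method;
import java.util.function.Predicate;

public class SyntheticNameDemo {

    static Predicate<String> nonEmpty() {
        return (Predicate<String> & Serializable) s -> !s.isEmpty();
    }

    // Hypothetical helper: obtains the persistent form of a serializable lambda
    // by invoking the private writeReplace method of its runtime class.
    static SerializedLambda describe(Object lambda) {
        try {
            Method writeReplace = lambda.getClass().getDeclaredMethod("writeReplace");
            writeReplace.setAccessible(true);
            return (SerializedLambda) writeReplace.invoke(lambda);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        SerializedLambda form = describe(nonEmpty());
        // The class that must be present to deserialize the lambda
        System.out.println(form.getCapturingClass());
        // The compiler-chosen name of the synthetic target method; the exact
        // value depends on the compiler and on the other lambdas in the class
        System.out.println(form.getImplMethodName());
    }
}
```

The second printed value is the part of the serialized form that can change when the class is edited or recompiled.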

However, using manually written classes isn’t immune to this. Without an explicitly declared serialVersionUID, the default algorithm will calculate an id by hashing class artifacts, including private and synthetic ones, adding a similar compiler dependency. So the minimum to do, if you want reliable persistent forms, is to declare an explicit serialVersionUID.
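As a sketch of that minimum, the class from the question could declare its id explicitly. The class name `IsNonEmptyPredicate` and the value `1L` are arbitrary choices for this example; what matters is that the value is fixed by you rather than computed from class artifacts.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;
import java.util.function.Predicate;

public class IsNonEmptyPredicate implements Predicate<String>, Serializable {

    // Explicit id: no longer derived by hashing members of the class
    private static final long serialVersionUID = 1L;

    @Override
    public boolean test(String s) {
        return !s.isEmpty();
    }

    public static void main(String[] args) {
        // Shows the id that the serialization runtime will use for this class
        long id = ObjectStreamClass.lookup(IsNonEmptyPredicate.class).getSerialVersionUID();
        System.out.println(id); // 1
    }
}
```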

Or you turn to the most robust form possible:

public enum IsNonEmpty implements Predicate<String> {
    INSTANCE;

    @Override
    public boolean test(String s) {
        return !s.isEmpty();
    }
}

Serializing this constant does not store any properties of the actual implementation, besides its class name (and the fact that it is an enum, of course) and a reference to the name of the constant. Upon deserialization, the actual unique instance of that name will be used.
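A small sketch can confirm this behavior: after a round trip through a byte array, the deserialized object is the very same instance as the constant. The wrapper class `EnumRoundTrip` and the helper `roundTrip` are hypothetical names used only for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.function.Predicate;

public class EnumRoundTrip {

    enum IsNonEmpty implements Predicate<String> {
        INSTANCE;

        @Override
        public boolean test(String s) {
            return !s.isEmpty();
        }
    }

    // Hypothetical helper: writes an object to a byte array and reads it back.
    static Object roundTrip(Object obj) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(obj);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Object copy = roundTrip(IsNonEmpty.INSTANCE);
        // Enum constants are resolved by name, so identity is preserved
        System.out.println(copy == IsNonEmpty.INSTANCE); // true
    }
}
```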


Note that serializable lambda expressions may create security issues, because they open an alternative way of getting hold of an object that allows invoking the target methods. However, this applies to all serializable classes: all variants shown in your question and in this answer allow deliberately deserializing an object through which the encapsulated operation can be invoked. But with explicit serializable classes, the author is usually more aware of this fact.

Holger
  • I don't see any danger with the enum; when serialized, it's basically just the class name and instance name. How is that a security issue? Is the problem basically that an attacker could get access to `INSTANCE` when I didn't intend for that to happen? – Justin Jun 24 '16 at 18:18
  • Exactly. Making a class serializable is like adding an additional `public` constructor (or accessor), which could be used even when the class itself is not `public`. In combination with a common interface like `Predicate`, it implies providing access to the encapsulated operation. If the operation itself is not critical, there’s no issue. – Holger Jun 27 '16 at 09:04