
I ran into a problem that occurs when using a method reference but not when using the equivalent lambda. The code was the following:

(Comparator<ObjectNode> & Serializable) SOME_COMPARATOR::compare

or, with a lambda,

(Comparator<ObjectNode> & Serializable) (a, b) -> SOME_COMPARATOR.compare(a, b)

Semantically, the two look strictly the same, but in practice they behave differently: with the method reference I get an exception in one of the Java serialization classes. My question is not about this exception, because the actual code runs in a more complicated context that has proved to behave strangely with serialization, so giving more details would just make the question harder to answer.

What I want to understand is the difference between those two ways of creating a lambda expression.

Lii
Dici
  • great question, did you try checking the bytecode ? – jmj May 28 '15 at 19:01
  • I'm very inexperienced with bytecode, I could surely learn but I was wondering if some people on SO have already experienced this kind of thing – Dici May 28 '15 at 19:03
  • http://docs.oracle.com/javase/7/docs/technotes/tools/windows/javap.html – jmj May 28 '15 at 19:03
  • See also: [How to serialize a lambda?](http://stackoverflow.com/questions/22807912/how-to-serialize-a-lambda) – Jesper May 28 '15 at 19:07
  • I can't reproduce any difference. http://ideone.com/8u8ZuL How do you serialize? What is `SOME_COMPARATOR`? Try to reduce to an MCVE. – Radiodef May 28 '15 at 19:16
  • @Radiodef as I said, the context is pretty complicated. It is Spark that serializes the objects (https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/serializer/JavaSerializer.scala), I can give more details tomorrow but I think it will just make it harder. I don't want to think about the exception right now, but more about the specificities of each syntax to – Dici May 28 '15 at 19:19
  • @Jesper yeah, I first learnt about multicasts with this post. Yet, no difference is mentioned between those two syntaxes – Dici May 28 '15 at 19:22
  • @Radiodef `SOME_COMPARATOR` is just a `Comparator` defined statically on a class – Dici May 28 '15 at 19:25
  • Java does not require that the class of the object created via the method reference is the same as that of the one created via the lambda, but it does require that both implement all of the interfaces named in the cast. It is conceivable that you're running into an implementation bug in that although the former *implements* `Serializable`, instances turn out not actually to *be* serializable. That can and does sometimes happen when you do not exercise sufficient care in writing your own classes that are intended to be serializable. – John Bollinger May 28 '15 at 19:38
  • @JohnBollinger possible indeed. The error is an `ArrayOutOfBoundException` in some deep (and non-public) code of Java serialization, and I had the same issue when trying to serialize `Constructor` (which is not serializable, so it makes sense that it fails). Using the debugger, I know that it fails somewhere in `ObjectOutputStream.writeObject0` (http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/8u40-b25/java/io/ObjectOutputStream.java#ObjectOutputStream.writeObject0%28java.lang.Object%2Cboolean%29). Unfortunately, I did not have enough time to find the exact line – Dici May 28 '15 at 19:52
  • @JigarJoshi the decompiled code (of a simple example) is pretty complicated to me... If you want to have a look : http://pastebin.com/rgLQ93er – Dici May 28 '15 at 20:04

2 Answers


Getting Started

To investigate this we start with the following class:

import java.io.Serializable;
import java.util.Comparator;

public final class Generic {

    // Bad implementation, only used as an example.
    public static final Comparator<Integer> COMPARATOR = (a, b) -> (a > b) ? 1 : -1;

    public static Comparator<Integer> reference() {
        return (Comparator<Integer> & Serializable) COMPARATOR::compare;
    }

    public static Comparator<Integer> explicit() {
        return (Comparator<Integer> & Serializable) (a, b) -> COMPARATOR.compare(a, b);
    }

}

After compilation, we can disassemble it using:

javap -c -p -s -v Generic.class

Removing the irrelevant parts (and some other clutter, such as fully-qualified types and the initialisation of COMPARATOR) we are left with

  public static final Comparator<Integer> COMPARATOR;    

  public static Comparator<Integer> reference();
      0: getstatic     #2  // Field COMPARATOR:LComparator;    
      3: dup    
      4: invokevirtual #3   // Method Object.getClass:()LClass;    
      7: pop    
      8: invokedynamic #4,  0  // InvokeDynamic #0:compare:(LComparator;)LComparator;    
      13: checkcast     #5  // class Serializable    
      16: checkcast     #6  // class Comparator    
      19: areturn

  public static Comparator<Integer> explicit();
      0: invokedynamic #7,  0  // InvokeDynamic #1:compare:()LComparator;    
      5: checkcast     #5  // class Serializable    
      8: checkcast     #6  // class Comparator    
      11: areturn

  private static int lambda$explicit$d34e1a25$1(Integer, Integer);
     0: getstatic     #2  // Field COMPARATOR:LComparator;
     3: aload_0
     4: aload_1
     5: invokeinterface #44,  3  // InterfaceMethod Comparator.compare:(LObject;LObject;)I
    10: ireturn

BootstrapMethods:    
  0: #61 invokestatic invoke/LambdaMetafactory.altMetafactory:(Linvoke/MethodHandles$Lookup;LString;Linvoke/MethodType;[LObject;)Linvoke/CallSite;    
    Method arguments:    
      #62 (LObject;LObject;)I    
      #63 invokeinterface Comparator.compare:(LObject;LObject;)I    
      #64 (LInteger;LInteger;)I    
      #65 5    
      #66 0    

  1: #61 invokestatic invoke/LambdaMetafactory.altMetafactory:(Linvoke/MethodHandles$Lookup;LString;Linvoke/MethodType;[LObject;)Linvoke/CallSite;    
    Method arguments:    
      #62 (LObject;LObject;)I    
      #70 invokestatic Generic.lambda$explicit$df5d232f$1:(LInteger;LInteger;)I    
      #64 (LInteger;LInteger;)I    
      #65 5    
      #66 0

Immediately we see that the bytecode for the reference() method differs from the bytecode for explicit(). However, the most visible difference isn't the relevant one here; the bootstrap methods are more interesting.

An invokedynamic call site is linked to a method by means of a bootstrap method, which is a method specified by the compiler for the dynamically-typed language that is called once by the JVM to link the site.

(Java Virtual Machine Support for Non-Java Languages, emphasis theirs)

This is the code responsible for creating the CallSite used by the lambda. The Method arguments listed below each bootstrap method are the values passed as the variadic parameter (i.e. args) of LambdaMetafactory#altMetafactory.

Format of the Method arguments

  1. samMethodType - Signature and return type of method to be implemented by the function object.
  2. implMethod - A direct method handle describing the implementation method which should be called (with suitable adaptation of argument types, return types, and with captured arguments prepended to the invocation arguments) at invocation time.
  3. instantiatedMethodType - The signature and return type that should be enforced dynamically at invocation time. This may be the same as samMethodType, or may be a specialization of it.
  4. flags indicates additional options; this is a bitwise OR of desired flags. Defined flags are FLAG_BRIDGES, FLAG_MARKERS, and FLAG_SERIALIZABLE.
  5. bridgeCount is the number of additional method signatures the function object should implement, and is present if and only if the FLAG_BRIDGES flag is set.

In both cases here bridgeCount is 0, and so there is no sixth argument, which would otherwise be bridges: a variable-length list of additional method signatures to implement (given that bridgeCount is 0, I'm not entirely sure why FLAG_BRIDGES is set).

Matching the above up with our arguments, we get:

  1. The function signature and return type (Ljava/lang/Object;Ljava/lang/Object;)I, which is the signature of Comparator#compare after generic type erasure.
  2. The method being called when this lambda is invoked (which is different).
  3. The signature and return type of the lambda, which will be checked when the lambda is invoked: (LInteger;LInteger;)I (note that these aren't erased, because this is part of the lambda specification).
  4. The flags, which in both cases are the combination of FLAG_BRIDGES and FLAG_SERIALIZABLE (i.e. 5).
  5. The number of bridge method signatures, 0.

We can see that FLAG_SERIALIZABLE is set for both lambdas, so it's not that.
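To make the flags value easier to read, here is a minimal sketch that names the bits set in it. FlagDecoder and decode are hypothetical names introduced for illustration; the three constants are the public fields of java.lang.invoke.LambdaMetafactory.

```java
import java.lang.invoke.LambdaMetafactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of the JDK): names the bits set in an altMetafactory flags value.
class FlagDecoder {

    static String decode(int flags) {
        List<String> names = new ArrayList<>();
        if ((flags & LambdaMetafactory.FLAG_SERIALIZABLE) != 0) {
            names.add("FLAG_SERIALIZABLE");
        }
        if ((flags & LambdaMetafactory.FLAG_MARKERS) != 0) {
            names.add("FLAG_MARKERS");
        }
        if ((flags & LambdaMetafactory.FLAG_BRIDGES) != 0) {
            names.add("FLAG_BRIDGES");
        }
        return String.join(" | ", names);
    }

    public static void main(String[] args) {
        // 5 is the flags value shown for both bootstrap methods above.
        System.out.println(decode(5)); // prints "FLAG_SERIALIZABLE | FLAG_BRIDGES"
    }
}
```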

Implementation methods

The implementation method for the method reference lambda is Comparator.compare:(LObject;LObject;)I, but for the explicit lambda it's Generic.lambda$explicit$df5d232f$1:(LInteger;LInteger;)I. Looking at the disassembly, we can see that the former is essentially an inlined version of the latter. The only other notable difference is the method parameter types (which, as mentioned earlier, is because of generic type erasure).

When is a lambda actually serializable?

You can serialize a lambda expression if its target type and its captured arguments are serializable.

Lambda Expressions (The Java™ Tutorials)

The important part of that is "captured arguments". Looking back at the disassembled bytecode, the invokedynamic instruction for the method reference certainly looks like it's capturing a Comparator (#0:compare:(LComparator;)LComparator;, in contrast to the explicit lambda, #1:compare:()LComparator;).
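One way to check the capture without reading bytecode is to ask the lambda for its serialized form. The sketch below relies on an implementation detail rather than a documented API: serializable lambdas produced by javac carry a private writeReplace method that returns a java.lang.invoke.SerializedLambda. CaptureInspector and capturedArgCount are hypothetical names, and the reflective access is assumed to be permitted (the class lives in the unnamed module here).

```java
import java.io.Serializable;
import java.lang.invoke.SerializedLambda;
import java.lang.reflect.Method;
import java.util.Comparator;

// Hypothetical inspector, assuming javac-compiled serializable lambdas expose writeReplace.
class CaptureInspector {

    // Same shape as COMPARATOR in Generic: a lambda that is not Serializable.
    static final Comparator<Integer> COMPARATOR = (a, b) -> (a > b) ? 1 : -1;

    static Comparator<Integer> reference() {
        return (Comparator<Integer> & Serializable) COMPARATOR::compare;
    }

    static Comparator<Integer> explicit() {
        return (Comparator<Integer> & Serializable) (a, b) -> COMPARATOR.compare(a, b);
    }

    // Returns the number of captured arguments, or -1 if the serialized form cannot be obtained.
    static int capturedArgCount(Object lambda) {
        try {
            Method writeReplace = lambda.getClass().getDeclaredMethod("writeReplace");
            writeReplace.setAccessible(true);
            SerializedLambda form = (SerializedLambda) writeReplace.invoke(lambda);
            return form.getCapturedArgCount();
        } catch (ReflectiveOperationException | RuntimeException e) {
            return -1;
        }
    }

    public static void main(String[] args) {
        System.out.println("reference captures: " + capturedArgCount(reference()));
        System.out.println("explicit captures:  " + capturedArgCount(explicit()));
    }
}
```

When compiled with javac, this would be expected to report one captured argument for the method reference and none for the explicit lambda.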

Confirming capturing is the issue

ObjectOutputStream contains an extendedDebugInfo field, which we can set using the -Dsun.io.serialization.extendedDebugInfo=true VM argument:

$ java -Dsun.io.serialization.extendedDebugInfo=true Generic

When we try to serialize the lambdas again, this gives a very satisfactory result:

Exception in thread "main" java.io.NotSerializableException: Generic$$Lambda$1/321001045
        - element of array (index: 0)
        - array (class "[LObject;", size: 1)
/* ! */ - field (class "invoke.SerializedLambda", name: "capturedArgs", type: "class [LObject;") // <--- !!
        - root object (class "invoke.SerializedLambda", SerializedLambda[capturingClass=class Generic, functionalInterfaceMethod=Comparator.compare:(LObject;LObject;)I, implementation=invokeInterface Comparator.compare:(LObject;LObject;)I, instantiatedMethodType=(LInteger;LInteger;)I, numCaptured=1])
    at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1182)
    /* removed */
    at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:348)
    at Generic.main(Generic.java:27)
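The main method that produced this trace is not shown, so the following is only a sketch of a comparable harness. SerializationCheck and trySerialize are hypothetical names; unlike the run above, the exception is caught and reported by name rather than propagated.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Comparator;

// Hypothetical harness: same comparator and factory methods as Generic, plus a serialization attempt.
class SerializationCheck {

    static final Comparator<Integer> COMPARATOR = (a, b) -> (a > b) ? 1 : -1;

    static Comparator<Integer> reference() {
        return (Comparator<Integer> & Serializable) COMPARATOR::compare;
    }

    static Comparator<Integer> explicit() {
        return (Comparator<Integer> & Serializable) (a, b) -> COMPARATOR.compare(a, b);
    }

    // Returns null on success, otherwise the simple name of the exception that was thrown.
    static String trySerialize(Object o) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(o);
            return null;
        } catch (IOException e) {
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        System.out.println("explicit:  " + trySerialize(explicit()));
        System.out.println("reference: " + trySerialize(reference()));
    }
}
```

Compiled with javac, the explicit lambda should serialize cleanly while the method reference should report a NotSerializableException, matching the trace above.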

What's actually going on

From the above, we can see that the explicit lambda is not capturing anything, whereas the method reference lambda is. Looking over the bytecode again makes this clear:

  public static Comparator<Integer> explicit();
      0: invokedynamic #7,  0  // InvokeDynamic #1:compare:()LComparator;    
      5: checkcast     #5  // class java/io/Serializable    
      8: checkcast     #6  // class Comparator    
      11: areturn

Which, as seen above, has an implementation method of:

  private static int lambda$explicit$d34e1a25$1(java.lang.Integer, java.lang.Integer);
     0: getstatic     #2  // Field COMPARATOR:Ljava/util/Comparator;
     3: aload_0
     4: aload_1
     5: invokeinterface #44,  3  // InterfaceMethod java/util/Comparator.compare:(Ljava/lang/Object;Ljava/lang/Object;)I
    10: ireturn

The explicit lambda is actually calling lambda$explicit$d34e1a25$1, which in turn calls COMPARATOR#compare. This layer of indirection means it's not capturing anything that isn't Serializable (or anything at all, to be precise), and so is safe to serialize. The method reference expression directly uses COMPARATOR (the value of which is then passed to the invokedynamic call site as a captured argument):

  public static Comparator<Integer> reference();
      0: getstatic     #2  // Field COMPARATOR:LComparator;    
      3: dup    
      4: invokevirtual #3   // Method Object.getClass:()LClass;    
      7: pop    
      8: invokedynamic #4,  0  // InvokeDynamic #0:compare:(LComparator;)LComparator;    
      13: checkcast     #5  // class java/io/Serializable    
      16: checkcast     #6  // class Comparator    
      19: areturn

The lack of indirection means that COMPARATOR must be serialized along with the lambda. As COMPARATOR does not refer to a Serializable value, this fails.
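As a rough analogy only (this is not what the compiler actually generates), the two forms behave like the hand-written classes below: one stores the comparator in a field, the other reads the static field on every call. CaptureAnalogy, Capturing and NonCapturing are hypothetical names.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Comparator;

// Hand-written analogy, not the real desugaring of either expression.
class CaptureAnalogy {

    // Not Serializable, like COMPARATOR in Generic.
    static final Comparator<Integer> COMPARATOR = (a, b) -> (a > b) ? 1 : -1;

    // Analogue of COMPARATOR::compare: the target object becomes part of the instance state.
    static final class Capturing implements Comparator<Integer>, Serializable {
        private final Comparator<Integer> target;

        Capturing(Comparator<Integer> target) {
            this.target = target;
        }

        @Override
        public int compare(Integer a, Integer b) {
            return target.compare(a, b);
        }
    }

    // Analogue of (a, b) -> COMPARATOR.compare(a, b): no state, the static field is read per call.
    static final class NonCapturing implements Comparator<Integer>, Serializable {
        @Override
        public int compare(Integer a, Integer b) {
            return COMPARATOR.compare(a, b);
        }
    }

    static boolean serializes(Object o) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(o);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(serializes(new Capturing(COMPARATOR))); // prints "false"
        System.out.println(serializes(new NonCapturing()));        // prints "true"
    }
}
```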

The fix

I hesitate to call this a compiler bug (I expect the lack of indirection serves as an optimisation), although it is very strange. The fix is trivial but ugly: add an explicit cast for COMPARATOR at its declaration:

public static final Comparator<Integer> COMPARATOR = (Serializable & Comparator<Integer>) (a, b) -> a > b ? 1 : -1;

This makes everything work correctly on Java 1.8.0_45. It's also worth noting that the Eclipse compiler produces that layer of indirection in the method reference case as well, so when compiled with Eclipse the original code in this post does not require modification to execute correctly.
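To see the fix end to end, the sketch below serializes the method reference and reads it back within the same JVM. FixedGeneric and roundTrip are hypothetical names; the COMPARATOR declaration is the one given above, with the same two types in the cast.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Comparator;

// Hypothetical variant of Generic with the fix applied.
class FixedGeneric {

    // The intersection cast makes the captured comparator itself serializable.
    static final Comparator<Integer> COMPARATOR =
            (Comparator<Integer> & Serializable) (a, b) -> a > b ? 1 : -1;

    static Comparator<Integer> reference() {
        return (Comparator<Integer> & Serializable) COMPARATOR::compare;
    }

    // Serializes the comparator to a byte array and deserializes it again.
    @SuppressWarnings("unchecked")
    static Comparator<Integer> roundTrip(Comparator<Integer> comparator) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(comparator);
            }
            try (ObjectInputStream in =
                         new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Comparator<Integer>) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("round trip failed", e);
        }
    }

    public static void main(String[] args) {
        Comparator<Integer> copy = roundTrip(reference());
        System.out.println(copy.compare(2, 1)); // prints "1"
    }
}
```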

  • Great answer ! Having struggled with serialization those last few days, I'm not surprised by your answer but I would never be able to bring facts to prove it. It is indeed very easy to make a reference to a non-serializable object in a lambda. This particular case was very subtle, thanks ! – Dici Jun 01 '15 at 18:46
  • In the listing of `reference`, is it correct that the result of the call to `getClass` is just discarded (with the `pop` instruction)? Do you have any explanation for this? Is `getClass` called only to trigger a NPE if `COMPARATOR` is null? – Lii Nov 28 '15 at 00:02
  • @Lii it's correct, yep. I haven't found an official explanation, but forcing an NPE is given as the reason [here](http://openjdk.5641.n7.nabble.com/Calls-to-getClass-when-using-method-references-td172501.html), by an active JDK contributor. –  Nov 29 '15 at 00:32
  • @Lii there has been a full discussion of this `getClass()` call later on in [In Java Lambda's why is getClass() called on a captured variable?](https://stackoverflow.com/q/43115645/2711488) – Holger Apr 07 '20 at 07:52

I want to add the fact that there is actually a semantic difference between a lambda and a method reference to an instance method (even when they have the same content as in your case, and disregarding serialisation):

SOME_COMPARATOR::compare

This form evaluates to a lambda object which is closed over the value of SOME_COMPARATOR at evaluation time (that is, it contains a reference to that object). It checks whether SOME_COMPARATOR is null at evaluation time and throws a NullPointerException right then. It will not pick up changes to the field that are made after its creation.

(a,b) -> SOME_COMPARATOR.compare(a,b)

This form evaluates to a lambda object which will access the value of the SOME_COMPARATOR field when called. If SOME_COMPARATOR is an instance field, the lambda is closed over this; if it is a static field, as in the question, it captures nothing. When called, it accesses the current value of SOME_COMPARATOR and uses that, potentially throwing a NullPointerException at that time.

Demonstration

This behaviour can be seen from the following small example. By stopping the code in a debugger and inspecting the fields of the lambdas one can verify what they are closed over.

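import java.util.function.Supplier;

public class Demo {
    Object o = "First";

    void run() {
        Supplier<String> ref = o::toString;            // captures the current value of o
        Supplier<String> lambda = () -> o.toString();  // captures this, reads o when called
        o = "Second";
        System.out.println("Ref: " + ref.get());       // Prints "First"
        System.out.println("Lambda: " + lambda.get()); // Prints "Second"
    }

    public static void main(String[] args) {
        new Demo().run();
    }
}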

Java Language Specification

The JLS describes this behaviour of method references in 15.13.3:

The target reference is the value of ExpressionName or Primary, as determined when the method reference expression was evaluated.

And:

First, if the method reference expression begins with an ExpressionName or a Primary, this subexpression is evaluated. If the subexpression evaluates to null, a NullPointerException is raised
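The timing of the null check described in these passages can be sketched as follows, assuming a compiler that follows JLS 15.13.3 (javac does). NullTiming and its three methods are hypothetical names introduced for illustration.

```java
import java.util.function.Supplier;

// Hypothetical demo of when a null target is detected for each form.
class NullTiming {

    static Object target = null;

    // Evaluating the method reference evaluates target and null-checks it immediately.
    static boolean referenceCreationThrows() {
        try {
            Supplier<String> ref = target::toString;
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    // Creating the lambda does not touch target at all.
    static boolean lambdaCreationThrows() {
        try {
            Supplier<String> lambda = () -> target.toString();
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    // The lambda body only dereferences target when it is invoked.
    static boolean lambdaCallThrows() {
        Supplier<String> lambda = () -> target.toString();
        try {
            lambda.get();
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("reference creation throws: " + referenceCreationThrows());
        System.out.println("lambda creation throws:    " + lambdaCreationThrows());
        System.out.println("lambda call throws:        " + lambdaCallThrows());
    }
}
```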

In Toby's code

This can be seen in Toby's listing of the code of reference, where getClass is called on the value of SOME_COMPARATOR, which will trigger an exception if it is null:

4: invokevirtual #3   // Method Object.getClass:()LClass;

(Or so I think, I'm really not an expert on byte code.)

Method references in code that is compiled with Eclipse 4.4.1 do not throw an exception in that situation, however. Eclipse seems to have a bug here.

Lii
  • Good addition, thanks :) I'm quite intrigued about the Eclipse bug you're talking about – Dici Nov 27 '15 at 23:57
  • @Dici: I did some more investigation, and it turns out the thing with NPE:s on creation is only a consequence of much more important difference! I've updated the answer. This has been very interesting to learn about! – Lii Nov 28 '15 at 11:55
  • This difference in semantics is actually pretty obvious once you know it. The code you've shown does what I would expect it to do. So actually the reason why the lambda is correctly serialized in this question is that it delays the evaluation of the expression, and because the comparator is static it can be accessed on the machine that received the deserialized object, since it has loaded the same classes as the machine that sent the lambda. Makes a lot of sense :) Thanks for highlighting this – Dici Nov 28 '15 at 17:59
  • @Dici The [Eclipse bug](https://bugs.eclipse.org/bugs/show_bug.cgi?id=521182) has been fixed in Eclipse 4.8. – Lii Oct 13 '18 at 14:53
  • Nice to know :p I have fully converted to IntelliJ though! It's so good... – Dici Oct 13 '18 at 21:12