
Say I have a List of objects that were defined using lambda expressions (closures). Is there a way to inspect them so they can be compared?

The code I am most interested in is

    List<Strategy> strategies = getStrategies();
    Strategy a = (Strategy) this::a;
    if (strategies.contains(a)) { // ...

The full code is

import java.util.Arrays;
import java.util.List;

public class ClosureEqualsMain {
    interface Strategy {
        void invoke(/*args*/);
        default boolean equals(Object o) { // doesn't compile
            return Closures.equals(this, o);
        }
    }

    public void a() { }
    public void b() { }
    public void c() { }

    public List<Strategy> getStrategies() {
        return Arrays.asList(this::a, this::b, this::c);
    }

    private void testStrategies() {
        List<Strategy> strategies = getStrategies();
        System.out.println(strategies);
        Strategy a = (Strategy) this::a;
        // prints false
        System.out.println("strategies.contains(this::a) is " + strategies.contains(a));
    }

    public static void main(String... ignored) {
        new ClosureEqualsMain().testStrategies();
    }

    enum Closures {;
        public static <Closure> boolean equals(Closure c1, Closure c2) {
            // This doesn't compare the contents
            // like other immutables, e.g. String
            return c1.equals(c2);
        }

        public static <Closure> int hashCode(Closure c) {
            return // a hashCode which can detect duplicates for a Set<Strategy>
        }

        public static <Closure> String asString(Closure c) {
            return // something better than Object.toString();
        }
    }    

    public String toString() {
        return "my-ClosureEqualsMain";
    }
}

It would appear the only solution is to define each lambda as a field and only use those fields. If you want to print out the method called, you are better off using Method. Is there a better way with lambda expressions?

Also, is it possible to print a lambda and get something human readable? If you print this::a instead of

ClosureEqualsMain$$Lambda$1/821270929@3f99bd52

get something like

ClosureEqualsMain.a()

or even use this.toString and the method.

my-ClosureEqualsMain.a();
Stuart Marks
Peter Lawrey
  • You can define toString, equals and hashCode methods within the closure. – Ankit Zalani Jun 07 '14 at 09:45
  • @AnkitZalani Can you give an example which compiles? – Peter Lawrey Jun 07 '14 at 09:46
  • @PeterLawrey, Since `toString` is defined on `Object`, I think you can define an interface that provides a default implementation of `toString` without violating the *single-method* requirement for interfaces to be functional. I haven't checked this though. – Mike Samuel Jun 07 '14 at 15:45
  • @MikeSamuel That's incorrect. Classes do not inherit default Object methods declared in interfaces; see http://stackoverflow.com/questions/24016962/java8-why-is-it-forbidden-to-define-a-default-method-for-a-method-from-java-lan/24026292#24026292 for explanation. – Brian Goetz Jun 08 '14 at 00:20

3 Answers


This question could be interpreted relative to the specification or the implementation. Obviously, implementations could change, but you might be willing to rewrite your code when that happens, so I'll answer both.

It also depends on what you want to do. Are you looking to optimize, or are you looking for ironclad guarantees that two instances are (or are not) the same function? (If the latter, you're going to find yourself at odds with computational physics, in that even problems as simple as asking whether two functions compute the same thing are undecidable.)

From a specification perspective, the language spec promises only that the result of evaluating (not invoking) a lambda expression is an instance of a class implementing the target functional interface. It makes no promises about the identity, or degree of aliasing, of the result. This is by design, to give implementations maximal flexibility to offer better performance (this is how lambdas can be faster than inner classes; we're not tied to the "must create unique instance" constraint that inner classes are.)

So basically, the spec doesn't give you much, except obviously that two lambdas that are reference-equal (==) are going to compute the same function.

From an implementation perspective, you can conclude a little more. There is (currently, may change) a 1:1 relationship between the synthetic classes that implement lambdas, and the capture sites in the program. So two separate bits of code that capture "x -> x + 1" may well be mapped to different classes. But if you evaluate the same lambda at the same capture site, and that lambda is non-capturing, you get the same instance, which can be compared with reference equality.
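To see this on your own JVM, here is a minimal sketch. The class `CaptureSiteDemo` and its methods are hypothetical names made up for illustration, and whatever it prints is an implementation detail of the runtime you happen to use, not something the specification promises, which is why no expected output is shown.

```java
import java.util.function.IntUnaryOperator;

final class CaptureSiteDemo {
    // One capture site holding a non-capturing lambda; evaluated on every call.
    static IntUnaryOperator sameSite() {
        return x -> x + 1;
    }

    // A second capture site with textually identical code.
    static IntUnaryOperator otherSite() {
        return x -> x + 1;
    }

    // A capturing lambda: the value of delta has to be stored in the instance.
    static IntUnaryOperator capturing(int delta) {
        return x -> x + delta;
    }

    public static void main(String[] args) {
        // All three results are implementation details; run this to observe them.
        System.out.println("same site:      " + (sameSite() == sameSite()));
        System.out.println("different site: " + (sameSite() == otherSite()));
        System.out.println("capturing:      " + (capturing(1) == capturing(1)));
    }
}
```

Whatever the identity comparisons print, all three lambdas still compute what you would expect, which is the only thing the language actually guarantees.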

If your lambdas are serializable, they'll give up their state more easily, in exchange for sacrificing some performance and security (no free lunch.)

One area where it might be practical to tweak the definition of equality is with method references because this would enable them to be used as listeners and be properly unregistered. This is under consideration.
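In the meantime, the reliable way to unregister a listener is to evaluate the method reference once, keep the result in a variable, and pass that same reference to both calls. A minimal sketch, in which `ListenerDemo`, `register` and `unregister` are hypothetical names rather than any real listener API:

```java
import java.util.ArrayList;
import java.util.List;

final class ListenerDemo {
    private final List<Runnable> listeners = new ArrayList<>();

    // Evaluate the method reference once and keep the resulting instance.
    private final Runnable onEvent = this::handleEvent;

    private void handleEvent() { }

    // Register and unregister the very same instance, so identity-based
    // equals() is enough for List.remove to find it again.
    void register() {
        listeners.add(onEvent);
    }

    boolean unregister() {
        return listeners.remove(onEvent);
    }

    int listenerCount() {
        return listeners.size();
    }

    public static void main(String[] args) {
        ListenerDemo demo = new ListenerDemo();
        demo.register();
        System.out.println("removed: " + demo.unregister());
        System.out.println("left:    " + demo.listenerCount());
    }
}
```

Writing `listeners.remove(this::handleEvent)` instead would evaluate the method reference a second time, and nothing promises that the second evaluation yields an instance equal to the first.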

I think what you're trying to get to is: if two lambdas are converted to the same functional interface, are represented by the same behavior function, and have identical captured args, they're the same.

Unfortunately, this is both hard to do (for non-serializable lambdas, you can't get at all the components of that) and not enough (because two separately compiled files could convert the same lambda to the same functional interface type, and you wouldn't be able to tell.)
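For serializable lambdas those components can be reached. Here is a rough sketch, assuming the functional interface extends `Serializable` and that the runtime lets you reflectively call the private `writeReplace` method which the current implementation generates on the lambda's class (an implementation detail that may change). `LambdaDescriber`, `Strategy`, `Target`, `describe`, `sameFunction` and `asString` are hypothetical names; `java.lang.invoke.SerializedLambda` and its getters are real JDK API.

```java
import java.io.Serializable;
import java.lang.invoke.SerializedLambda;
import java.lang.reflect.Method;
import java.util.Objects;

final class LambdaDescriber {
    // Hypothetical serializable functional interface.
    interface Strategy extends Serializable {
        void invoke();
    }

    // Hypothetical receiver type with two methods to refer to.
    static final class Target {
        void a() { }
        void b() { }
    }

    // Asks the lambda for its serialized form without actually serializing it.
    static SerializedLambda describe(Serializable lambda) {
        try {
            Method writeReplace = lambda.getClass().getDeclaredMethod("writeReplace");
            writeReplace.setAccessible(true);
            return (SerializedLambda) writeReplace.invoke(lambda);
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("not a serializable lambda", e);
        }
    }

    // Same interface, same implementation method, equal captured arguments.
    static boolean sameFunction(Serializable x, Serializable y) {
        SerializedLambda a = describe(x);
        SerializedLambda b = describe(y);
        if (!a.getFunctionalInterfaceClass().equals(b.getFunctionalInterfaceClass())
                || !a.getImplClass().equals(b.getImplClass())
                || !a.getImplMethodName().equals(b.getImplMethodName())
                || !a.getImplMethodSignature().equals(b.getImplMethodSignature())
                || a.getCapturedArgCount() != b.getCapturedArgCount()) {
            return false;
        }
        for (int i = 0; i < a.getCapturedArgCount(); i++) {
            if (!Objects.equals(a.getCapturedArg(i), b.getCapturedArg(i))) {
                return false;
            }
        }
        return true;
    }

    // Readable name built from the implementation class and method.
    static String asString(Serializable lambda) {
        SerializedLambda sl = describe(lambda);
        return sl.getImplClass().replace('/', '.') + "." + sl.getImplMethodName() + "()";
    }

    public static void main(String[] args) {
        Target target = new Target();
        Strategy first = target::a;
        Strategy second = target::a;
        System.out.println(asString(first));
        System.out.println(sameFunction(first, second));
    }
}
```

This only narrows the "hard to do" half of the problem. The "not enough" half still stands, and making the interface serializable brings the performance and security costs mentioned above.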

The EG discussed whether to expose enough information to be able to make these judgments, as well as discussing whether lambdas should implement more selective equals/hashCode or more descriptive toString. The conclusion was that we were not willing to pay anything in performance cost to make this information available to the caller (bad tradeoff, punishing 99.99% of users for something that benefits .01%).

A definitive conclusion on toString was not reached but left open to be revisited in the future. However, there were some good arguments made on both sides on this issue; this is not a slam-dunk.

Naman
Brian Goetz
  • +1 While I understand supporting `==` equality is a hard problem to solve generally, I would have thought there would be simple cases where the compiler, if not the JVM, could recognise that `this::a` on one line is the same as `this::a` on another line. In fact it is still not obvious to me what you gain by giving every call site its own implementation. Perhaps they can be optimised differently, but I would have thought inlining could do this? – Peter Lawrey Jun 07 '14 at 23:17
  • In any case, you have a bound object and a method to call and this would be enough for equality in this simple case, but I understand that a general solution is hard, and inconsistent behaviour does cause confusion e.g. Integer cache means some autoboxed references are == and others require equals(). – Peter Lawrey Jun 07 '14 at 23:20
  • Like the `Array` and `Arrays` utility classes for arrays, which exist because arrays couldn't get a decent equals, hashCode or toString, I can imagine a `Closures` utility class one day. As there are languages where you can print an array and see its contents, I imagine there are languages where you can print closures and get some insight as to what the closure does. (Possibly a String of the code is better but unsatisfactory to some) – Peter Lawrey Jun 07 '14 at 23:24
  • We investigated a number of possible implementations, including one in which the proxy classes were shared across callsites. The one we went with for now (one big benefit of the "metafactory" approach is that this can be changed without recompiling user classfiles) was the simplest and best-performing. We'll continue to monitor relative performance between options as the VM evolves, and when one of the others is faster, we'll switch. – Brian Goetz Jun 08 '14 at 00:18
  • As a side note, even the `MethodHandle`s used in the underlying binary interface [are not guaranteed to be canonical](http://docs.oracle.com/javase/specs/jvms/se8/html/jvms-5.html#jvms-5.4.3.5-400). So the meta factory can’t tell whether two references are targeting the same method by just comparing the references. It would have to perform deeper analysis if it tried to return the same implementation for equivalent method references. – Holger Jun 10 '14 at 11:21
  • Has there been any change up to now, or will be in Java 9, that merits an update on this answer? – user1803551 Feb 11 '17 at 16:54
  • No changes for Java 9. – Brian Goetz Feb 11 '17 at 17:22
  • Does your answer imply that implementing hashCode and equals makes no sense on an immutable class that takes only a lambda as delegate? (e.g. when you write an adapter that uses a lambda) – mmirwaldt Apr 18 '17 at 22:56
  • @mmirwaldt: there are no classes “that take[s] only a lambda as delegate”. Classes take implementations of a (functional) interface as delegate, without knowing how these interfaces are implemented, hence, without knowing whether the implementation will have a meaningful hashCode/equals implementation. When you say, e.g. `Optional.of(Object::toString)` or `Collections.singleton(s -> s.length())`, you already get instances of classes with hashCode/equals implementations wrapping an object of unspecified identity. Whether this is a problem depends on what you are going to do with it… – Holger Oct 09 '17 at 07:01
  • Regarding the example `Strategy a = (Strategy) this::a; // let's say this is of type X` I would like to be able to do something like this: `Object o = MethodReferences.extractTargetObject(a); // == this` and `java.lang.reflect.Method m = MethodReferences.extractTargetMethod(a); // == X.a` (or instead a MethodHandle bound to the target object + a way to get it from it)? – bodrin Nov 22 '18 at 18:39
  • Or why not make the lambda class that corresponds to the method reference implement a special interface like `MethodReference { Object getTargetObject(); Method getTargetMethod(); }`? – bodrin Nov 26 '18 at 15:31
  • getTargetObject() can return non null result for the "Reference to an instance method of a particular object" and null for the other 3 kinds of method refs defined here https://docs.oracle.com/javase/tutorial/java/javaOO/methodreferences.html – bodrin Nov 26 '18 at 15:44
  • @bodrin I presume your `getTargetObject()` method would return the receiver for a bound reference? Did it occur to you that this would be a significant security hole? When you capture a bound method reference and pass it to a method, users would be quite surprised to find they have also shared the unbound receiver. – Brian Goetz Nov 27 '18 at 15:57
  • @Brian, I see your point about the security hole. If there is currently no way to work around this, then I guess it's better not to allow it in the future. And then we can at least have getTargetMethod(), right? – bodrin May 31 '19 at 10:31
  • @BrianGoetz Drools has a variant on `Function` and `BiFunction` that implements `Serializable` so they can detect whether two lambdas are the same through serialization. That allows them to do node sharing for performance gains. Is there a better way to detect whether two lambdas are the same so we can use the normal Function and BiFunction? – Geoffrey De Smet Mar 26 '21 at 14:53
  • @BrianGoetz Thank you for answering. The Drools team believed they were doing a good thing by using this to enable "node sharing", which allows multiple rules calling `.filter(Predicate)` on the same objects to reuse the same RETE node network. This has been shown to bring high performance gains. We, at the OptaPlanner sibling team, now face the same choice (currently we're still using normal Function/BiFunction). The security-for-performance tradeoff of serializable lambdas (as also mentioned in your original answer) isn't yet understood by either team, I believe. – Geoffrey De Smet Mar 29 '21 at 11:56
  • Personally, I favor sticking with the normal (non-serializable) `Function`/`BiFunction`, even if it's just for standardization, but we're working on quantifying the performance loss due to lack of node sharing, to make a good decision here. Any hope to one day see a `Functions.equals(Function, Function)` in the JDK - to have our cake and eat it too - is welcome of course :) – Geoffrey De Smet Mar 29 '21 at 12:01
  • @GeoffreyDeSmet I am sure it was done with good intentions, but it happens so often that developers perceive they have "no choice" when performance is pitted against safety or maintainability. This is, as you suggest, likely amplified by the fact that the security concerns are not as well understood (though, when performance is concerned, it is sometimes hard to get people to listen to the other concerns.) – Brian Goetz Mar 29 '21 at 19:28
  • I agree. I presume this answer by Stuart details the security risks of lambda serialization: https://stackoverflow.com/a/39091232/472109 – Geoffrey De Smet Mar 31 '21 at 07:54
  • @GeoffreyDeSmet the missing possibility to validate data is one problem, but since lambda expressions and method references primarily encapsulate *behavior*, the bigger problem is that serialization allows arbitrary (untrusted) code to get access to that behavior. [This question](https://stackoverflow.com/q/25443655/2711488) contains a practical example. – Holger Apr 14 '21 at 07:25
  • Thanks @Holger - that's an interesting exploit – Geoffrey De Smet Apr 14 '21 at 07:47
  • @GeoffreyDeSmet And not just un-encapsulating behavior. If a lambda captures a local, for a non-serializable lambda that local is encapsulated, but for a serializable lambda, it is effectively public. So think about `s -> s.equals(theSecretPassword)`. – Brian Goetz Apr 14 '21 at 13:10
  • 7 years and several java versions later, still no solution. Mr. @BrianGoetz, we need this. In certain lines of work, (e.g. GUI development,) lambdas are predominantly used as event listeners, which need to be registered as well as unregistered, so Java's current behavior causes seemingly inexplicable event listener removal failures, which is outright treacherous. – Mike Nakis May 01 '21 at 15:35
  • @MikeNakis that’s the fault of whoever didn’t store the reference of the listener to deregister. There is no difference to how it worked before the introduction of lambda expressions, e.g. when using anonymous inner classes. – Holger May 25 '21 at 07:48
  • @Holger I am surprised you are saying this, because there is in fact a big difference. `foo(myAnonymousInnerClassInstanceImplementingMyInterface);` will receive the same instance each time it is called. However, as Java currently works, (as of Java 16 today,) `foo(this::myMethodImplementingMyInterface);` will receive a different instance each time it is called. – Mike Nakis May 25 '21 at 08:16
  • @MikeNakis because, regardless of the distracting long name, `myAnonymousInnerClassInstanceImplementingMyInterface` is *a variable containing a reference* to the instance. It’s irrelevant how the instance was created, you need such a variable holding a reference, to be able to use it twice, for registering and unregistering. You can’t use `register(new MyListener() { … })` with `unregister(new MyListener() { … })` and you can’t use `register(this::name)` with `unregister(this::name)`. Nothing has changed. Except the variable name, if you truly use names like in your example… – Holger May 25 '21 at 08:35
  • @Holger sorry I distracted you with the long name. It must be preventing you from seeing the problem. It must be making you see the keyword `new` in places where there is none. – Mike Nakis May 25 '21 at 09:48
  • @MikeNakis whether you instantiate an object using `new` or via a lambda expression, is irrelevant. All that matters, is, *you must store the result in a variable* to use it twice, i.e. for registering and unregistering. Of course, there is the keyword `new` when creating an anonymous inner class. Feel free to explain how you would use an anonymous inner class without it… – Holger May 25 '21 at 10:04

To compare lambdas I usually let the interface extend Serializable and then compare the serialized bytes. Not very nice, but it works in most cases.
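A minimal sketch of that approach, assuming a functional interface that extends `Serializable`. `SerializedLambdaCompare`, `Strategy`, `toBytes`, `sameLambda` and `lambdaHash` are hypothetical names chosen for illustration; only the `java.io` and `java.util.Arrays` calls are real API. Static method references are used so that nothing besides the lambda itself has to be serializable.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.Arrays;

final class SerializedLambdaCompare {
    // Hypothetical serializable functional interface.
    interface Strategy extends Serializable {
        void invoke();
    }

    static void a() { }
    static void b() { }

    // Serializes the lambda into a byte array.
    static byte[] toBytes(Serializable lambda) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(lambda);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Two lambdas count as equal here when their serialized forms match.
    static boolean sameLambda(Serializable x, Serializable y) {
        return Arrays.equals(toBytes(x), toBytes(y));
    }

    // A hash code consistent with sameLambda, usable for a Set or Map key wrapper.
    static int lambdaHash(Serializable lambda) {
        return Arrays.hashCode(toBytes(lambda));
    }

    public static void main(String[] args) {
        Strategy first = SerializedLambdaCompare::a;
        Strategy second = SerializedLambdaCompare::a;
        Strategy other = SerializedLambdaCompare::b;
        System.out.println("a vs a: " + sameLambda(first, second));
        System.out.println("a vs b: " + sameLambda(first, other));
    }
}
```

Keep in mind that captured values have to be serializable too, and that two lambda expressions with identical source text at different places in the code will generally not serialize to the same bytes.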

KIC
  • The same applies to hashCode of lambdas, doesn't it? I mean serializing a lambda to a byte array (with the help of ByteArrayOutputStream and ObjectOutputStream) and hashing it with Arrays.hashCode(...). – mmirwaldt Apr 06 '19 at 13:11

I don't see a way to get that information from the closure itself. The closure doesn't expose its state.

But you can use Java reflection if you want to inspect and compare the methods. Of course, that is not a very elegant solution, because of the performance cost and the exceptions that have to be caught. But this way you do get that meta-information.
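One reading of this suggestion is to skip lambdas and hold a `java.lang.reflect.Method` together with its target, since `Method` has a well-defined `equals`, `hashCode` and name. A minimal sketch under that assumption; `MethodStrategy`, `Sample` and `of` are hypothetical names, and only public no-argument methods are handled:

```java
import java.lang.reflect.Method;

final class MethodStrategy {
    // Hypothetical target type standing in for the class from the question.
    public static final class Sample {
        public void a() { }
        public void b() { }
        @Override public String toString() { return "my-Sample"; }
    }

    private final Object target;
    private final Method method;

    private MethodStrategy(Object target, Method method) {
        this.target = target;
        this.method = method;
    }

    // Looks up a public no-arg method by name; fails fast if it does not exist.
    static MethodStrategy of(Object target, String name) {
        try {
            return new MethodStrategy(target, target.getClass().getMethod(name));
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException(e);
        }
    }

    void invoke() {
        try {
            method.invoke(target);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Equal when bound to the same target instance and the same method.
    @Override
    public boolean equals(Object o) {
        if (!(o instanceof MethodStrategy)) {
            return false;
        }
        MethodStrategy other = (MethodStrategy) o;
        return target == other.target && method.equals(other.method);
    }

    @Override
    public int hashCode() {
        return 31 * System.identityHashCode(target) + method.hashCode();
    }

    @Override
    public String toString() {
        return target + "." + method.getName() + "()";
    }

    public static void main(String[] args) {
        Sample sample = new Sample();
        System.out.println(MethodStrategy.of(sample, "a"));
    }
}
```

The cost is the one already mentioned: reflective invocation is slower than a lambda call, and the method name is checked only at run time.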

F. Böller
  • +1 reflection allows me to get `this` as `arg$1` but not compare the method called. I may need to read the byte code to see if it is the same. – Peter Lawrey Jun 07 '14 at 10:42