3

When using Java 8 streams I often find that I need to refactor a multi-statement lambda expression. I will illustrate this with a simple example. Assume I have started writing this code:

Stream.of(1, 3).map(i -> {
    if (i == 1) {
        return "I";
    } else if (i == 3) {
        return "E";
    }
    return "";
}).forEach(System.out::println);

Now I am not very fond of the large lambda expression in the map call. Hence, I want to refactor it out of there. I see two options, either I make an instance of Function in my class:

private static Function<Integer, String> mapper = i -> {
    if (i == 1) {
        return "I";
    } else if (i == 3) {
        return "E";
    }
    return "";
};

and use it like this:

Stream.of(1, 3).map(mapper).forEach(System.out::println);

Or I simply make a method:

private static String map(Integer i) {
    if (i == 1) {
        return "I";
    } else if (i == 3) {
        return "E";
    }
    return "";
}

and use a method reference:

Stream.of(1, 3).map(Test::map).forEach(System.out::println);

Apart from the obvious matter of taste, are there any advantages or drawbacks to either approach?

For example, I know the stack traces become more readable in the method reference case, which is a small advantage.

Tagir Valeev
  • 97,161
  • 19
  • 222
  • 334
K Erlandsson
  • 13,408
  • 6
  • 51
  • 67
  • 3
    I would consider also `.map(i -> i == 1 ? "I" : i == 3 ? "E" : "")` – Tagir Valeev Aug 31 '15 at 11:39
  • 1
    The current impl will transform your lambda to a private method anyway. – Marko Topolnik Aug 31 '15 at 11:41
  • @TagirValeev agreed, in this simple case, but I often run into this in more complex cases where an external method is the only option. – K Erlandsson Aug 31 '15 at 11:41
  • 3
    I would personally prefer the method reference over the method returning a `Function` - I see the former as neater and more "functional" - the latter is very much pre-Java 8. I have a feeling that the method reference can be tidied to an `invokeDynamic` likely making it more efficient - but I don't have an references to back this up. – Boris the Spider Aug 31 '15 at 11:42

2 Answers2

6

Unless there's some additional magic I'm unaware of, the current lambda implementation will desugar your non-capturing lambda into a static method and will cache the lambda instance. By doing the same thing explicitly (a static final reference to a lambda), you're basically duplicating that implicit work, so you're ending up with two cached references to the same thing. You are also defeating the lazy initialization of the lambda instance, which you'd otherwise be getting for free.

This is why I would prefer just the method reference: it's simpler to write, more idiomatic, and also seems to be more lightweight in terms of implementation.

Marko Topolnik
  • 195,646
  • 29
  • 319
  • 436
  • 3
    I would add that `static final` will be initialized even if you don't call this particular method which contains your lambda. This effectively means that additional anonymous class will be generated by `LambdaMetafactory` even if you never use it. Using method-reference or in-place lambda this class will be generated only when it's actually necessary. – Tagir Valeev Aug 31 '15 at 11:55
  • 1
    Well, you’re not ending up having two `static` fields as the reusing of the lambda implementation instance doesn’t work using fields, but besides that, it’s correct, as discussed in [“Is method reference caching a good idea in Java 8?”](http://stackoverflow.com/q/23983832/2711488). What I also like to emphasize is that in the case the implementation does not cache an instance, it likely has a reason for this decision which manual caching counteracts. So usually, letting the JRE decide is the best choice. – Holger Aug 31 '15 at 12:06
  • @Holger You mean, you've checked specifically this code? So there actually is a mechanism which I refer to as "additional magic I'm unaware of"? BTW the reason could be precisely the fact that the caching is done manually. – Marko Topolnik Aug 31 '15 at 12:07
  • More generally, the lambda creation code is guaranteed to execute only once, this is very easy to know. – Marko Topolnik Aug 31 '15 at 12:13
  • @Holger No, I mean the part where you found out that it isn't cached. But now I see I misread: you didn't write "in this case", but "in _the_ case". – Marko Topolnik Aug 31 '15 at 12:15
  • 2
    Ok. By the way the bootstrapping *might* run concurrently. It’s only guaranteed that at most one result will be used afterwards, so in case of a concurrent generation, all other results get dropped. But afaik, in Oracle’s implementation the factory itself will lock explicitly to avoid doing work twice in the concurrent case. – Holger Aug 31 '15 at 12:19
  • @Holger `indy` will always remember the result---that's understood. But the result is a lambda factory instance, right? And that factory holds the cached instance of the stateless lambda? – Marko Topolnik Aug 31 '15 at 12:22
  • 2
    The result is a `CallSite` instance holding a `MethodHandle`. In the case of a stateless lambda, it’s a [constant handle](http://docs.oracle.com/javase/8/docs/api/java/lang/invoke/MethodHandles.html#constant-java.lang.Class-java.lang.Object-) holding a constructed lambda instance. Otherwise, it will be a handle pointing to the constructor of the lambda implementation class (or a `static` factory method of that class, I don’t know why they decided to do it that way). Either way, subsequent executions of the `invokedynamic` instruction behave like executing the `MethodHandle`. – Holger Aug 31 '15 at 12:31
5

Some of the reasons to refactor the multi-line lambda expression into an ordinary method, and to use a method reference in the stream, are maintainability and testability.

When the non-trivial mapper function is turned into an ordinary method, it acquires a name, and it becomes available to unit testing frameworks. You can easily write tests that call it directly, and it can be stubbed out if necessary.

The method can also have documentation comments associated with it, including documentation for parameters and return values.

Both of these are quite useful if the mapper function is complicated.

It's not wrong to assign a lambda expression to a field, but I think the main reason for doing so would be if the field is modified at runtime, e.g., when the state of the application changes. Even in these cases I might consider writing ordinary methods and then assigning the field using a method reference instead of a multi-line lambda expression.

One drawback of using a field is that it requires you to declare explicitly the generic type parameters for the functional interface. IDEs can help with this, but it adds clutter to the program.

I would guess that an ordinary method with a method reference is always preferable to using a final field initialized with a lambda expression.

Stuart Marks
  • 127,867
  • 37
  • 205
  • 259
  • I was almost going to mention the opposite: the only good reason to have a static lambda field are the explicit type parameters---in a case where you run into issues with type inference. – Marko Topolnik Aug 31 '15 at 16:07
  • @MarkoTopolnik I guess that's a valid use of a static lambda field, but usually to help type inference I use a type witness or a cast. – Stuart Marks Aug 31 '15 at 16:36
  • Yes, but if you had a lot of use sites where you'd have to use the same, that would be a case for a global variable. It's probably very rare, though. – Marko Topolnik Aug 31 '15 at 16:38