Java 8 added functional programming constructs, including the Function interface and its static identity() method.

Here's the current structure of this method:

// Current implementation of this function in the [JDK source][1]
static <T> Function<T, T> identity() {
    return t -> t;
}

// Can be used like this
List<T> sameList = list.stream().map(Function.identity()).collect(Collectors.toList());

However, there's a second way to structure it:

// Alternative implementation of the method
static <T> T identity(T in) {
    return in;
}

// Can be used like this
List<T> sameList = list.stream().map(Function::identity).collect(Collectors.toList());

There's even a third way to structure it:

// Third implementation
static final Function<T, T> IDENTITY_FUNCTION = t -> t;

// Can be used like this
List<T> sameList = list.stream().map(Function.IDENTITY_FUNCTION).collect(Collectors.toList());

Of the three approaches, the first (the one actually used in the JDK) looks less memory-efficient, since it appears to create a new object (the lambda) on every call, while the second and third don't. According to this SO answer, that's not actually the case, so all three approaches seem roughly equivalent performance-wise.

Using the second approach allows the method to be used as a method reference, which is similar to how many other standard library methods are used in functional constructs. E.g. stream.map(Math::abs) or stream.map(String::toLowerCase).
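To make the comparison concrete, here is a minimal sketch that puts the two call-site styles side by side in a real collector. The class `IdentityStyles` and its `self` method are hypothetical names invented for this illustration; `self` only stands in for the alternative signature, since `Function` itself has no such method.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

public class IdentityStyles {
    // Hypothetical stand-in for the alternative signature discussed above.
    public static <T> T self(T in) {
        return in;
    }

    // Style 1: the factory method that the JDK actually provides.
    public static Map<String, Long> countWithFactory(List<String> words) {
        return words.stream()
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
    }

    // Style 2: a method reference to the hypothetical identity method.
    public static Map<String, Long> countWithMethodRef(List<String> words) {
        return words.stream()
                .collect(Collectors.groupingBy(IdentityStyles::self, Collectors.counting()));
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("a", "b", "a");
        // Both styles build the same map of word counts.
        System.out.println(countWithFactory(words).equals(countWithMethodRef(words))); // prints true
    }
}
```

Both variants produce the same result; the difference is only in how the identity function is spelled at the call site.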

Overall, why use the first approach, which looks (though ultimately isn't) less performant and is different from other examples?

Mark
  • @AndyTurner I don't think they're saying that those are the same, but rather that for the given implementation (identity as a function with signature `T(T)`), that would be the way to use it. – FallenWarrior Jan 17 '20 at 23:48
  • Your title doesn't agree with your question. – user207421 Jan 17 '20 at 23:59
  • Did you _try_ your "third implementation"? – chrylis -cautiouslyoptimistic- Jan 18 '20 at 00:30
  • `static int Math.abs(int)` and `String String::toLowerCase()` predate lambdas, so they couldn't be changed; besides, those methods can also be called directly, instead of only as method references. In contrast, `T Function::identity(T in)` would be new, and nobody would ever call it directly (what would be the point?). – Andreas Jan 18 '20 at 01:23
  • You kind of mentioned the style used for streams, so it may be worth pointing out that, despite not being specified anywhere officially, we are *encouraged* to use static imports for methods which *provide* implementations of functional interfaces. So instead of `Collectors.toList()` we could write `toList()`, and instead of `Comparator.comparing(..)`, `comparing(...)`. This allows streams to be written like `.map(String::length).collect(groupingBy(identity(), counting()));`. In that example, IMO, a plain `identity()` is simpler than the `Function::identity` you mentioned. – Pshemo Jan 19 '20 at 14:26

1 Answer

TL;DR Using Function.identity() creates only one object, so it's very memory efficient.


The third implementation doesn't compile, because T is undefined there (a static field cannot declare or use a type parameter), so that's not an option.
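For illustration only, here is a sketch of how the idea behind the third variant could be made to compile: keep one shared instance typed on `Object` and hand it out through a generic method with an unchecked cast. The class name `CachedIdentity` is hypothetical, and this is not how the JDK implements `Function.identity()`; it only shows that the field itself cannot be generic.

```java
import java.util.function.Function;

public class CachedIdentity {
    // A single shared instance. It is typed on Object because a static
    // field cannot refer to a type parameter T.
    private static final Function<Object, Object> IDENTITY = t -> t;

    // The generic method supplies T; the cast is safe because the
    // function returns exactly the object it was given.
    @SuppressWarnings("unchecked")
    public static <T> Function<T, T> identity() {
        return (Function<T, T>) (Function<?, ?>) IDENTITY;
    }

    public static void main(String[] args) {
        Function<String, String> f = identity();
        Function<Integer, Integer> g = identity();
        System.out.println(f.apply("a") + g.apply(1));   // prints a1
        // Same field behind both calls, so the references are equal.
        System.out.println((Object) f == (Object) g);    // prints true
    }
}
```

Note that the result is again a method returning a `Function`, used as `identity()` rather than as a constant, so it ends up with the same call-site shape as the first implementation.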

In the second implementation, each place in the source where you write Function::identity creates its own new object instance.

In the first implementation, whenever you call Function.identity(), a reference to the same lambda object is returned.

It is simple to see for yourself. Start by putting the two identity methods in the same class, renamed to identity1 and identity2 so they can be told apart.

static <T> Function<T, T> identity1() {
    return t -> t;
}

static <T> T identity2(T in) {
    return in;
}

Write a test method that accepts a Function and prints the object, so we can see its unique identity, as reflected by the hash code.

static <A, B> void test(Function<A, B> func) {
    System.out.println(func);
}

Call the test method repeatedly to see if each one gets a new object instance or not (my code is in a class named Test).

test(Test.identity1());
test(Test.identity1());
test(Test.identity1());
test(Test::identity2);
test(Test::identity2);
for (int i = 0; i < 3; i++)
    test(Test::identity2);

Output

Test$$Lambda$1/0x0000000800ba0840@7adf9f5f
Test$$Lambda$1/0x0000000800ba0840@7adf9f5f
Test$$Lambda$1/0x0000000800ba0840@7adf9f5f
Test$$Lambda$2/0x0000000800ba1040@5674cd4d
Test$$Lambda$3/0x0000000800ba1440@65b54208
Test$$Lambda$4/0x0000000800ba1840@6b884d57
Test$$Lambda$4/0x0000000800ba1840@6b884d57
Test$$Lambda$4/0x0000000800ba1840@6b884d57

As you can see, multiple statements calling Test.identity1() all get the same object, but multiple statements using Test::identity2 all get different objects.

It is true that repeated executions of the same statement get the same object (as seen in the result from the loop), but that's different from the result obtained from different statements.
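The same experiment can be sketched with reference equality instead of printed hash codes. The class name `IdentityCheck` is an assumption made for this example; the two methods are the identity1 and identity2 from above.

```java
import java.util.function.Function;

public class IdentityCheck {
    public static <T> Function<T, T> identity1() {
        return t -> t;
    }

    public static <T> T identity2(T in) {
        return in;
    }

    public static void main(String[] args) {
        // Two separate calls to the factory method.
        Function<String, String> a = identity1();
        Function<String, String> b = identity1();
        // Two separate method reference expressions in the source.
        Function<String, String> c = IdentityCheck::identity2;
        Function<String, String> d = IdentityCheck::identity2;

        // As functions, both variants behave the same.
        System.out.println(a.apply("x") + " " + c.apply("x")); // prints x x

        // Whether instances are shared is left to the JVM, so no
        // particular result is promised for the next two lines.
        System.out.println("identity1() calls share an instance: " + (a == b));
        System.out.println("method reference sites share an instance: " + (c == d));
    }
}
```

On a JVM that behaves like the one that produced the output above, the first comparison would be expected to print true and the second false. The language specification allows an implementation either to allocate a new instance or to reuse an existing one, so other results are possible.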

Conclusion: Using Test.identity1() creates only one object, so it's more memory efficient than using Test::identity2.

Andreas
  • Thanks! Do you know why the method reference creates a new object each time? And is that required by the language spec or is it an artifact of Oracle's JVM (or whatever JVM you were testing on)? – Mark Jan 19 '20 at 01:02
  • @Mark I don't think there is anything in the spec about that. The fact that a particular written method reference creates a single reusable lambda object on first use is a JVM implementation decision. The implementation of `Function.identity()` in Oracle's API simply relies on that. Other implementations might not have the JVM do it, in which case they might implement `Function.identity()` to do such memoization itself. – Andreas Jan 19 '20 at 05:23
  • That’s explained in [Does a lambda expression create an object on the heap every time it's executed?](https://stackoverflow.com/a/27524543/2711488), but as said in [Is method reference caching a good idea in Java 8?](https://stackoverflow.com/a/23991339/2711488), if a particular implementation does not reuse the object produced for a stateless lambda, it must have a reason for that and we shouldn’t counteract it. In other words, a `Function.identity()` doing memoization itself would be a questionable thing. – Holger Jan 20 '20 at 12:55
  • As a fun fact, [`Objects.isNull(Object)`](https://docs.oracle.com/javase/8/docs/api/java/util/Objects.html#isNull-java.lang.Object-) does not follow the pattern of `Function.identity()`, but rather is intended to be used as `Objects::isNull`. It was added in the same release, and its documentation even says explicitly that it is intended to be used with a method reference, like the OP’s second variant. – Holger Jan 20 '20 at 13:06