
When using the Java 8 Stream API, is there a benefit to combining multiple map calls into one, or does it not really affect performance?

For example:

stream.map(SomeClass::operation1).map(SomeClass::operation2);

versus

stream.map(o -> o.operation1().operation2());
Mike
  • I wouldn't expect this to make a significant difference. Do whatever you find most readable. – Louis Wasserman Jan 27 '16 at 21:14
  • Honestly, I think the cost of your methods `operation1` and `operation2` will be _much_ more important than this even if there's a difference. So, write working code, then benchmark, that's the only way to know for sure. – Tunaki Jan 27 '16 at 21:14
  • I guess if dealing with parallel streams, example 1 could be parallelized on operation1, and the streamed output parallelized again for operation2, but example 2 would only have one level of parallelization? – Glenn Jan 27 '16 at 21:27
  • Since the streams are immutable, wouldn't example 1 cause a new list with references of the objects in the stream to be created, which could potentially hog up more memory? – deepmindz Jan 28 '16 at 03:05
  • Related: http://stackoverflow.com/q/31058755/4856258 – Tagir Valeev Jan 28 '16 at 03:20
  • @stanfordude, of course not. – Tagir Valeev Jan 28 '16 at 03:21
  • @Glenn, no, it does not work this way. A parallel stream processes every input element as a whole, without splitting the processing across different threads. It just sends some input elements to other threads. – Tagir Valeev Jan 28 '16 at 03:22
  • http://stackoverflow.com/q/24054773/2711488 – Holger Jan 28 '16 at 13:15
  • @Glenn: this is a theoretical possibility, which isn’t used in the current implementation and won’t be, given the current CPU architecture. It would only pay off with massively parallel computation engines and much lower synchronization cost (say, when using GPU computing). Once this becomes relevant, I’d expect HotSpot (or its successor) to be able to dissolve the code of a single method/lambda expression for parallel execution as well. Currently, the processed data is split, not the operations (as Tagir already pointed out). – Holger Jan 28 '16 at 13:21
  • Does this answer your question? [Using multiple map functions vs. a block statement in a map in a java stream](https://stackoverflow.com/questions/31058755/using-multiple-map-functions-vs-a-block-statement-in-a-map-in-a-java-stream) – Faiz Kidwai Jan 04 '20 at 07:29

1 Answer


The performance overhead here is negligible for most business-logic operations. With the chained version you have two additional method calls in the pipeline (which may not be inlined by the JIT compiler in a real application). You also have a longer call stack (by one frame), so if an exception is thrown inside a stream operation, creating it would be a little slower. These things might be significant if your stream performs really low-level operations like simple math. However, most real problems have a much bigger computational cost, so the relative performance drop is unlikely to be noticeable. And if you actually perform simple math and need the performance, it's better to stick with plain old for loops instead. Use the version you find more readable and do not optimize prematurely.
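To make the three variants concrete, here is a minimal sketch. The `operation1` and `operation2` methods are hypothetical stand-ins (simple integer math) for the ones in the question, and the class name is made up for illustration. It only shows that the variants produce the same result; it is not a benchmark.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class MapChainingDemo {
    // Hypothetical stand-ins for operation1 and operation2 from the question.
    static int operation1(int x) { return x + 1; }
    static int operation2(int x) { return x * 2; }

    // Two separate map stages, as in the first example of the question.
    static List<Integer> chained(List<Integer> input) {
        return input.stream()
                .map(MapChainingDemo::operation1)
                .map(MapChainingDemo::operation2)
                .collect(Collectors.toList());
    }

    // One map stage that applies both operations in a single lambda.
    static List<Integer> combined(List<Integer> input) {
        return input.stream()
                .map(x -> operation2(operation1(x)))
                .collect(Collectors.toList());
    }

    // Plain for loop, the fallback suggested for simple math on hot paths.
    static List<Integer> plainLoop(List<Integer> input) {
        List<Integer> result = new ArrayList<>(input.size());
        for (int x : input) {
            result.add(operation2(operation1(x)));
        }
        return result;
    }

    public static void main(String[] args) {
        List<Integer> input = Arrays.asList(1, 2, 3);
        System.out.println(chained(input));   // [4, 6, 8]
        System.out.println(combined(input));  // [4, 6, 8]
        System.out.println(plainLoop(input)); // [4, 6, 8]
    }
}
```

If the difference ever matters for your workload, measure these variants with a proper benchmark harness rather than relying on intuition.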

Tagir Valeev