216

I was asked this at an interview and I'm not convinced I gave the best answer I could have. I mentioned that you can do a parallel search and that null values were handled by some means I couldn't remember. Now I realize I was thinking of Optionals. What am I missing here? They claim it's better or more concise code but I'm not sure I agree.


Considering how succinctly it was answered, it seems that this wasn't too broad a question after all.


If they are asking this question at interviews, and clearly they are, what purpose could breaking it down serve other than to make it harder to find an answer? I mean, what are you looking for? I could break down the question and have all the sub-questions answered but then create a parent question with links to all the subquestions... seems pretty silly though. While we are at it, please give me an example of a less broad question. I know of no way to ask only part of this question and still get a meaningful answer. I could ask exactly the same question in a different way. For example, I could ask "What purpose do streams serve?" or "When would I use a stream instead of a for loop?" or "Why bother with streams instead of for loops?" These are all exactly the same question though.

...or is it considered too broad because someone gave a really long multi-point answer? Frankly anyone in the know could do that with virtually any question. If you happen to be one of the authors of the JVM, for example, you could probably talk about for loops all day long when most of us couldn't.

"Please edit the question to limit it to a specific problem with enough detail to identify an adequate answer. Avoid asking multiple distinct questions at once. See the How to Ask page for help clarifying this question."

As noted below, an adequate answer has been given which proves that there is one and that it is easy enough to provide.

user447607
  • 12
    This is imho opinion-based. Personally, I prefer streams because they make the code more readable. They let you write *what* you want instead of *how*. Moreover, it is totally badass to do amazing things with one-liners. – Arnaud Denoyelle May 25 '17 at 12:04
  • 35
    Even if it's a 30 line, one liner? I'm not fond of long chains. – user447607 May 25 '17 at 12:04
  • 2
    Besides, all I'm looking for here is the appropriate response for an interview. This is the only "opinion" that matters. – user447607 May 25 '17 at 12:06
  • @ArnaudDenoyelle I would agree on the *badass* thing, but if we talk about performance benchmarks, which one would be more optimal ? – dumbPotato21 May 25 '17 at 12:06
  • @ChandlerBing performance might not be a priority. – slim May 25 '17 at 12:07
  • 2
    Educationally speaking, this question saved me some degradation at a future interview too, @slim really nailed it, but Industrially speaking, it also speaks of how Microsoft programming languages built their careers on ripping off java language, and finally java gets it revenge by ripping off the **Lambda Expression and streams** from the opponents, lets see what java is going to do about Structs and Unions in the future :) – ShayHaned May 25 '17 at 13:02
  • 5
    Note that streams only tap a fraction of the power in functional programming :-/ – Thorbjørn Ravn Andersen May 25 '17 at 17:07
  • 1
    @ShayHaned we'll see that if/when Java gets actual type inference half as good as Roslyn's. – Mathieu Guindon May 25 '17 at 22:52
  • 1
    I don't think I can really make a whole answer out of this, at least not for this question, but streams use internal iteration, so they don't have to use a loop. People are quick to point to parallelism, but streams can also be implemented with recursion which is very useful for e.g. tree traversal. – Radiodef May 26 '17 at 01:53
  • @Mat's Mug , I strictly mean no offense to the **beauty and beast** within c# business, just trying to point out that java is yet missing the **beast**, maybe a day will come when java gets its type reference half as good as c#, because I can certainly imagine that they are working on the missing halves :) – ShayHaned May 26 '17 at 13:08

5 Answers

387

Interesting that the interview question asks about the advantages without asking about the disadvantages, for there are both.

Streams are a more declarative style. Or a more expressive style. It may be considered better to declare your intent in code, than to describe how it's done:

 return people.stream()
     .filter( p -> p.age() < 19)
     .collect(toList());

... says quite clearly that you're filtering matching elements from a list, whereas:

 List<Person> filtered = new ArrayList<>();
 for(Person p : people) {
     if(p.age() < 19) {
         filtered.add(p);
     }
 }
 return filtered;

Says "I'm doing a loop". The purpose of the loop is buried deeper in the logic.

Streams are often terser. The same example shows this. Terser isn't always better, but if you can be terse and expressive at the same time, so much the better.

Streams have a strong affinity with functions. Java 8 introduces lambdas and functional interfaces, which opens a whole toybox of powerful techniques. Streams provide the most convenient and natural way to apply functions to sequences of objects.

Streams encourage less mutability. This is sort of related to the functional programming aspect -- the kind of programs you write using streams tend to be the kind of programs where you don't modify objects.

Streams encourage looser coupling. Your stream-handling code doesn't need to know the source of the stream, or its eventual terminating method.

Streams can succinctly express quite sophisticated behaviour. For example:

 stream.filter(myfilter).findFirst();

Might look at first glance as if it filters the whole stream, then returns the first element. But in fact findFirst() drives the whole operation, so it efficiently stops after finding one item.
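That short-circuiting can be observed directly. Below is a minimal sketch, assuming a hypothetical `firstEven` helper and a counter added purely to record how many elements the filter touches (neither comes from the answer itself):

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.concurrent.atomic.AtomicInteger;

public class ShortCircuitDemo {
    // Hypothetical counter: records how many elements the filter examined.
    static final AtomicInteger examined = new AtomicInteger();

    // Hypothetical helper: returns the first even number, if any.
    static Optional<Integer> firstEven(List<Integer> numbers) {
        examined.set(0);
        return numbers.stream()
                .filter(n -> {
                    examined.incrementAndGet(); // side effect for demonstration only
                    return n % 2 == 0;
                })
                .findFirst(); // terminal operation pulls elements one at a time
    }

    public static void main(String[] args) {
        Optional<Integer> result = firstEven(Arrays.asList(1, 3, 4, 5, 6, 7));
        // prints "4 found after examining 3 elements"
        System.out.println(result.get() + " found after examining " + examined.get() + " elements");
    }
}
```

The elements 5, 6 and 7 are never passed to the filter, which is the behaviour a hand-written loop would need an explicit `break` to get.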

Streams provide scope for future efficiency gains. Some people have benchmarked and found that single-threaded streams from in-memory Lists or arrays can be slower than the equivalent loop. This is plausible because there are more objects and overheads in play.

But streams scale. As well as Java's built-in support for parallel stream operations, there are a few libraries for distributed map-reduce using Streams as the API, because the model fits.

Disadvantages?

Performance: A for loop through an array is extremely lightweight in terms of both heap and CPU usage. If raw speed and memory thriftiness are priorities, using a stream is worse.

Familiarity. The world is full of experienced procedural programmers, from many language backgrounds, for whom loops are familiar and streams are novel. In some environments, you want to write code that's familiar to that kind of person.

Cognitive overhead. Because of its declarative nature, and increased abstraction from what's happening underneath, you may need to build a new mental model of how code relates to execution. Actually you only need to do this when things go wrong, or if you need to deeply analyse performance or subtle bugs. When it "just works", it just works.

Debuggers are improving, but even now, when you're stepping through stream code in a debugger, it can be harder work than the equivalent loop, because a simple loop is very close to the variables and code locations that a traditional debugger works with.

slim
  • 4
    I think it'd be fair to note that stream-like stuff is becoming much more common and now shows up in a lot of commonly used languages that aren't especially FP-oriented. – Casey May 25 '17 at 20:16
  • 21
    Given the pros and cons listed here, I think streams are NOT worth it for anything other than very simple uses (little if/then/else logic, not many nested calls or lambdas, etc.) in parts that aren't performance-critical – Henrik Kjus Alstad Nov 21 '18 at 08:54
  • 14
    @HenrikKjusAlstad That is absolutely not the takeaway I intended to communicate. Streams are mature, powerful, expressive and completely appropriate for production-grade code. – slim Nov 26 '18 at 13:40
  • 5
    Oh, I didnt mean i wouldnt use it in production. But rather, I would default to old fashioned loops/ifs etc, rather than streams, especially if the resulting stream would look complex. I'm sure there are uses where a stream will beat loops and if's in clarity, but more often than not, I think it's "tied", or sometimes even the other way around. Thus, I'd put weight on the cognitive overhead argument for sticking to the old ways. – Henrik Kjus Alstad Nov 26 '18 at 13:55
  • @slim Good explanation. With streams, if we have multiple filters to apply, we end up chaining multiple `stream().filter` calls, whereas a single for loop could do it all. Would that be considered a disadvantage? – Pasupathi Rajamanickam Apr 30 '19 at 17:25
  • You could use `.filter(predicate1.and(predicate2))` or `.filter(predicate1).filter(predicate2)`. I would use the latter. I don't see why doing it within a loop is preferable. Why do you? – slim Apr 30 '19 at 21:47
  • @slim you say "The purpose of the loop is buried deeper in the logic.". But really you could (and probably should) just extract that loop into a method and name it appropriately, and the intention of the code would be 100% clear. – lijep dam Jan 20 '20 at 08:24
  • 8
    @lijepdam - but you'd still have code that says "I am iterating over this list (see inside the loop to find out why)", when "iterating over the list" is not the core intent of the code. – slim Jan 20 '20 at 08:41
  • In the answer it mentions performance as a disadvantage. Does anyone have actual performance numbers to quantify the difference in performance? – Eric Aug 29 '22 at 16:07
  • Do streams make deobfuscation easier? If I want to publish my code, I will. But else, it should be added to the cons. – alex Jan 09 '23 at 23:39
35

Syntactic fun aside, Streams are designed to work with potentially infinitely large data sets, whereas arrays, Collections, and nearly every Java SE class which implements Iterable are entirely in memory.
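As an illustration of an unbounded source, here is a minimal sketch using `Stream.iterate`; the `firstPowersOfTwo` helper is a hypothetical name chosen for this example:

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class InfiniteStreamDemo {
    // Hypothetical helper: takes the first `count` powers of two
    // from a conceptually infinite sequence.
    static List<Integer> firstPowersOfTwo(int count) {
        return Stream.iterate(1, n -> n * 2) // 1, 2, 4, 8, ... with no upper bound
                .limit(count)                // only this many elements are ever generated
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(firstPowersOfTwo(5)); // prints [1, 2, 4, 8, 16]
    }
}
```

Nothing is materialised until the terminal `collect` runs, and `limit` keeps the pipeline from running forever.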

A disadvantage of a Stream is that filters, mappings, etc., cannot throw checked exceptions. This makes a Stream a poor choice for, say, intermediate I/O operations.
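A common workaround is to wrap the checked exception in an unchecked one inside the lambda. The sketch below assumes a hypothetical `load` method standing in for any real I/O call; it is one possible approach, not the only one:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class CheckedExceptionDemo {
    // Hypothetical stand-in for an I/O operation that declares a checked exception.
    static String load(String name) throws IOException {
        if (name.isEmpty()) {
            throw new IOException("empty name");
        }
        return "contents of " + name;
    }

    static List<String> loadAll(List<String> names) {
        return names.stream()
                // .map(CheckedExceptionDemo::load) would not compile here,
                // because Function.apply declares no checked exceptions.
                .map(name -> {
                    try {
                        return load(name);
                    } catch (IOException e) {
                        throw new UncheckedIOException(e); // rethrow as unchecked
                    }
                })
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(loadAll(Arrays.asList("a.txt", "b.txt")));
    }
}
```

The try/catch boilerplate inside the lambda is exactly the clutter that makes streams awkward for I/O-heavy intermediate steps.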

VGR
  • 17
    Of course, you can loop over infinite sources too. – slim May 25 '17 at 14:38
  • 3
    But if the elements to process are persisted in a DB, how do you use Streams? A junior dev could be tempted to read all of them in a Collection just to use Streams. And that would be a disaster. – Lluis Martinez Nov 09 '17 at 19:13
  • 4
    @LluisMartinez a good DB client library will return something like `Stream` -- or it would be possible to write one's own `Stream` implementation wrapping DB result cursor operations. – slim Aug 31 '18 at 12:47
  • That Streams silently ignore exceptions is a bug I was recently bitten by. Unintuitive. – xxfelixxx Dec 18 '19 at 00:29
  • @xxfelixxx Streams do not silently ignore exceptions. Try running this: `Arrays.asList("test", null).stream().forEach(s -> System.out.println(s.length()));` – VGR Dec 18 '19 at 00:35
  • This is roughly the code I had, where `getDataType` could throw a RuntimeException: `Stream> temperatureStream = allData.stream().filter(v -> v.dataType == getDataType(TEMPERATURE)).map(vv -> new Data<>(v.timestamp, v.temperature));` It happily returned an empty stream... – xxfelixxx Dec 18 '19 at 00:45
  • @xxfelixxx Did you mean that `getDataType` *did* throw a RuntimeException, or did you mean that it *could* throw a RuntimeException? All RuntimeExceptions are unchecked exceptions, which means the compiler doesn’t require a caller to catch them, but that does not mean they’re ignored. They still get thrown and they still propagate up the call stack. – VGR Dec 18 '19 at 00:53
  • I mean that it did throw a RuntimeException, but that exception was never propagated. This was with Java 8. The function `getDataType` was basically a switch statement over an enum, so any unhandled values would trigger the RuntimeException. – xxfelixxx Dec 18 '19 at 01:51
  • 2
    @xxfelixxx You may want to ask a new question about it. That line of code alone doesn’t look like a problem, so make sure to include a [mre] so we can see your problem for ourselves. – VGR Dec 18 '19 at 02:23
  • 2
    @xxfelixxx that line of code doesn’t do anything. Since this never invokes the method, the method will not throw an exception. The Stream is not empty, it’s just unused. Like putting a loop into a method and then never calling the method. – Holger Apr 14 '21 at 07:41
11

I'd say it's parallelization that is so easy to use. Try iterating over millions of entries in parallel with a for loop. We are getting more CPUs, not faster ones, so the easier it is to run in parallel the better, and with Streams this is a breeze.
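A minimal sketch of what that looks like, assuming a hypothetical `sumOfSquares` helper where a flag switches the same pipeline between sequential and parallel execution:

```java
import java.util.stream.LongStream;

public class ParallelSumDemo {
    // Hypothetical helper: sums the squares of 1..n, optionally in parallel.
    static long sumOfSquares(long n, boolean parallel) {
        LongStream range = LongStream.rangeClosed(1, n);
        if (parallel) {
            range = range.parallel(); // the only change needed to go parallel
        }
        return range.map(x -> x * x).sum();
    }

    public static void main(String[] args) {
        long sequential = sumOfSquares(1_000_000, false);
        long parallel = sumOfSquares(1_000_000, true);
        System.out.println(sequential == parallel); // same result either way
    }
}
```

Whether the parallel version is actually faster depends on the workload and the machine, so it is worth measuring rather than assuming.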

What I like a lot is the expressiveness they offer. It takes little time to understand what they actually do and produce, as opposed to how they do it.

Eugene
8
  1. You realized incorrectly: parallel operations use Streams, not Optionals.

  2. You can define methods that work with streams: taking them as parameters, returning them, etc. You can't define a method which takes a loop as a parameter. This lets you define a complicated stream operation once and use it many times. Note that Java has a drawback here: your methods have to be called as someMethod(stream), as opposed to the stream's own stream.someMethod(), so mixing the two styles complicates reading. Try seeing the order of operations in

    myMethod2(myMethod(stream.transform(...)).filter(...))
    

    Many other languages (C#, Kotlin, Scala, etc) allow some form of "extension methods".

  3. Even when you only need sequential operations, and don't want to reuse them, so that you could use either streams or loops, simple operations on streams may correspond to quite complex changes in the loops.
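To illustrate point 2, here is a minimal sketch of a reusable stream-to-stream method. The names `nonBlankTrimmed` and `clean` are hypothetical and chosen only for this example:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class StreamPipelineDemo {
    // Hypothetical reusable step: takes a stream and returns a stream,
    // so callers can keep chaining further operations onto the result.
    static Stream<String> nonBlankTrimmed(Stream<String> lines) {
        return lines.map(String::trim).filter(s -> !s.isEmpty());
    }

    // Hypothetical caller: note the helper wraps the source (someMethod(stream)),
    // while the built-in operations chain after it (stream.someMethod()).
    static List<String> clean(List<String> raw) {
        return nonBlankTrimmed(raw.stream())
                .map(String::toUpperCase)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(clean(Arrays.asList(" a ", "", "   ", "b"))); // prints [A, B]
    }
}
```

The helper knows nothing about where the strings come from or what happens to them afterwards, which is what makes it reusable.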

Alexey Romanov
  • Explain 1. Isn't the Optional interface the means by which nulls are handled in chains? Regarding 3, that makes sense because with short circuited filters, the method will only be invoked for specified occurrences. Efficient. It makes sense that I could state that using them reduces the need to write additional code that will need to be tested etc. Upon review, I'm not sure what you mean by the sequential case in 2. – user447607 May 25 '17 at 12:10
  • 1. `Optional` is an alternative to `null`, but it doesn't have anything to do with parallel operations. Unless "Now I realize I was thinking of Optionals" in your question is only talking about `null` handling? – Alexey Romanov May 25 '17 at 12:26
  • I've changed the order of 2 and 3 and expanded both of them a bit. – Alexey Romanov May 25 '17 at 12:39
7

You loop over a sequence (array, collection, input, ...) because you want to apply some function to the elements of the sequence.

Streams give you the ability to compose functions on sequence elements, and they let you implement the most common functions (e.g. mapping, filtering, finding, sorting, collecting, ...) independently of any concrete case.

Therefore, given some looping task, in most cases you can express it with less code using Streams, i.e. you gain readability.

wero