12

I have been playing with the new and shiny functional part of Java, and one of the things that puzzles me the most is streams.

What is their use?

On Google I mostly found explanations of how to use them and practical examples, which I already have down, but nothing concrete about the magic behind the scenes, which is what interests me.

I don't mean in a practical sense. Coming from a few functional languages, I figured out map/filter/reduce/etc. fairly quickly, but why do we need to convert to a stream first? Java already has iterators. Is there a fundamental difference between stream and iterator like one being lazy and the other not? Or is it something else?

Bottom line: what is the fundamental difference between iterators and streams and what functionality couldn't be implemented as an extension to iterators and needed a whole new family of types?

Michele Dorigatti
Rares Dima
  • Does this answer your question? [Iterator versus Stream of Java 8](https://stackoverflow.com/questions/31210791/iterator-versus-stream-of-java-8) – Miguel Gamboa Jun 05 '20 at 11:42

5 Answers

14

Streams in general are a vast topic. However, I will explain why you should favour the stream API over iterators.

First and foremost, with the stream API, we can now program at a much higher level of abstraction, just like SQL queries, i.e. we express what we want and let the library handle the rest.

Second, stream operations perform their iteration behind the scenes (internal iteration), which means the data could be processed in parallel or in a different order that may be better optimised.

On the other hand, if you explicitly iterate over your collection to perform some computation, whether with an Iterator or with its syntactic sugar (the enhanced for loop), then you are taking the items in the collection and processing them one by one, so the traversal is inherently serial.

Using iterators instead of the stream API also means a lot more work has to be done when you want to go parallel or find different ways to optimise your program.

It also means that you spend much more time dealing with low-level details instead of focusing on what you want your program to do.
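To make the contrast between external and internal iteration concrete, here is a minimal sketch. The class and method names (`IterationStyles`, `sumOfEvenSquaresWithIterator`, `sumOfEvenSquaresWithStream`) are made up for illustration. The iterator version spells out the traversal step by step, while the stream version only states what is wanted, so switching to parallel execution is a one-word change at the source.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.stream.Stream;

class IterationStyles {

    // External iteration: the caller drives the loop, one element at a time.
    static int sumOfEvenSquaresWithIterator(List<Integer> numbers) {
        int sum = 0;
        Iterator<Integer> it = numbers.iterator();
        while (it.hasNext()) {
            int n = it.next();
            if (n % 2 == 0) {
                sum += n * n;
            }
        }
        return sum;
    }

    // Internal iteration: the pipeline describes the result; the library decides how to traverse.
    static int sumOfEvenSquaresWithStream(List<Integer> numbers, boolean parallel) {
        Stream<Integer> stream = parallel ? numbers.parallelStream() : numbers.stream();
        return stream
                .filter(n -> n % 2 == 0)   // keep even numbers
                .mapToInt(n -> n * n)      // square them as primitive ints
                .sum();
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5, 6);
        System.out.println(sumOfEvenSquaresWithIterator(numbers));
        System.out.println(sumOfEvenSquaresWithStream(numbers, false));
        System.out.println(sumOfEvenSquaresWithStream(numbers, true));
    }
}
```

All three calls compute the same sum. Whether the parallel variant is actually faster depends on the data size and the work per element, so it is worth measuring rather than assuming.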

As the book Java 8 in Action puts it:

> The internal iteration in the Streams library can automatically choose a data representation and implementation of parallelism to match your hardware. By contrast, once you’ve chosen external iteration by writing for-each, then you’ve essentially committed to self-manage any parallelism. (Self-managing in practice means either “one fine day we’ll parallelize this” or “starting the long and arduous battle involving tasks and synchronized”.)

Java 8 needed an interface like Collection but without iterators, ergo Stream!

Essentially, the stream API makes your life easier in many ways. What I find most useful is that you can spend more time focusing on what you want your code to do, and you can decide to go parallel without dealing with the low-level details.

This is of course not saying to always utilise streams wherever/whenever possible. Rather it's stating the benefits of using streams over Iterators.

There are certain places where it will be more appropriate to use Iterators rather than the stream API and vice versa. So choose wisely which approach to proceed with in terms of processing data in collections.

Ousmane D.
5

Adding Stream's methods to an existing Iterator was definitely a possibility, because default implementations could be provided for all the additional methods, but this API change comes with significant drawbacks:

  • Your streams become "married" to iterators, even in situations when you don't want an iterator (e.g. a generator stream)
  • You do not get streams from collections "for free": just as you call stream() today, you would still need to call iterator() first (which the enhanced for loop already does for you behind the scenes)
  • You still require a great deal of new types for primitive streams, because there is no similar concept in iterators. That's the functionality that would be hard to "graft" onto iterators.
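Two of the points above, generator streams and primitive streams, can be sketched with the standard `Stream.iterate` and `IntStream.rangeClosed` factories. The class and method names (`GeneratorStreams`, `firstPowersOfTwo`, `sumOfSquares`) are hypothetical and only serve the illustration.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
import java.util.stream.Stream;

class GeneratorStreams {

    // A generator stream: conceptually infinite, with no backing collection.
    // The caller never obtains or advances an iterator.
    static List<Long> firstPowersOfTwo(int count) {
        return Stream.iterate(1L, x -> x * 2)   // 1, 2, 4, 8, ...
                .limit(count)                   // cut the infinite sequence
                .collect(Collectors.toList());
    }

    // A primitive stream: operates on int values directly,
    // without boxing each one into an Integer.
    static int sumOfSquares(int upToInclusive) {
        return IntStream.rangeClosed(1, upToInclusive)
                .map(i -> i * i)
                .sum();
    }

    public static void main(String[] args) {
        System.out.println(firstPowersOfTwo(5));
        System.out.println(sumOfSquares(4));
    }
}
```

`Iterator<T>` is generic over reference types only, which is why an iterator-based design would still have needed separate `int`, `long` and `double` variants.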

The overall gain from merging streams with iterators did not look like much, so it seems the API designers went for a clean-sheet approach.

Sergey Kalinichenko
  • 2
    The generator stream example is genius. I love the concept of generators, but it never crossed my mind when I was thinking about streams! – Rares Dima Feb 18 '18 at 09:47
5

> Is there a fundamental difference between stream and iterator like one being lazy and the other not? Or is it something else?

Yes, the fundamental difference is that streams are iterated internally. When we run a stream, we say that we want all of this stuff, filtered by this condition, to give us this result. We do not, as a matter of course, say anything about how that should happen. This means the same source code could later be run in parallel, on a graphics card, or in some as yet unknown way. We just want this to happen.

A lot of interesting things can happen behind the scenes if we, as programmers, are explicit about what we don't care about. This is also a lot of the oomph behind the functional interfaces and lambda expressions. The idea is that if we say up front that we don't care, then the library and runtime can solve the problem in any way that works, rather than in exactly the way the program spelled out. Sometimes a different machine can solve things better in a different way, for example with better parallelization.
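One small, observable consequence of handing the "how" to the library is that a pipeline is only a description until a terminal operation runs, and a short-circuiting operation such as `findFirst` can stop early. The sketch below uses the standard `peek` operation to record which elements were actually visited. The class and method names are invented for this example, and the recorded order assumes a sequential stream.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Stream;

class LazyPipeline {

    // Building a pipeline without a terminal operation traverses nothing.
    static int visitedWithoutTerminalOperation(List<Integer> numbers) {
        List<Integer> visited = new ArrayList<>();
        Stream<Integer> pipeline = numbers.stream()
                .peek(visited::add)
                .filter(n -> n % 2 == 0);
        // No terminal operation was invoked on 'pipeline', so 'visited' is still empty.
        return visited.size();
    }

    // findFirst is short-circuiting: traversal stops at the first even number.
    static List<Integer> visitedUntilFirstEven(List<Integer> numbers) {
        List<Integer> visited = new ArrayList<>();
        numbers.stream()
                .peek(visited::add)          // record every element the pipeline pulls
                .filter(n -> n % 2 == 0)
                .findFirst();                // terminal operation triggers the traversal
        return visited;
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(1, 3, 4, 5, 6);
        System.out.println(visitedWithoutTerminalOperation(numbers));
        System.out.println(visitedUntilFirstEven(numbers));
    }
}
```

With the sample list, only the elements up to and including the first even number are recorded; the remaining elements are never touched.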

> Bottom line: what is the fundamental difference between iterators and streams and what functionality couldn't be implemented as an extension to iterators and needed a whole new family of types?

Iterators say how the problem needs to be solved: do this element, then this element, then this element. The library cannot know whether there is some deep and seemingly hidden reason for that order rather than some other one. Streams say you don't care: iterate forwards, backwards, on a thousand different processors, on the GPU, it doesn't matter.

Compare "I want every element processed in this way" with "I want one element after another processed in this way." The latter is needlessly restrictive.

Tatarize
4

In my mind, Java 8's streams are conceptually very much like Unix's pipes. You start with a certain set of data that you filter, manipulate, or perform actions upon, until you achieve the exact result that you like. The resulting code is also much less verbose than would be possible using traditional constructs. Others have already mentioned that streams are about the what as opposed to the how.

One particular example that I used at my job is scraping web sites. The JSoup library exposes the result of a CSS query as a type that implements Collection. Now, if you use it in Java 8, you get streams thrown in for free.

This means that you can select certain tags using a CSS query, filter out certain tags that you're not interested in, convert them into some object and stuff them in a list: all this in just a handful of lines.

It's entirely possible to do this in Java 7, but you need to declare the list, iterate through the tags, have a conditional statement, instantiate the object, add it to the list... It can easily take three times as many lines.

The advantages of not having to deal with the implementation are all valid and correct, but from a more business-oriented point of view, streams make your code a lot more readable, and therefore easier to maintain, including by other people. Bosses like that.
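The difference in verbosity can be sketched as follows. To keep the example self-contained it does not use JSoup at all: `Tag` is a hypothetical stand-in for an element returned by a CSS query, and the method names are likewise invented.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class TagPipeline {

    // Hypothetical stand-in for a scraped HTML element (not a JSoup type).
    static final class Tag {
        final String name;
        final String text;

        Tag(String name, String text) {
            this.name = name;
            this.text = text;
        }
    }

    static List<Tag> sampleTags() {
        return Arrays.asList(
                new Tag("a", " Home "),
                new Tag("span", "ignored"),
                new Tag("a", "Docs"));
    }

    // Java 7 style: declare the list, loop, test, convert, add.
    static List<String> linkTextsWithLoop(List<Tag> tags) {
        List<String> result = new ArrayList<>();
        for (Tag tag : tags) {
            if ("a".equals(tag.name)) {
                result.add(tag.text.trim());
            }
        }
        return result;
    }

    // Java 8 style: the same steps as one pipeline.
    static List<String> linkTextsWithStream(List<Tag> tags) {
        return tags.stream()
                .filter(tag -> "a".equals(tag.name))   // keep only the tags of interest
                .map(tag -> tag.text.trim())           // convert each tag to the wanted object
                .collect(Collectors.toList());         // stuff the results in a list
    }

    public static void main(String[] args) {
        System.out.println(linkTextsWithLoop(sampleTags()));
        System.out.println(linkTextsWithStream(sampleTags()));
    }
}
```

Both methods return the same list; the stream version simply reads top to bottom like the Unix pipe it resembles.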

The flip side is that, because of Java's conservative approach to new features, it is a lot of syntactic sugar that hides the implementation, until the moment that you get an exception stack trace, and you wish you'd studied English literature instead of software engineering.

SeverityOne
1

On top of the other answers, I'll throw in a more cynical answer as well. For the most part it has to do with the amount of typing the programmer needs to do and how concise the code looks. A lot of languages out there support lambdas and streams already, and the people who write in these languages say stuff like "Java sucks because you gotta write all this code to just process all the items in a list. My language supports functional programming and that's why it is better than Java." Java does not need Streams or lambdas; it has been doing all right without them. But it needs to stay competitive. There are a lot of Java programmers out there and we do not like our language being dragged through the mud. And I agree that while it did not need Streams, they sure are a lot of fun to write. Streams, in the end, lead to less typing and you get your job done a lot faster. It's slick, and pretty.

There is an issue with Streams and lambdas that may require falling back to plain iteration every once in a while: a lambda can only capture local variables that are final or effectively final, that is, local variables declared outside the lambda must not be reassigned after initialisation. For the most part you may be able to get around this, but every once in a while you might just revert to iterating because it is easier. Don't worry about this; when it happens you will see it, because your code won't compile.
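Here is a minimal sketch of that restriction, with made-up names (`CapturedVariables`, `totalLengthWithLoop`, `totalLengthWithStream`). The rejected lambda is left in a comment so that the example still compiles; a common way around it is to express the accumulation as a reduction instead of mutating a captured local.

```java
import java.util.Arrays;
import java.util.List;

class CapturedVariables {

    // A plain loop may freely reassign a local variable.
    static int totalLengthWithLoop(List<String> words) {
        int total = 0;
        for (String word : words) {
            total += word.length();
        }
        return total;
    }

    static int totalLengthWithStream(List<String> words) {
        // The following would NOT compile, because 'total' is reassigned inside the
        // lambda and is therefore not effectively final:
        //
        //   int total = 0;
        //   words.forEach(word -> total += word.length());
        //
        // Expressing the accumulation as a reduction avoids the captured variable.
        return words.stream()
                .mapToInt(String::length)
                .sum();
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("a", "bb", "ccc");
        System.out.println(totalLengthWithLoop(words));
        System.out.println(totalLengthWithStream(words));
    }
}
```

When no reduction fits the problem, reverting to the loop, as the answer suggests, is a perfectly reasonable choice.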

Jose Martinez
  • 5
    I don't necessarily agree that less typing leads to faster coding. Coding, more than anything else, is a matter of *thinking*, and the speed of that is independent of how fast and how much you type. But lambda/streams code is much more readable, and *that* is a very important consideration. Because the speed of writing is largely irrelevant, but the speed and ease of *reading* is essential. – SeverityOne Feb 18 '18 at 08:58
  • I disagree, all things the same, less typing is faster coding. – Jose Martinez Feb 18 '18 at 12:54
  • 2
    I'd like to see some substantiation of that. With my 3.5 decades of programming experience, I find that typing a few characters more or less has approximately 0% influence on your total output. Unless you write 100% correct code every time, which you don't. It's often used as an argument in favour of for example Kotlin, but it's always a perception, never anything remotely supported by evidence. – SeverityOne Feb 18 '18 at 17:13
  • Well you can't argue against the fact that it takes time to type so more typing means more time taken. Simple math. You are not arguing me, you are arguing simple logic and physics. – Jose Martinez Feb 18 '18 at 17:18
  • 3
    Of course I can argue that, I've been writing software for the past 35 years. My personal experience is that you spend a lot more time thinking, designing and, yes, debugging, than you do typing. But feel free to point me to some scientific studies that support your claim. – SeverityOne Feb 18 '18 at 21:37
  • The point is that a reduction in typing makes coding faster, period. Yes coding has other activities that take up time. I also have coded for years, no need to keep mentioning it. You said "I don't necessarily agree that less typing leads to faster coding", that is what I am refuting. Less typing does make coding faster. That's why Java is coming out with features to address this. I was at Java One and it was a serious concern for the future of Java. The authors know that they need to attract newly minted programmers and reducing "ceremony", aka amount of typing, is key. – Jose Martinez Feb 18 '18 at 22:16
  • "Well you can't argue against the fact that it takes time to type so more typing means more time taken. Simple math. You are not arguing me, you are arguing simple logic and physics".. Maths/Physics/Logic ... Always depends on the expanse of the universal set, i.e. how many variables you are willing to consider. It takes me longer to type `!s£DD$%gOlH&8*` than it takes me to type `kkkkkkkkkkkkkkkkkkkkkk`, which has more characters. So even strictly speaking, more typing != more time taken :D (which is why the first preference usually is to try to find some home-row keys for key bindings/shortcuts in vim etc.) – KnowSQL May 17 '22 at 20:56