
I just took some time to start looking into the Java 8 buzz about streams and lambdas. What surprised me is that you cannot apply Stream operations such as .map() and .filter() directly on a java.util.Collection. Is there a technical reason why the java.util.Collection interface was not extended with default implementations of these Stream operations?

Googling a bit, I see lots of examples of people coding along the pattern of:

List<String> list = someListExpression;
List<String> anotherList = list.stream().map(x -> f(x)).collect(Collectors.toList());

which becomes very clumsy if you have a lot of these stream operations in your code. Since .stream() and .collect() are irrelevant to what you want to express, you would rather write:

List<String> list = someListExpression;
List<String> anotherList = list.map(x -> f(x));
John Kugelman
Rop

1 Answer


Yes, there are excellent reasons for these decisions :)

The key is the difference between eager and lazy operations. The examples you give under the first question show eager operations where mapping or filtering a list produces a new list. There's nothing wrong with this, but it is often not what you want, because you're often doing way more work than you need; an eager operation must operate on every element, and produce a new collection. If you're composing multiple operations (filter-map-reduce), you're doing a lot of extra work. On the other hand, lazy operations compose beautifully; if you do:

Optional<Person> tallestGuy = people.stream()
                                    .filter(p -> p.getGender() == MALE)
                                    .max(comparing(Person::getHeight));

the filter and reduce (max) operations are fused together into a single pass. This is very efficient.
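One way to see the single pass is to log each lambda invocation. The following is only a sketch: the class name `FusionDemo`, the integer data, and the extra `map` step are made up for illustration, and side effects in stream lambdas are used here purely to make the ordering visible.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// Hypothetical demo class, not part of any library.
class FusionDemo {
    // Runs filter -> map -> max over the input and records the order in
    // which the lambdas are invoked on a sequential stream.
    static List<String> trace(List<Integer> input) {
        List<String> log = new ArrayList<>();
        Optional<Integer> max = input.stream()
                .filter(n -> { log.add("filter " + n); return n % 2 == 0; })
                .map(n -> { log.add("map " + n); return n * 10; })
                .max(Integer::compare);
        log.add("max " + max.orElse(-1));
        return log;
    }

    public static void main(String[] args) {
        // Each element travels through the whole pipeline before the next
        // one is touched; no intermediate list is built between the stages.
        trace(Arrays.asList(1, 2, 3, 4)).forEach(System.out::println);
    }
}
```

The log interleaves the stages (filter 1, filter 2, map 2, filter 3, filter 4, map 4) rather than running all filters first and all maps afterwards, which is what an eager `list.filter(...).map(...)` would have to do.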

So, why not expose the Stream methods right on List? Well, we tried it like that. Among numerous other reasons, we found that mixing lazy methods like filter() and eager methods like removeAll() was confusing to users. By grouping the lazy methods into a separate abstraction, it becomes much clearer; the methods on List are those that mutate the list; the methods on Stream are those that deal in composable, lazy operations on data sequences regardless of where that data lives.
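The contrast between the two kinds of method can be sketched as follows. The class and method names (`ListVsStream`, `eagerRemove`, `lazyFilterCalls`) are hypothetical, and the counter exists only to show when the predicate actually runs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical demo class, not part of any library.
class ListVsStream {
    // Eager: removeAll mutates the list in place before it returns.
    static List<Integer> eagerRemove(List<Integer> numbers) {
        numbers.removeAll(Arrays.asList(1, 3, 5));
        return numbers;
    }

    // Lazy: filter only describes work. Returns how often the predicate ran
    // before the terminal operation, after it, and the size of the result.
    static int[] lazyFilterCalls(List<Integer> numbers) {
        AtomicInteger calls = new AtomicInteger();
        Stream<Integer> evens = numbers.stream()
                .filter(n -> { calls.incrementAndGet(); return n % 2 == 0; });
        int beforeTerminal = calls.get();
        List<Integer> result = evens.collect(Collectors.toList());
        int afterTerminal = calls.get();
        return new int[] { beforeTerminal, afterTerminal, result.size() };
    }

    public static void main(String[] args) {
        List<Integer> numbers = new ArrayList<>(Arrays.asList(1, 2, 3, 4, 5));
        System.out.println(Arrays.toString(lazyFilterCalls(numbers))); // [0, 5, 2]
        System.out.println(numbers);              // source untouched: [1, 2, 3, 4, 5]
        System.out.println(eagerRemove(numbers)); // mutated in place: [2, 4]
    }
}
```

If both behaviours lived on List under similar names, a reader could not tell from the call site which of the two to expect.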

So, the way you suggest it is great if you want to do really simple things, but starts to fall apart when you try to build on it. Is the extra stream() method annoying? Sure. But keeping the abstractions for data structures (which are largely about organizing data in memory) and streams (which are largely about composing aggregate behavior) separate scales better to more sophisticated operations.

To your second question: you can do this relatively easily by implementing the stream methods like this:

public <U> Stream<U> map(Function<? super T, ? extends U> mapper) { return convertToStream().map(mapper); }

But that's just swimming against the tide; better to just implement an efficient stream() method.
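As a rough sketch of what "just implement stream()" might look like, here is a hypothetical array-backed container (`Bag` is an invented name, not a JDK class). It assumes the data already sits in an array, so `Arrays.stream` can serve as a sized, splittable source; a class that only implements `Iterable` could instead use `StreamSupport.stream(spliterator(), false)`.

```java
import java.util.Arrays;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical array-backed container, used only for illustration.
final class Bag<T> {
    private final T[] items;

    @SafeVarargs
    Bag(T... items) {
        this.items = items.clone(); // defensive copy of the caller's array
    }

    // The single method worth implementing well: every Stream operation
    // (map, filter, reduce, ...) then comes for free.
    Stream<T> stream() {
        return Arrays.stream(items);
    }
}

class BagDemo {
    public static void main(String[] args) {
        Bag<String> bag = new Bag<>("a", "bb", "ccc");
        System.out.println(
                bag.stream().map(String::length).collect(Collectors.toList())); // [1, 2, 3]
    }
}
```

With this in place there is no need to mirror `map`, `filter`, and the rest on the container itself.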

Naman
Brian Goetz
  • Thanks Brian -- great clarification!! Still -- from a pragmatic viewpoint -- for all the daily run-of-the-mill small and simple dev-cases where you don't really care about either laziness or parallelism, I think it would be of considerable value to have a more compact, generally adopted way/api of applying (eager versions of) "streams operations" on an ordinary List... – Rop Jun 29 '14 at 10:43
  • Yes, we might in the future consider additional _differently named_ bulk eager operations on List. For example, we added `replaceAll` and `removeIf` in 8, and we might add more. But we would want to steer clear of the names used in `Stream`, to avoid confusion. – Brian Goetz Jun 29 '14 at 18:36
  • Would there be options to have compiler inference on the `.collect(Collectors.toList())` method at the end of the stream operations? It would help a lot. Yet it would need a very, very careful implementation, if possible at all. It might even need language support for the `=` operator and then also the ability to offer different overloads? I'm afraid it is asking too much of Java? – skiwi Jun 30 '14 at 09:11
  • @BrianGoetz Basically asking whether `List newList = oldList.stream();` would be possible, so omitting the `.collect(toList())`. – skiwi Jun 30 '14 at 14:21
  • That said, I'd put in a vote for convenience methods for common collectors: `Stream::toList`, for instance. But the real reason I'm commenting is to thank Brian for answering these and other design questions on SO! – yshavit Jul 02 '14 at 12:40
  • Difference between eager and lazy operations? How to know whether a given operation is eager or lazy? – Rafael Eyng Mar 24 '15 at 16:54
  • @RafaelEyng what I like about Java is that all info you need is basically included into sources as javadocs) – Askar Kalykov Apr 03 '15 at 05:46