Let me propose a simple rule:
A Stream that is passed as a method argument or returned as a method's return value must be the tail of an unterminated pipeline.
This is probably so obvious to those of us who have worked on streams that we never bothered to write it down. But it's probably not obvious to people approaching streams for the first time, so it's likely worth a discussion.
The main rule is covered in the Streams API package documentation: a stream can have at most one terminal operation. Once it's been terminated, it's illegal to add any intermediate or terminal operations.
The other rule is that stream pipelines must be linear; they cannot have branches. This isn't terribly clearly documented, but it is mentioned in the Stream class documentation about two-thirds of the way down. This means that it's illegal to add an intermediate or terminal operation to a stream if it isn't the last operation on the pipeline.
Most of the stream methods are either intermediate or terminal operations. If you attempt to use one of these on a stream that's terminated or that's not the last operation, you find out pretty quickly by getting an IllegalStateException. This does happen occasionally, but I think that once people get the idea that a pipeline has to be linear, they learn to avoid this issue, and the problem goes away. I think this is pretty easy for most people to grasp; it shouldn't require a paradigm shift.
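To make the two failure modes concrete, here is a minimal sketch. The class and method names are made up for illustration, and the eager exception reflects what the OpenJDK implementation does; the specification only says an implementation may throw when it detects reuse.

```java
import java.util.stream.Stream;

// Hypothetical demo class; not part of any library.
final class LinearPipelineDemo {

    // Adding an operation after a terminal operation has already run.
    static boolean reuseAfterTerminalFails() {
        Stream<String> words = Stream.of("a", "b", "c");
        words.count(); // terminal operation: this pipeline is finished
        try {
            words.map(String::toUpperCase); // illegal: stream already terminated
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    // Adding an operation to a stage that is no longer the tail (a "branch").
    static boolean branchingFails() {
        Stream<String> words = Stream.of("a", "b", "c");
        Stream<String> upper = words.map(String::toUpperCase); // 'upper' is now the tail
        try {
            words.filter(s -> !s.isEmpty()); // illegal: 'words' already has a downstream stage
            return false;
        } catch (IllegalStateException e) {
            return true;
        } finally {
            upper.close(); // nothing to release here; shown only for tidiness
        }
    }

    public static void main(String[] args) {
        System.out.println("reuse after terminal op rejected: " + reuseAfterTerminalFails());
        System.out.println("branching rejected: " + branchingFails());
    }
}
```

In both cases the reference that is still safe to use, if any, is the most recently returned stream, which is exactly the "tail" in the rule above.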
Once you understand this, it's clear that if you're going to hand a Stream instance to another piece of code -- either by passing it as an argument, or returning it to the caller -- it needs to be a stream source or the last intermediate operation in a pipeline. That is, it needs to be the tail of an unterminated pipeline.
To put in other words: it seems to me that if an API returns a stream, the general mindset should be that all interaction with it must terminate in the immediate context. It should be forbidden to pass the stream around.
I think this is too restrictive. As long as you adhere to the rule I proposed, you should be free to pass the stream around as much as you want. Indeed, there are a bunch of use cases for getting a stream from somewhere, modifying it, and passing it along. Here are a couple of examples.
1) Open a text file containing the textual representation of a POJO on each line. Call Files.lines() to get a Stream<String>. Map each line into a POJO instance, and return a Stream<POJO> to the caller. The caller might apply a filter or a sort operation and return the stream to its caller.
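A rough sketch of example 1 might look like the following. The Person class, the "name,age" line format, and all method names are assumptions made up for illustration. Each method returns the tail of an unterminated pipeline, and only the final consumer terminates it.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical demo class; not part of any library.
final class PojoFileDemo {

    // Hypothetical POJO: one instance per line, written as "name,age".
    static final class Person {
        final String name;
        final int age;

        Person(String name, int age) {
            this.name = name;
            this.age = age;
        }

        static Person parse(String line) {
            String[] parts = line.split(",");
            return new Person(parts[0].trim(), Integer.parseInt(parts[1].trim()));
        }
    }

    // Returns the tail of an unterminated pipeline. Files.lines keeps the
    // file open, so whoever terminates the pipeline should also close it.
    static Stream<Person> readPeople(Path file) throws IOException {
        return Files.lines(file).map(Person::parse);
    }

    // Appends intermediate operations and hands the new tail onward.
    static Stream<Person> adultsByName(Stream<Person> people) {
        return people.filter(p -> p.age >= 18)
                     .sorted(Comparator.comparing((Person p) -> p.name));
    }

    // Writes a small sample file, runs the pipeline, and returns the names.
    static List<String> demo() {
        try {
            Path file = Files.createTempFile("people", ".txt");
            try {
                Files.write(file, Arrays.asList("Carol,41", "Bob,12", "Alice,30"));
                // Closing the tail also closes the underlying file.
                try (Stream<Person> adults = adultsByName(readPeople(file))) {
                    return adults.map(p -> p.name).collect(Collectors.toList());
                }
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Note that neither readPeople nor adultsByName keeps a reference to an earlier stage; each one returns only the newest stream, so the caller cannot accidentally branch the pipeline.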
2) Given a Stream<POJO>, you might want to have a web interface to allow the user to provide a complex set of search criteria. (For example, consider a shopping site with lots of sorting and filtering options.) Instead of composing a big complex pipeline in code, you might have a method like the following:
Stream<POJO> applyCriteria(Stream<POJO>, SearchCriteria)
which would take a stream, apply the search criteria by appending various filters, and possibly sort or distinct operations, and return the resulting stream to the caller.
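One way such a method could be written is sketched below. The Product and SearchCriteria classes and their fields are invented for illustration; a real criteria object would likely be richer. The point is that each step replaces the local variable with the new tail, so the pipeline stays linear.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical demo class; not part of any library.
final class CriteriaDemo {

    // Hypothetical POJO for a shopping site.
    static final class Product {
        final String name;
        final String category;
        final int priceInCents;

        Product(String name, String category, int priceInCents) {
            this.name = name;
            this.category = category;
            this.priceInCents = priceInCents;
        }
    }

    // Hypothetical criteria holder; a null field means "no constraint".
    static final class SearchCriteria {
        String category;
        Integer maxPriceInCents;
        boolean sortByPrice;
    }

    // Takes the tail of an unterminated pipeline and returns a new tail.
    static Stream<Product> applyCriteria(Stream<Product> products, SearchCriteria criteria) {
        Stream<Product> tail = products;
        if (criteria.category != null) {
            tail = tail.filter(p -> p.category.equals(criteria.category));
        }
        if (criteria.maxPriceInCents != null) {
            tail = tail.filter(p -> p.priceInCents <= criteria.maxPriceInCents);
        }
        if (criteria.sortByPrice) {
            tail = tail.sorted(Comparator.comparingInt((Product p) -> p.priceInCents));
        }
        return tail; // still unterminated; the caller may append more or terminate
    }

    // Runs the criteria against a small in-memory catalog.
    static List<String> demo() {
        List<Product> catalog = Arrays.asList(
                new Product("Lamp", "home", 4000),
                new Product("Desk", "home", 12000),
                new Product("Mug", "kitchen", 800),
                new Product("Rug", "home", 2500));

        SearchCriteria criteria = new SearchCriteria();
        criteria.category = "home";
        criteria.maxPriceInCents = 5000;
        criteria.sortByPrice = true;

        return applyCriteria(catalog.stream(), criteria)
                .map(p -> p.name)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Because applyCriteria reassigns tail at every step and returns only the final value, the stream passed in is never touched again after the first operation is appended.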
From these examples, I hope you can see that there is considerable flexibility in passing streams around, as long as what you pass around is always the tail of an unterminated pipeline.