I have a sequential data source represented as a simple Iterator (or Stream). The data is large and doesn't fit in memory. The source can also be traversed only once and is expensive to fetch. It is consumed by a heavy black-box procedure that takes an Iterator (or Stream) as its argument and reads the data linearly. So far, that's simple. But what can I do if I have two different consuming procedures like that?

As I said, I don't want to load the input data into a collection such as List. I could also do the job by reading the source twice from the very beginning, but I don't like that because it is inefficient. In fact, I need to "tee" (a kind of clone) the Iterator (or Stream) so that the single source is consumed by two parallel processes without caching it in an in-memory collection. I suppose such an approach should apply back-pressure, or rather throttle a sibling that consumes the source stream too fast. An efficient solution would probably need some thread-safe queue buffer.

Does anyone know how to do this in Scala (or with any external stream library/framework)?
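To make the idea concrete, here is a rough sketch of what I imagine, not a finished solution. It uses only the standard `java.util.concurrent.ArrayBlockingQueue`; the names `Tee`, `tee`, `branches`, `capacity` and `TeeDemo` are made up for illustration. It assumes every branch is consumed to the end on its own thread, and it does not propagate exceptions thrown by the source.

```scala
import java.util.concurrent.ArrayBlockingQueue

// Hypothetical helper (not from any library): splits one single-pass Iterator
// into `branches` iterators that are fed through bounded queues.
object Tee {
  def tee[A](source: Iterator[A], branches: Int, capacity: Int): Seq[Iterator[A]] = {
    // One bounded queue per branch; None is the end-of-input marker.
    val queues = Vector.fill(branches)(new ArrayBlockingQueue[Option[A]](capacity))

    // A single pump thread reads the source once and copies each element to
    // every queue. put() blocks while a queue is full, so a fast branch can
    // get only roughly `capacity` elements ahead of the slowest one.
    val pump = new Thread(new Runnable {
      def run(): Unit =
        try source.foreach(a => queues.foreach(_.put(Some(a))))
        finally queues.foreach(_.put(None))
    })
    pump.setDaemon(true)
    pump.start()

    // Each branch is a lazy Iterator that blocks on take() until data arrives
    // and ends when it sees the None marker.
    queues.map { q =>
      Iterator.continually(q.take()).takeWhile(_.isDefined).map(_.get)
    }
  }
}

object TeeDemo {
  def main(args: Array[String]): Unit = {
    val out = Tee.tee(Iterator.range(1, 101), branches = 2, capacity = 8)
    // Each branch needs its own thread; otherwise the pump stalls on the
    // branch that nobody is reading.
    val worker = new Thread(new Runnable {
      def run(): Unit = println("max = " + out(1).max)
    })
    worker.start()
    println("sum = " + out(0).sum)
    worker.join()
  }
}
```

The weak point I can see is that if one consumer stops reading early, the pump blocks forever on its full queue and the other consumer hangs too, which is part of why I would prefer an existing library.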
PS I found a similar question from 4 years ago: "One upstream stream feeding multiple downstream streams". The difference is that I am asking how to do it with standard Scala Iterators (or Streams) or, better, with some existing library.