Using a WCF service to receive a stream of data (inbound from the client) that can be very, very large, what is the most efficient way to perform two operations on the stream "at once"? I realize the question is broad. Examples of the type of operation might include

but the key abstract point is that both operations require some kind of read operation, and the stream is not seekable (which, as I understand it, means I have to copy the stream if the operations are performed sequentially).

EDIT: This answer seems relevant too.

downwitch
  • How large is "very very large?" Too big to fit in memory? Do both operations take approximately the same amount of time? Or could one be reading far ahead of the other? – Jim Mischel Apr 25 '14 at 19:18
  • @JimMischel Hundreds of MB? Don't really know the upper limit yet, but not too big for memory. My understanding is that buffered reading of large streams is inefficient, but my understanding of these operations is pretty rudimentary, which is why I'm looking for expertise. As far as concurrency, I'm asking in part if "reading the same chunk" can be used for more than one "write"/convert operation simultaneously, so I guess that means approximately the same amount of time. – downwitch Apr 25 '14 at 19:23

1 Answer

Read a buffer at a time and pass it to the two consumers. There's nothing in your question that would prevent this simple solution from being used. That would look like this:

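// input is the incoming stream; Write1 and Write2 stand in for the two consumers.
var buffer = new byte[81920];
int bytesRead;
while ((bytesRead = input.Read(buffer, 0, buffer.Length)) > 0) {
 // Only the first bytesRead bytes of the buffer are valid on each pass.
 Write1(buffer, 0, bytesRead);
 Write2(buffer, 0, bytesRead);
}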

And a practical example.

You can also play with wrapper streams that perform a side-effect (such as hashing) and just pass on the buffer to the next stream.
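As a rough illustration of such a wrapper stream, here is a minimal sketch that hashes every byte a consumer reads through it. The class name `HashingReadStream` and its `GetHashAndReset` method are made up for this example, and it assumes a runtime that ships `IncrementalHash` (.NET Framework 4.7.1 or later, or .NET Core); treat it as one possible shape rather than a finished implementation.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

// Hypothetical wrapper: forwards reads to the inner stream and, as a side
// effect, feeds every byte that passes through into an incremental hash.
public sealed class HashingReadStream : Stream
{
    private readonly Stream _inner;
    private readonly IncrementalHash _hash;

    public HashingReadStream(Stream inner, HashAlgorithmName algorithm)
    {
        _inner = inner;
        _hash = IncrementalHash.CreateHash(algorithm);
    }

    public override int Read(byte[] buffer, int offset, int count)
    {
        int bytesRead = _inner.Read(buffer, offset, count);
        if (bytesRead > 0)
            _hash.AppendData(buffer, offset, bytesRead); // hash only the valid bytes
        return bytesRead;
    }

    // Call once, after the consumer has read to the end of the stream.
    public byte[] GetHashAndReset() => _hash.GetHashAndReset();

    // The wrapper is read-only and forward-only, like the stream it wraps.
    public override bool CanRead => true;
    public override bool CanSeek => false;
    public override bool CanWrite => false;
    public override long Length => throw new NotSupportedException();
    public override long Position
    {
        get => throw new NotSupportedException();
        set => throw new NotSupportedException();
    }
    public override void Flush() { }
    public override long Seek(long offset, SeekOrigin origin) => throw new NotSupportedException();
    public override void SetLength(long value) => throw new NotSupportedException();
    public override void Write(byte[] buffer, int offset, int count) => throw new NotSupportedException();

    protected override void Dispose(bool disposing)
    {
        if (disposing)
        {
            _hash.Dispose();
            _inner.Dispose();
        }
        base.Dispose(disposing);
    }
}
```

The second consumer then reads from the wrapper instead of the raw stream (for example with `CopyTo`), and the hash is available via `GetHashAndReset()` once that consumer has reached the end. The data is still read only once.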

It becomes more complicated if you have multiple pieces of code that each need to read from the stream (such as two independent XmlReaders). In that case you need to demultiplex. You would probably keep a buffer of data and load the next buffer only once all consumers have read the current one. This involves threading and synchronization, because multiple independent readers need to read in lock-step.
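One way to approximate this is sketched below. It is not strict lock-step: each consumer gets its own small bounded queue, so a fast reader can run at most a few chunks ahead of a slow one before the producer blocks. `QueueReadStream` and `StreamFanOut` are hypothetical names invented for this sketch, and the chunk size and queue depth are parameters you would have to tune.

```csharp
using System;
using System.Collections.Concurrent;
using System.IO;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical read-only stream fed from a bounded queue of chunks.
public sealed class QueueReadStream : Stream
{
    private readonly BlockingCollection<byte[]> _chunks;
    private byte[] _current = new byte[0];
    private int _pos;

    public QueueReadStream(BlockingCollection<byte[]> chunks) { _chunks = chunks; }

    public override int Read(byte[] buffer, int offset, int count)
    {
        while (_pos == _current.Length)
        {
            // Blocks until a chunk arrives; returns false once the producer
            // has called CompleteAdding and the queue is empty (end of stream).
            if (!_chunks.TryTake(out byte[] next, Timeout.Infinite)) return 0;
            _current = next;
            _pos = 0;
        }
        int n = Math.Min(count, _current.Length - _pos);
        Buffer.BlockCopy(_current, _pos, buffer, offset, n);
        _pos += n;
        return n;
    }

    public override bool CanRead => true;
    public override bool CanSeek => false;
    public override bool CanWrite => false;
    public override long Length => throw new NotSupportedException();
    public override long Position
    {
        get => throw new NotSupportedException();
        set => throw new NotSupportedException();
    }
    public override void Flush() { }
    public override long Seek(long offset, SeekOrigin origin) => throw new NotSupportedException();
    public override void SetLength(long value) => throw new NotSupportedException();
    public override void Write(byte[] buffer, int offset, int count) => throw new NotSupportedException();
}

public static class StreamFanOut
{
    // Reads source exactly once; every consumer sees the full byte sequence
    // through its own stream. Chunks are shared between consumers, not copied.
    public static void Run(Stream source, int chunkSize, int maxQueuedChunks,
                           params Action<Stream>[] consumers)
    {
        var queues = consumers
            .Select(c => new BlockingCollection<byte[]>(maxQueuedChunks))
            .ToArray();

        var tasks = consumers.Select((consume, i) => Task.Run(() =>
        {
            try
            {
                using (var s = new QueueReadStream(queues[i])) consume(s);
            }
            finally
            {
                // Drain whatever is left so the producer never blocks on a
                // consumer that stopped reading early or threw.
                foreach (var unused in queues[i].GetConsumingEnumerable()) { }
            }
        })).ToArray();

        try
        {
            var buffer = new byte[chunkSize];
            int n;
            while ((n = source.Read(buffer, 0, buffer.Length)) > 0)
            {
                var chunk = new byte[n];
                Buffer.BlockCopy(buffer, 0, chunk, 0, n);
                foreach (var q in queues) q.Add(chunk); // blocks while a queue is full
            }
        }
        finally
        {
            foreach (var q in queues) q.CompleteAdding(); // signals end of stream
        }
        Task.WaitAll(tasks); // rethrows consumer failures as AggregateException
    }
}
```

Each consumer is an ordinary pull-style reader, so something like `s => XmlReader.Create(s)` could be passed in unchanged. The cost is one thread per consumer and up to `maxQueuedChunks * chunkSize` bytes buffered per consumer.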

usr
  • Can you provide some specific details on "reading a buffer at a time" (do you mean something like this http://stackoverflow.com/a/221941/409856?) and what a "wrapper stream" might look like? – downwitch Apr 25 '14 at 19:36
  • I added an example. I couldn't find one for a wrapper stream. – usr Apr 25 '14 at 19:38
  • Thanks. Your follow-up is getting very close to my roadblock: I can see a simple case where the reading process "spawns" lots of write processes, but when two readers in two other methods want the same block without making copies it gets hairy fast (conceptually and practically). – downwitch Apr 25 '14 at 20:47
  • Yes, that would be the demultiplex stuff I talked about. At the moment I cannot think of a simple way of doing that. You need a way to allow multiple readers to read the same stream. That requires threading stuff. Certainly too much to write up in an answer I'm afraid. Maybe you can turn around the readers so that they can be written to instead. That would allow you to use the simple algorithm that I added to this answer. – usr Apr 25 '14 at 21:24