
Is there a thread-safe way to concurrently consume the stdout from an external process, using ProcessBuilder in Java 1.6?

Background: I need to invoke pbzip2 to unzip large files to stdout and to process each line as the file is decompressed (pbzip2 utilizes multiple CPUs, unlike other implementations).

The logical approach is to create a child thread to loop over the InputStream (i.e. stdout; don't you just love the naming?), as follows:

String line;
while ((line = reader.readLine()) != null)
{
     // do stuff with each decompressed line
}
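
For context, reader in that loop is wrapped around the process's stdout, and the whole loop runs on a child thread. Roughly like this (the pbzip2 flags and file name are placeholders, not the real invocation):

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;

public class Pbzip2Consumer {
    public static void main(String[] args) throws IOException {
        // The pbzip2 flags and file name here are placeholders
        final Process proc = new ProcessBuilder("pbzip2", "-d", "-c", "large-file.bz2").start();

        // Child thread that drains stdout line by line while decompression runs
        Thread consumer = new Thread(new Runnable() {
            public void run() {
                try {
                    BufferedReader reader =
                            new BufferedReader(new InputStreamReader(proc.getInputStream()));
                    String line;
                    while ((line = reader.readLine()) != null) {
                        // do stuff with each line
                    }
                } catch (IOException e) {
                    e.printStackTrace();
                }
            }
        });
        consumer.start();
    }
}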

However, unzipping is slow, so what I really need is for the reader.readLine method to quietly wait for the next line(s) to become available, rather than returning null and ending the loop prematurely.

Is there a good way to do this?

Rob
    The reader.readLine() does exactly what you want. Does your implementation exit too early? – akarnokd Jun 22 '09 at 16:26
  • No, the documentation I had didn't specify if it would wait or not, so it left me wondering if this was thread-safe -- for example, whether, if the timing just happened to be off, readLine would think the stream was closed. – Rob Jun 23 '09 at 13:27

3 Answers


You should be able to wrap your input stream with an InputStreamReader and BufferedReader. You can then call readLine() and that will block as required.

Note that you should have a corresponding reader for the stderr. You don't have to do anything with it, but you will need to consume the stderr stream, otherwise your spawned process may well block. See this answer for links etc.
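
A rough sketch of that arrangement, with a separate thread draining stderr (the pbzip2 command line is only an example and error handling is kept minimal):

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;

public class Pbzip2WithStderr {
    public static void main(String[] args) throws IOException, InterruptedException {
        Process proc = new ProcessBuilder("pbzip2", "-d", "-c", "large-file.bz2").start();

        // Consume stderr on its own thread so the child never blocks on a full pipe
        final InputStream err = proc.getErrorStream();
        new Thread(new Runnable() {
            public void run() {
                try {
                    byte[] buf = new byte[4096];
                    while (err.read(buf) != -1) {
                        // discard (or log) whatever the child writes to stderr
                    }
                } catch (IOException ignored) {
                }
            }
        }).start();

        // Wrap stdout; readLine() blocks until a complete line is available
        BufferedReader reader =
                new BufferedReader(new InputStreamReader(proc.getInputStream()));
        String line;
        while ((line = reader.readLine()) != null) {
            // process the line
        }

        int exitCode = proc.waitFor();
        System.out.println("pbzip2 exited with " + exitCode);
    }
}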

Brian Agnew

You more or less have the solution yourself. You just create a new thread which reads the next line in a loop from the stream of your external process and processes that line.

readLine() will block and wait until an entire new line is available. If you're on a multicore/multiprocessor machine, your external process can happily continue unzipping while your thread processes a line. At least, unzipping can continue until the OS pipes/buffers become full.

Just note that if your processing is slower than unzipping, you'll block the unzipping, and at that point it becomes a memory vs speed trade-off. E.g. you could create one thread that does nothing but read lines (so unzipping will not block), buffer them up in an in-memory queue, and have another thread - or even several - consume that queue (see the sketch below).

"readLine method to quietly wait for the next line(s) to become available, instead of exiting"

And that's exactly what readLine should do; it will just block until a whole line is available.
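
If processing does turn out to be slower than unzipping, the queue idea above could look roughly like this, assuming a bounded BlockingQueue and an arbitrary poison-pill marker to signal end of input:

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class QueuedConsumer {
    private static final String EOF = new String("EOF"); // poison pill, compared by identity

    public static void main(String[] args) throws Exception {
        Process proc = new ProcessBuilder("pbzip2", "-d", "-c", "large-file.bz2").start();

        // Bounded queue caps memory use if the workers fall behind
        final BlockingQueue<String> queue = new ArrayBlockingQueue<String>(10000);

        final BufferedReader reader =
                new BufferedReader(new InputStreamReader(proc.getInputStream()));

        // Producer: does nothing but read lines, so unzipping is never blocked by processing
        Thread producer = new Thread(new Runnable() {
            public void run() {
                try {
                    String line;
                    while ((line = reader.readLine()) != null) {
                        queue.put(line);
                    }
                    queue.put(EOF);
                } catch (IOException e) {
                    e.printStackTrace();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        });

        // Consumer: processes lines from the queue; several of these could run in parallel
        Thread consumer = new Thread(new Runnable() {
            public void run() {
                try {
                    String line;
                    while ((line = queue.take()) != EOF) {
                        // do stuff with the line
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        });

        producer.start();
        consumer.start();
        producer.join();
        consumer.join();
    }
}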

nos

Yes.

I have written some code that kicks off a time-consuming job (ffmpeg) in a Process (spawned by ProcessBuilder), and it in turn kicks off my OutputStreamReader class, an extension of Thread that consumes the output and does some magic with it.

The catch (for me) was redirecting the error stream. Here is my code snippet:

        pb.redirectErrorStream(true);
        proc = pb.start();
        err = new MyOutputStreamReader(this, proc.getInputStream());  // extension of Thread
        err.start();

        int exitCode = proc.waitFor();
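
MyOutputStreamReader itself isn't shown here, but the gist is a Thread subclass that owns the stream and loops over readLine(). A hypothetical reconstruction (the class and constructor shape are guesses, not the original code):

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;

// Hypothetical sketch of a stream-consuming thread; names and signature are illustrative
class MyOutputStreamReader extends Thread {
    private final InputStream in;

    MyOutputStreamReader(Object owner, InputStream in) {
        // owner would give access back to the spawning object; unused in this sketch
        this.in = in;
    }

    public void run() {
        try {
            BufferedReader reader = new BufferedReader(new InputStreamReader(in));
            String line;
            while ((line = reader.readLine()) != null) {
                // "do some magic" with each line, e.g. parse ffmpeg progress output
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}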
Stu Thompson