3

Sub-processes in Java are very expensive. Each process is usually supported by a number of threads:

  • a thread to host the process (as of JDK 1.6 on Linux)
  • a thread to read/print/ignore the input stream
  • another thread to read/print/ignore the error stream
  • one more thread to handle timeouts, monitoring, and killing the sub-process from your application
  • the business-logic thread, which blocks until the sub-process returns

The number of threads gets out of control if you have a pool of threads forking sub-processes to do tasks. As a result, there may be more than double the expected number of concurrent threads at peak.

In many cases, we fork a process just because nobody is able to write JNI to call native functions missing from the JDK (e.g. chmod, ln, ls), or to trigger a shell script, and so on.

Some threads can be saved, but some must run to prevent the worst case (the pipe buffer filling up because nobody drains the input stream).

How can I reduce the overhead of creating sub-processes in Java to the minimum? I am thinking of using NIO on the stream handles, combining and sharing threads, lowering the priority of background threads, and re-using processes. But I have no idea whether these are possible.

Dennis C
  • Why is this (a high number of threads) a problem? Most of these threads would be "waiting" for something, e.g. waiting for something to be written by the child process. So these threads should not consume any CPU. – Shamit Verma Mar 14 '11 at 10:47
  • A high number of threads harms efficiency. In terms of memory, every thread allocates its own stack. In terms of CPU, context switching degrades your CPU power. And every thread comes with locks and objects that programmers have to take care of. – Dennis C Jul 06 '11 at 02:08

7 Answers

4

JDK 7 will address this issue and provide the new redirectOutput/redirectError API in ProcessBuilder to redirect stdout/stderr.

However, the bad news is that they forgot to provide a "Redirect.toNull", which means you will have to do something like "if (*nix) /dev/null else if (win) NUL".

It is unbelievable that an NIO.2 API for Process is still missing, but I think redirectOutput plus NIO.2's AsynchronousChannel will help.
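As a rough illustration of the redirectOutput approach, here is a minimal sketch that sends a child's output to the platform null device, so no reader thread is needed. The class name `DiscardOutput` and its helper methods are hypothetical, and the OS check via the `os.name` property is an assumption about how you might detect Windows, not something the JDK prescribes.

```java
import java.io.File;
import java.io.IOException;

class DiscardOutput {

    // Hypothetical helper: picks the null device by OS name (assumption: "NUL" on Windows, "/dev/null" elsewhere).
    static File nullDevice() {
        boolean windows = System.getProperty("os.name").toLowerCase().startsWith("windows");
        return new File(windows ? "NUL" : "/dev/null");
    }

    // Starts the command with stderr merged into stdout and both sent to the null device.
    static Process startDiscarding(String... command) throws IOException {
        ProcessBuilder pb = new ProcessBuilder(command);
        pb.redirectErrorStream(true);
        pb.redirectOutput(ProcessBuilder.Redirect.to(nullDevice()));
        return pb.start();
    }
}
```

Because the output never reaches the JVM, the caller only has to wait for the exit code.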

Dennis C
  • 1
    Java 9 added [`ProcessBuilder.Redirect.DISCARD`](https://docs.oracle.com/en/java/javase/13/docs/api/java.base/java/lang/ProcessBuilder.Redirect.html#DISCARD). – Slaw Jan 07 '20 at 02:09
2

I have created an open source library that allows non-blocking I/O between Java and your child processes. The library provides an event-driven callback model. It depends on the JNA library to use platform-specific native APIs, such as epoll on Linux, kqueue/kevent on Mac OS X, or I/O Completion Ports on Windows.

The project is called NuProcess and can be found here:

https://github.com/brettwooldridge/NuProcess

brettw
1

To answer your title (I don't understand the description), I assume you mean the output of shell sub-processes. Check these SO questions:

platform-independent /dev/null output sink for Java

Is there a Null OutputStream in Java?

Or you can redirect stdout and stderr of the command being executed to /dev/null under Unix:

command > /dev/null 2>&1
gertas
  • 2
    It sounded great, but it actually creates more processes. ">" is not a valid command but a shell feature of bash/cmd. You would have to start a wrapper bash process to do the output redirection for your target process. – Dennis C Mar 03 '11 at 01:51
  • Right. An easier way is to create a gateway bash script which receives the command with its arguments as parameters and takes care of the output. – gertas Mar 03 '11 at 08:49
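To make the point from the comments concrete, here is a minimal sketch of what the wrapper looks like from Java: the redirection only works because `sh -c` is started to interpret it. The class name `ShellRedirect` and the method `startSilenced` are hypothetical, and the sketch assumes a Unix-like system with `sh` on the PATH.

```java
import java.io.IOException;

class ShellRedirect {

    // Runs the given command line through "sh -c" so that the shell performs the redirection.
    // Assumption: commandLine is trusted, because it is passed to the shell unescaped.
    static Process startSilenced(String commandLine) throws IOException {
        return new ProcessBuilder("sh", "-c", commandLine + " > /dev/null 2>&1").start();
    }
}
```

This saves the reader threads at the price of one extra shell process per command, which is the trade-off discussed above.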
1

NIO won't work, since when you create a process you only get access to its streams, not to a Channel.

You can have one thread read multiple InputStreams.

Something like,

import java.io.IOException;
import java.io.InputStream;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

class MultiSwallower implements Runnable {
    private final List<InputStream> streams = new CopyOnWriteArrayList<InputStream>();

    public void addStream(InputStream s) {
        streams.add(s);
    }

    public void removeStream(InputStream s) {
        streams.remove(s);
    }

    public void run() {
        byte[] buffer = new byte[1024];
        while (!Thread.currentThread().isInterrupted()) {
            boolean sleep = true;
            for (InputStream s : streams) {
                try {
                    // available() tells you how many bytes you can read without blocking
                    int n;
                    while ((n = s.available()) > 0) {
                        // do what you want with the output here
                        s.read(buffer, 0, Math.min(n, buffer.length));
                        sleep = false;
                    }
                } catch (IOException e) {
                    // the stream is closed or broken, so stop polling it
                    streams.remove(s);
                }
            }
            if (sleep) {
                try {
                    // nothing is available right now, so back off briefly
                    Thread.sleep(50);
                } catch (InterruptedException e) {
                    return;
                }
            }
        }
    }
}

You can pair the above class with another class that waits for the Processes to complete, something like,

import java.io.InputStream;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

class ProcessWatcher implements Runnable {

    private final MultiSwallower swallower = new MultiSwallower();

    private final ConcurrentMap<Process, InputStream> processes = new ConcurrentHashMap<Process, InputStream>();

    public ProcessWatcher() {
    }

    public void startThreads() {
        new Thread(this).start();
        new Thread(swallower).start();
    }

    public void addProcess(Process p) {
        swallower.addStream(p.getInputStream());
        processes.put(p, p.getInputStream());
    }

    @Override
    public void run() {
        while (true) {
            for (Process p : processes.keySet()) {
                try {
                    // throws IllegalThreadStateException if the process has not completed
                    p.exitValue();
                    InputStream s = processes.remove(p);
                    swallower.removeStream(s);
                } catch (IllegalThreadStateException e) {
                    // process not completed, ignore
                }
            }
            try {
                // wait before checking again
                Thread.sleep(50);
            } catch (InterruptedException e) {
                return;
            }
        }
    }
}

Also, you don't need one thread per error stream if you use ProcessBuilder.redirectErrorStream(true), and you don't need a thread for the process's standard input: you can simply ignore it if you are not writing anything to the process.

sbridges
  • I've thought of it before. But it has a big bug: it won't close the streams or release them for GC. It may need extra use of SoftReference or finally blocks to clean up references. That is a little too much for a small utility in an app. – Dennis C Mar 13 '11 at 13:36
  • Let me think... what I would have to improve in your code: 1. Weak/SoftReference; 2. a concurrent lock to replace the Thread.sleep; 3. use AOP or patterns to capture waitFor() and destroy(); 4. make sure it is fast; 5. submit it to Apache Commons IO, Google-IO, and OpenJDK and let them do the maintenance. – Dennis C Mar 13 '11 at 13:42
  • The StreamSwallower class won't close the streams by itself, I added a ProcessWatcher class to the answer to show how you would manage the processes/streams. I don't think Weak/Soft references are needed. With the code above, you have 2 threads managing many processes. – sbridges Mar 13 '11 at 15:24
1

You don't need any extra threads to run a sub-process in Java, although handling timeouts does complicate things a bit:

import java.io.IOException;
import java.io.InputStream;

public class ProcessTest {

  public static void main(String[] args) throws IOException {
    long timeout = 10 * 1000L; // timeout in milliseconds (10 seconds)

    ProcessBuilder builder = new ProcessBuilder("cmd", "/c", "a.cmd"); // /c runs the script and then exits
    builder.redirectErrorStream(true); // so we can ignore the error stream
    Process process = builder.start();
    InputStream out = process.getInputStream();

    long endTime = System.currentTimeMillis() + timeout;

    while (isAlive(process) && System.currentTimeMillis() < endTime) {
      int n = out.available();
      if (n > 0) {
        // out.skip(n);
        byte[] b = new byte[n];
        out.read(b, 0, n);
        System.out.println(new String(b, 0, n));
      }

      try {
        Thread.sleep(10);
      }
      catch (InterruptedException e) {
      }
    }

    if (isAlive(process)) {
      process.destroy();
      System.out.println("timeout");
    }
    else {
      System.out.println(process.exitValue());
    }
  }

  public static boolean isAlive(Process p) {
    try {
      p.exitValue();
      return false;
    }
    catch (IllegalThreadStateException e) {
      return true;
    }
  }
}

You could also play with reflection, as in "Is it possible to read from a InputStream with a timeout?", to get an NIO FileChannel from Process.getInputStream(), but then you'd have to worry about different JDK versions in exchange for getting rid of the polling.
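On newer JDKs the timeout part can be written without a hand-rolled polling loop. The following is a minimal sketch, assuming Java 8 or later (for `Process.waitFor(long, TimeUnit)` and `destroyForcibly()`), and assuming it is acceptable for the child to write directly to the parent's stdout/stderr via `inheritIO()`. The class name `TimedRun` and the method `runWithTimeout` are hypothetical.

```java
import java.io.IOException;
import java.util.OptionalInt;
import java.util.concurrent.TimeUnit;

class TimedRun {

    // Returns the exit code, or an empty OptionalInt if the process was killed after the timeout.
    static OptionalInt runWithTimeout(long timeout, TimeUnit unit, String... command)
            throws IOException, InterruptedException {
        ProcessBuilder pb = new ProcessBuilder(command);
        // The child shares the parent's stdin/stdout/stderr, so there is nothing to drain in Java.
        pb.inheritIO();
        Process p = pb.start();
        if (p.waitFor(timeout, unit)) {
            return OptionalInt.of(p.exitValue());
        }
        // Timed out: kill the child and wait until it is really gone.
        p.destroyForcibly();
        p.waitFor();
        return OptionalInt.empty();
    }
}
```

The calling thread is the only application thread involved; whether the JDK itself uses an internal helper thread per process depends on the platform and JDK version.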

Dan Berindei
0

Since you mention chmod, ln, ls, and shell scripts, it sounds like you're trying to use Java for shell programming. If so, you might want to consider a different language that is better suited to that task, such as Python, Perl, or Bash. Although it's certainly possible to create sub-processes in Java, interact with them via their standard input/output/error streams, etc., I think you will find a scripting language makes this kind of code less verbose and easier to maintain than Java.

Rob H
  • There are also implementations of these scripting languages for the JVM, which may ease integration with existing Java code. – gertas Mar 02 '11 at 14:47
  • No, I am not. They are called as the application needs them, not for shell programming. – Dennis C Mar 03 '11 at 01:44
0

Have you considered using a single long-running helper process written in another language (maybe a shell script?) that will consume commands from Java via stdin and perform file operations in response?
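One possible shape for such a helper, as a minimal sketch: a single `sh` process is kept alive, and each command is followed by an `echo` of a unique marker line so the Java side knows where the reply ends. The class name `ShellHelper`, the marker protocol, and the use of `sh` are all assumptions for illustration; the sketch also assumes every command's output ends with a newline and that commands are trusted.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.Closeable;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class ShellHelper implements Closeable {
    private final Process shell;
    private final BufferedWriter stdin;
    private final BufferedReader stdout;

    ShellHelper() throws IOException {
        ProcessBuilder pb = new ProcessBuilder("sh");
        pb.redirectErrorStream(true); // merge stderr so one reader sees everything
        shell = pb.start();
        stdin = new BufferedWriter(new OutputStreamWriter(shell.getOutputStream(), StandardCharsets.UTF_8));
        stdout = new BufferedReader(new InputStreamReader(shell.getInputStream(), StandardCharsets.UTF_8));
    }

    // Sends one command line to the shell and collects its output lines up to the marker.
    synchronized List<String> run(String commandLine) throws IOException {
        String marker = "__END_" + System.nanoTime() + "__";
        stdin.write(commandLine + "\n");
        stdin.write("echo " + marker + "\n");
        stdin.flush();
        List<String> lines = new ArrayList<String>();
        String line;
        while ((line = stdout.readLine()) != null && !line.equals(marker)) {
            lines.add(line);
        }
        return lines;
    }

    @Override
    public void close() throws IOException {
        try {
            stdin.write("exit\n");
            stdin.flush();
        } finally {
            shell.destroy();
        }
    }
}
```

With this arrangement, a call such as `helper.run("chmod 644 some-file")` reuses the same child process instead of forking a new one per operation.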

ykaganovich