I'm using proc_open in PHP to call a Java application, pass it text to process, and read the output text. The Java execution time is quite long, and I found that reading the input takes most of it. I'm not sure whether the fault lies with PHP or with Java.

My PHP code:

$process_cmd = "java -Dfile.encoding=UTF-8 -jar test.jar";

$env = NULL;

$options = ["bypass_shell" => true];
$cwd = NULL;
$descriptorspec = [
    0 => ["pipe", "r"],     //stdin is a pipe that the child will read from
    1 => ["pipe", "w"],     //stdout is a pipe that the child will write to
    2 => ["file", "java.error", "a"]
];

$process = proc_open($process_cmd, $descriptorspec, $pipes, $cwd, $env, $options);

if (is_resource($process)) {

    //feeding text to java
    fwrite($pipes[0], $input);
    fclose($pipes[0]);

    //reading output text from java
    $output = stream_get_contents($pipes[1]);
    fclose($pipes[1]);

    $return_value = proc_close($process);

}

My java code:

public static void main(String[] args) throws Exception {

    long start;
    long end;

    start = System.currentTimeMillis();

    BufferedReader br = new BufferedReader(new InputStreamReader(System.in));
    String in;
    String input = "";
    br = new BufferedReader(new InputStreamReader(System.in));
    while ((in = br.readLine()) != null) {
        input += in + "\n";
    }

    end = System.currentTimeMillis();
    log("Input: " + Long.toString(end - start) + " ms");


    start = System.currentTimeMillis();

    org.jsoup.nodes.Document doc = Jsoup.parse(input);

    end = System.currentTimeMillis();
    log("Parser: " + Long.toString(end - start) + " ms");


    start = System.currentTimeMillis();

    System.out.print(doc);

    end = System.currentTimeMillis();
    log("Output: " + Long.toString(end - start) + " ms");

}

I'm passing Java an HTML file of 3800 lines (~200 KB as a standalone file). These are the execution times from the log file, broken down by stage:

Input: 1169 ms
Parser: 98 ms
Output: 12 ms

My question is this: why does input take 100 times longer than output? Is there a way to make it faster?

Caballero
  • You have **twice**: `new BufferedReader(new InputStreamReader(System.in))` - that could be painful. Of course `String +=` instead of `StringBuilder` slows it down. – Joop Eggen Aug 06 '13 at 09:34

1 Answer

Inspect the read loop in your Java program. Use a StringBuilder to concatenate the data instead of using += on a String:

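String in;
StringBuilder input = new StringBuilder();
BufferedReader br = new BufferedReader(new InputStreamReader(System.in));
while ((in = br.readLine()) != null) {
    // append() extends the buffer in place; += copies the whole string on every line
    input.append(in).append('\n');
}
// input is now a StringBuilder, so pass input.toString() to Jsoup.parse()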

Details are covered here: Why using StringBuilder explicitly
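
To illustrate: every `input += line` allocates a new String and copies everything read so far, so the total copying grows roughly quadratically with the input size, while `StringBuilder.append` only extends a buffer. Here is a minimal, self-contained sketch of the read step. The class name `StdinReader` and the method `readAll` are made up for this example and are not part of the original program; it also decodes stdin as UTF-8 explicitly, which is an assumption based on the `-Dfile.encoding=UTF-8` flag in the question.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
class StdinReader {

    // Reads all lines from the reader and joins them with "\n".
    static String readAll(Reader source) {
        StringBuilder sb = new StringBuilder();
        try {
            BufferedReader br = new BufferedReader(source);
            String line;
            while ((line = br.readLine()) != null) {
                // Appends to the existing buffer instead of creating a new String.
                sb.append(line).append('\n');
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        long start = System.currentTimeMillis();
        String input = readAll(new InputStreamReader(System.in, StandardCharsets.UTF_8));
        long end = System.currentTimeMillis();
        // Log to stderr so stdout stays free for the real output.
        System.err.println("Input: " + (end - start) + " ms for " + input.length() + " chars");
    }
}
```

Note that `readLine()` strips the line terminator, so Windows `\r\n` endings come out as `\n` with this loop, just as in the original code.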


Generally speaking, to make it faster, consider using an application server (or a simple socket-based server) so that the JVM runs permanently. There is always some overhead when you start a JVM, and on top of that the JIT needs some time to optimize your code. This effort is lost when the JVM exits.
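
A rough sketch of what such a socket-based server could look like, under several assumptions: the port (9000), the class name `ProcessingServer`, and the placeholder `process` method are all invented for this example, and `readAllBytes()` requires Java 9 or later. It handles one connection at a time and expects the client to close its write side after sending, so that the server sees end of input.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Hypothetical long-running server, for illustration only.
class ProcessingServer {

    // Placeholder for the real work (for example, parsing the HTML).
    static String process(String input) {
        return input.toUpperCase();
    }

    // Handles one connection: read until the client half-closes, process, reply.
    static void handle(Socket client) throws IOException {
        try (Socket s = client) {
            InputStream in = s.getInputStream();
            byte[] request = in.readAllBytes(); // Java 9+; blocks until end of stream
            String result = process(new String(request, StandardCharsets.UTF_8));
            OutputStream out = s.getOutputStream();
            out.write(result.getBytes(StandardCharsets.UTF_8));
            out.flush();
        }
    }

    public static void main(String[] args) throws IOException {
        int port = 9000; // assumed port, pick any free one
        // Bind to loopback only, so the service is not reachable from other hosts.
        try (ServerSocket server = new ServerSocket(port, 50, InetAddress.getLoopbackAddress())) {
            while (true) {
                handle(server.accept()); // sequential; one client at a time
            }
        }
    }
}
```

On the PHP side, the client would connect with something like `stream_socket_client`, write the text, call `stream_socket_shutdown($fp, STREAM_SHUT_WR)` to signal end of input, and then read the reply. The JVM and its JIT-compiled code then stay warm between requests.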

As for the PHP program: try feeding the Java program from the shell instead, using cat to pipe the data (on a UNIX system like Linux). Alternatively, rewrite your Java program so that it can also take the input file as a command line parameter. Then you can judge whether your PHP code pipes the data fast enough.
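
As a sketch of the second suggestion: the program below reads from a file if a path is given as the first argument and falls back to stdin otherwise, so the same timing can be compared across the three ways of feeding it. The class name `InputSource`, the method `readInput`, and the file name `page.html` used below are made up for this example; `System.in.readAllBytes()` requires Java 9 or later.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper class, for illustration only.
class InputSource {

    // Returns the file contents if a path argument was given, otherwise reads stdin.
    static String readInput(String[] args) {
        try {
            byte[] bytes = args.length > 0
                    ? Files.readAllBytes(Paths.get(args[0]))
                    : System.in.readAllBytes(); // Java 9+
            return new String(bytes, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        String input = readInput(args);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        // Log to stderr so stdout stays free for the real output.
        System.err.println("Input: " + elapsedMs + " ms, " + input.length() + " chars");
    }
}
```

Compiled and run as a class, `cat page.html | java InputSource` (or `type page.html | java InputSource` on Windows) measures the pipe without PHP, and `java InputSource page.html` measures a plain file read. If both are fast and only the run via proc_open is slow, the PHP side is the place to look.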

As for the Java program: If you do performance analysis, consider the recommendations in How do I write a correct micro-benchmark in Java

Beryllium
  • Thanks for your answer, but the execution times measured are inside application - after JVM is running already, so this has nothing to do with my question. I'm not quite sure what do you mean by "just use cat to pipe the data". I'm developing on Windows, but the production server is Linux. I specifically used pipes to cut out the overhead of writing and reading files - that was an answer to my previous question - what's the fastest way of communication between php and java. – Caballero Aug 06 '13 at 09:12
  • The measurement is done, after the JVM has been started, but the JIT still might play a role. While I setup a test case, I've stumbled on the fact, that there is no `StringBuilder`, so that should solve it on the Java side. – Beryllium Aug 06 '13 at 09:31
  • Thanks, StringBuilder method cut input time to 5 ms. This is it. – Caballero Aug 06 '13 at 09:44