While poking around the JDK 1.7 source I noticed these methods in Boolean.java:

public static Boolean valueOf(String s) {
    return toBoolean(s) ? TRUE : FALSE;
}

private static boolean toBoolean(String name) {
    return ((name != null) && name.equalsIgnoreCase("true"));
}

So valueOf() internally calls toBoolean(), which is fine. What I found interesting was how toBoolean() is implemented, namely:

  1. equalsIgnoreCase() is called the opposite way round from what I would normally do (I would put the string literal first), and
  2. there is a null check first. This seems redundant if point 1 were adopted, since one of the first checks inside equalsIgnoreCase() is already a null check on its argument.

So I thought I would put together a quick test and check how my implementation would work compared with the JDK one. Here it is:

import org.junit.Test;

public class BooleanTest {
    private final String[] booleans = {"false", "true", "null"};

    @Test
    public void testJdkToBoolean() {

        long start = System.currentTimeMillis();

        for (int i = 0; i < 1000000; i++) {
            for (String aBoolean : booleans) {
                Boolean someBoolean = Boolean.valueOf(aBoolean);
            }
        }

        long end = System.currentTimeMillis();

        System.out.println("JDK Boolean Runtime is: " + (end-start));
    }

    @Test
    public void testModifiedToBoolean() {
        long start = System.currentTimeMillis();

        for (int i = 0; i < 1000000; i++) {
            for (String aBoolean : booleans) {
                Boolean someBoolean = ModifiedBoolean.valueOf(aBoolean);
            }
        }

        long end = System.currentTimeMillis();

        System.out.println("ModifiedBoolean Runtime is: " + (end-start));
    }
}

class ModifiedBoolean {
    public static Boolean valueOf(String s) {
        return toBoolean(s) ? Boolean.TRUE : Boolean.FALSE;
    }

    private static boolean toBoolean(String name) {
        return "true".equalsIgnoreCase(name);
    }
}

Here is the result:

Running com.app.BooleanTest
JDK Boolean Runtime is: 37
ModifiedBoolean Runtime is: 34
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.128 sec

So not much of a gain, especially when spread over 1 million iterations. Not really all that surprising.

What I would like to understand is how these differ at the bytecode level. I am interested in delving into this area but don't have any experience. Is this more work than it is worth? Would it provide a useful learning experience? Is this something people do on a regular basis?

acanby
  • From what I've seen, the convention is to do `stringVariable.equals("string literal")`, not the other way around. – yts Dec 23 '14 at 00:46
  • Microbenchmarks are notoriously unreliable. What is yours supposed to be showing? – Elliott Frisch Dec 23 '14 at 00:49
  • Seems like you should have a `null` literal, not a `"null"` string, in your `booleans` array. Additionally, these sorts of micro-benchmarks are very susceptible to the behavior of the JVM/JIT; it's difficult to draw conclusions from code like this. – dimo414 Dec 23 '14 at 00:50
  • Speaking of learning: 1) check out [JMH](http://openjdk.java.net/projects/code-tools/jmh/) and the talks about benchmarking from [here](http://shipilev.net/) 2) Read the bytecode with `javap` 3) Read the assembly code generated by the [JIT at runtime](http://stackoverflow.com/questions/1503479/how-to-see-jit-compiled-code-in-jvm). Learning these things may help you understand some of the inner workings of the JVM. Most people have no idea about all this, never do it and don't miss it. – Andrey Breslav Dec 23 '14 at 03:22
  • Bytecode has little direct correlation with performance since the JVM optimizes hot code anyway. – Antimony Dec 23 '14 at 06:02
  • I don't think this exercise is very helpful. If you want speed / real time, use assembler/C/whatever. Usually, in an enterprise environment, application code is the fastest part. You would worry about things like "Why does the SSL handshake take forever?", "I got packet loss!", "Oracle picks the wrong execution plan!", "Why is this user's time to first byte twice as long?" or "What idiot changed the routing table so my packets end up in the other data center??" hehe – slowy Dec 23 '14 at 09:26

2 Answers

There would be no performance gain, for several reasons:

  1. Checking whether name == null is simply not an expensive operation.
  2. What takes time is loading the value of name, which has to be loaded in either case.
  3. name == null is faster than calling String.equalsIgnoreCase, since it is a simple reference comparison rather than a method call.
  4. None of this matters much anyway, because the processor will likely use branch prediction: if most of your calls are not for null strings, it will start speculatively executing the instructions for the non-null branch.
Jared

First, bytecode is very close to Java source. It can't tell you much more about performance than the source does, except in some special cases (e.g. compile-time expression evaluation). The JIT compilation done by the JVM matters much more.

Some background: in early Java versions, bytecode was little more than a machine-readable version of the source code. Decompiling bytecode from such early versions is rather straightforward: you lose the comments, and the code comes out slightly different. The hardest job for such a decompiler is probably reconstructing the loops. For today's Java versions, decompilers have to be somewhat more complex, because the language has changed (inner classes, generics, …) more than the bytecode has. But the bytecode is still very close to the source, even today.
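If you want to look at the bytecode yourself, a minimal sketch could look like the following. The class name ToBooleanVariants and the method names are hypothetical (they are not from the JDK); the two method bodies are copied from the question. After compiling with javac, running `javap -c ToBooleanVariants` prints the bytecode of both methods so you can compare them side by side.

```java
// Hypothetical class for illustration; the method bodies mirror the two variants in the question.
final class ToBooleanVariants {

    // JDK-style variant: explicit null check, then equalsIgnoreCase() called on the variable.
    static boolean nullCheckFirst(String name) {
        return (name != null) && name.equalsIgnoreCase("true");
    }

    // Literal-first variant: relies on equalsIgnoreCase(null) returning false.
    static boolean literalFirst(String name) {
        return "true".equalsIgnoreCase(name);
    }

    public static void main(String[] args) {
        // Includes both the string "null" and a real null reference.
        String[] inputs = {"true", "TRUE", "false", "null", null};
        for (String input : inputs) {
            System.out.println(input + " -> " + nullCheckFirst(input) + " / " + literalFirst(input));
        }
    }
}
```

Both variants should agree on every input, including a real null, so whatever differences show up in the javap output are about how the result is computed, not what it is.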

Second, the redundant null check might not matter. The JVM is able to remove some unneeded checks, even automatically generated array bounds checks when it can prove they are unnecessary.

Third, benchmarks are very tricky, and even more so on the JVM. The JVM "warms up", so the second benchmark might benefit from optimizations done during the first one. In some cases the opposite can also happen: an optimistic optimization has to be discarded, and the second benchmark ends up slower. Moreover, running the code only once introduces a large error into the results.
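To illustrate the warm-up and repetition points, here is a rough hand-rolled sketch. The class name WarmupBenchmarkSketch, the pass() helper, and the round counts are all assumptions made for illustration. It runs untimed warm-up rounds first, then several timed rounds that alternate between the two variants, and it accumulates the results in a sink so the calls are harder for the JIT to discard as dead code. It is still far less trustworthy than a proper harness such as JMH.

```java
// Hypothetical sketch, not a rigorous benchmark; prefer JMH for real measurements.
final class WarmupBenchmarkSketch {

    // Uses a real null reference rather than the string "null".
    private static final String[] INPUTS = {"false", "true", null};

    static boolean nullCheckFirst(String name) {
        return (name != null) && name.equalsIgnoreCase("true");
    }

    static boolean literalFirst(String name) {
        return "true".equalsIgnoreCase(name);
    }

    // Runs one pass over the inputs and returns how many calls evaluated to true,
    // so the result of every call feeds into a value the caller actually uses.
    static int pass(boolean useLiteralFirst, int iterations) {
        int trues = 0;
        for (int i = 0; i < iterations; i++) {
            for (String input : INPUTS) {
                boolean result = useLiteralFirst ? literalFirst(input) : nullCheckFirst(input);
                if (result) {
                    trues++;
                }
            }
        }
        return trues;
    }

    public static void main(String[] args) {
        int iterations = 1000000;
        long sink = 0;

        // Untimed warm-up rounds give the JIT a chance to compile both variants.
        for (int round = 0; round < 5; round++) {
            sink += pass(false, iterations);
            sink += pass(true, iterations);
        }

        // Several timed rounds, alternating variants, instead of a single run of each.
        for (int round = 0; round < 5; round++) {
            long t0 = System.nanoTime();
            sink += pass(false, iterations);
            long t1 = System.nanoTime();
            sink += pass(true, iterations);
            long t2 = System.nanoTime();
            System.out.println("round " + round
                    + ": nullCheckFirst " + (t1 - t0) / 1000 + " us"
                    + ", literalFirst " + (t2 - t1) / 1000 + " us");
        }

        // Printing the sink keeps the accumulated results observable.
        System.out.println("sink = " + sink);
    }
}
```

If the per-round numbers vary as much between rounds as they do between the two variants, that is a sign the difference is within the noise.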

v6ak