
SPOILER - The question contains solution to an online judge problem.

I solved this question on LeetCode.

My solution got accepted and was better than 92% of the submissions. In order to optimize my solution I made one particular change. The modified solution can be found here. The only change I made is the following:

Instead of calling `s.length()` every time (19 occurrences, specifically), I created a new variable `int len = s.length()` and used `len` instead.
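To show the kind of change I mean (this is not my actual submission; the method and input are made up purely for illustration):

```java
class LengthCaching {
    // Original style: s.length() is called on every loop iteration.
    static int countVowels(String s) {
        int count = 0;
        for (int i = 0; i < s.length(); i++) {
            if ("aeiou".indexOf(s.charAt(i)) >= 0) count++;
        }
        return count;
    }

    // Modified style: the length is cached in a local variable once.
    static int countVowelsCached(String s) {
        int len = s.length();
        int count = 0;
        for (int i = 0; i < len; i++) {
            if ("aeiou".indexOf(s.charAt(i)) >= 0) count++;
        }
        return count;
    }
}
```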

I was expecting the performance to improve, but the new solution was better than only 69% of the submissions, a drop of 23 percentage points. Moreover, while all the test cases passed in 1 ms in the first case, they took 2 ms in the second; the difference, though small, was something I had not expected at all. What could be the reason behind this?

alphacentauri
    `s.length()` is known to be constant; `int len` isn't necessarily constant, so it might have to keep re-fetching the value. – Andy Turner Dec 22 '16 at 15:02
    You can use `final int len` instead, this tells the compiler your value is a constant and it can optimize further – Christophe Roussy Dec 22 '16 at 15:03
    Please don't link to external sites for source code. Better to include the relevant parts of your code **within** your question. – GhostCat Dec 22 '16 at 15:04
    All code you want to ask about should be included *in the text* of your question. References to resources hosted elsewhere are not acceptable for conveying the essential elements of your question. – John Bollinger Dec 22 '16 at 15:05
    How often did you try the improved version and how much difference in terms of cpu time is that? The difference between 92% and 69% might just be random CPU spikes on the test system. – zapl Dec 22 '16 at 15:08
  • @ChristopheRoussy I was hoping it would have some impact but it didn't. – alphacentauri Dec 22 '16 at 15:14
  • I have just compared both versions: I ran them repeatedly and compared the assembler code that was generated by the JIT, and except for a few (of several hundred) instructions, the resulting code is identical in both cases. So what you observed is indeed only an "artifact" of the time measurement that is done by the site that the code was submitted to. – Marco13 Dec 22 '16 at 16:03
  • You could try to find out what's happening behind the scenes using JitWatch ([github](https://github.com/AdoptOpenJDK/jitwatch)). Here's an [article](http://www.oracle.com/technetwork/articles/java/architect-evans-pt1-2266278.html) that provides an introduction to JIT (using above mentioned tool). –  Dec 22 '16 at 17:08

2 Answers


The generic answer to your generic question is: it depends.

You see, the real optimization isn't done by the Java compiler. The Java-to-bytecode compiler (`javac`) applies only a very limited set of known optimization techniques; for example, constant folding (turning `5*3` into `15`).
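You can observe constant folding directly in the bytecode (the class and method here are just an example):

```java
class Fold {
    // javac folds the constant expression at compile time:
    // javap -c shows "bipush 15" pushed directly, with no imul instruction.
    static int folded() {
        return 5 * 3;
    }
}
```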

The real "bang for the buck" happens at runtime, in the Just-in-Time compiler. And you have to understand that probably 15+ years of intensive research went into that technology. So there is simply no way of telling you, in one simple, comprehensive answer, what the JIT is capable of, or what exactly it is doing. In essence, it observes what your code is doing; and where it makes sense, it translates bytecode into machine code. And sometimes it will even redo its translations, when it discovers that things have changed.

See here or there for some glimpses of what I am talking about.

And of course, as some comments point out: you have no insight at all regarding how that "judging system" is doing its benchmarking. And you know, doing benchmarking with Java is hard. So, if you are really curious about your code: step back, create your own measurement suite (based on those best practices from my link above) and measure your own data points.
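As a very rough sketch of such a measurement suite (JMH is the better tool, but this shows the central idea of warming up before timing; the workload and names are purely illustrative):

```java
public class Harness {
    // An illustrative workload; the loop condition calls s.length() each time.
    static int work(String s) {
        int sum = 0;
        for (int i = 0; i < s.length(); i++) sum += s.charAt(i);
        return sum;
    }

    public static void main(String[] args) {
        String s = "some representative input";
        int sink = 0;
        // Warmup: give the JIT a chance to compile work() before measuring.
        for (int i = 0; i < 100_000; i++) sink += work(s);
        // Measure only after the warmup phase.
        long start = System.nanoTime();
        for (int i = 0; i < 100_000; i++) sink += work(s);
        long elapsed = System.nanoTime() - start;
        // Print sink so the JIT cannot eliminate the work as dead code.
        System.out.println("avg ns/call: " + elapsed / 100_000.0 + " (sink=" + sink + ")");
    }
}
```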

GhostCat
    The key point, roughly related to this answer, may be: You don't know how this site is measuring the time. They are *likely* executing the code *once*, and this doesn't tell you anything. If they ran the code repeatedly, with different inputs, and allowed the JIT to kick in, and took the time measurement after such a warmup phase, then the results would most likely be vastly different. – Marco13 Dec 22 '16 at 15:11
  • I was adding another paragraph on that topic just before reading your comment ;-) – GhostCat Dec 22 '16 at 15:13

If you get into reading Java bytecode, the command `javap -p -s -v -c -constants Something.class` will come in quite handy.

Typically, in the `s.length()` scenario, you make a call into a different class, creating a new stack frame that is used to evaluate that call.

In the `int x = s.length()` scenario, you make that same call, but add an instruction to store the resulting integer in one of your own frame's local-variable slots.
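As a sketch of the difference (the class is illustrative; exact slot numbers and offsets in real `javap -c` output vary):

```java
class Compare {
    // Two calls: javap -c shows aload_0 / invokevirtual String.length twice.
    static int repeated(String s) {
        return s.length() + s.length();
    }

    // One call plus a store: aload_0 / invokevirtual / istore_1, then
    // iload_1 / iload_1 / iadd -- the value is re-read from the local slot.
    static int cached(String s) {
        int len = s.length();
        return len + len;
    }
}
```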

This means that reusing the stored value becomes faster or slower based on a number of confounding factors.

  1. If you were going to call `s.length()` many times anyway, you would hit the threshold for HotSpot's call optimization sooner, leading to even faster execution once `s.length()` was compiled or inlined into your call stack.
  2. If `s.length()` were ridiculously slow or complex, or you were only going to call it a few times, then caching the value might provide the faster execution, because HotSpot might refuse to optimize the call.

These are of course, rules of thumb I've developed after reversing / assembling / reading code and like all good rules of thumb, are nearly useless in a specific scenario. If you have a specific scenario, benchmark. There are many "ifs" in both observations #1 and #2, and most of the time I don't realize which ones are in play without benchmarking.

With that in mind, in general, and especially prior to any HotSpot-triggered optimization, the fewer bytecode operations a method contains, the faster it generally runs. Also, stack frames are (pre-optimization) more expensive than you might be inclined to think, CPU-wise, but HotSpot does a great job of mitigating that cost for the most heavily used invocations.

Edwin Buck