29

I just made this simple "program":

public static void main(String[] args) {
        int i = 1;
        int k = 0;
        while (true) {
            if(++i==0) System.out.println("loop: " + ++k);
        }
    }

Upon running this program, I immediately get the output:

(...)
loop: 881452
loop: 881453
loop: 881454
loop: 881455
loop: 881456
loop: 881457
loop: 881458
(...)

as if i would be always 0.

And in fact, when I debug in Eclipse, upon suspending the program, i would be always zero. When stepping through the loop, i would increment, but upon resuming and suspending the debugger, i is 0 again.

When I change i to long, upon running the program I need to wait quite a while before seeing the first loop: 1. In the debugger, upon pausing the program, i does increment: it's not 0, so it works as it should.

What's the problem with ++i as an int?

Peter Mortensen
  • 30,738
  • 21
  • 105
  • 131
SlumpA
  • 882
  • 12
  • 24
  • 11
    Is `i` overflowing over and over again? – Chris Sprague Dec 14 '15 at 20:51
  • I know that it is overflowing. I'm just puzzled that it's overflowing THAT fast. And I find it strange that each time I suspend the debugger, i is 0. – SlumpA Dec 14 '15 at 20:59
  • @SlumpA See if my update addresses your surprise. – erickson Dec 14 '15 at 21:06
  • 4
    There's no way for an `int` to overflow hundreds of thousands of times per second. Very probably the compiler is optimizing out the whole conditional branch and just running `println` as fast as it can. – JohannesD Dec 15 '15 at 00:07
  • 1
    @JohannesD Your guess was right, see [my answer below](http://stackoverflow.com/a/34287776/3182664) : The JIT indeed seems to optimize away the increment of `i` in this case. – Marco13 Dec 15 '15 at 11:16
  • @SlumpA You mentioned that, when `i` is a `long` value, you *"need to wait quite a while"* - what is "quite a while" here? I've started such a program, and it's running for maybe 10 minutes now. Even in the best case, causing an overflow of a `long` value should take *a few hundred years*.... – Marco13 Dec 15 '15 at 12:05
  • @Marco13 Yeah, sry... I said quite a while, because nothing happened after I waited for 30 secs before stopping the program and checking the value of i with the debugger. I guess what I wanted to say is that there is a notable difference between long and int. Sry for misleading you. On the other hand, I find it strange that that loop is optimised for ++int, but it isn't optimised for ++long... – SlumpA Dec 15 '15 at 16:57
  • Yes, it's a bit odd. One could guess: "The loop has to be executed at least once before optimization kicks in", but this does not seem to be the case: The `int` version also seems to be optimized immediately by the JIT. However, the JIT is a tremendously complex beast, and it's hard to argue about it without being *really* deeply involved. You might want to have a look at [this comment in a (very) remotely related question](http://stackoverflow.com/questions/25326377/jit-not-optimizing-loop-that-involves-integer-max-value#comment39509265_25326461) and the OpenJDK code that is linked there – Marco13 Dec 15 '15 at 17:11

5 Answers5

48

If you keep incrementing an integer type, it will eventually overflow, becoming a large negative value. If you keep going, it will eventually become 0 again, and the cycle will repeat.

There are convenience methods to help avoid inadvertent overflows, like Math.addExact(), but these wouldn't normally be used in a loop.


I know that it is overflowing. I'm just puzzled that it's overflowing THAT fast. And I find it strange that each time I suspend the debugger, i is 0.

When you suspend a running thread, consider the chance of the thread being in a slow call to println() that traversing a huge stack of Java and native OS code, versus the probability of landing in the test of your while loop, which is just incrementing a local variable. You'd have to have a pretty quick trigger finger to see anything other than the print statement. Try stepping through instead.

When something happens 4 billion times in a row, it's a pretty good guess it will happen next time. Branch prediction will help here in any case, and it's possible that your optimizing runtime removes the increment operation and test entirely, since the intervening values of i are never read.

erickson
  • 265,237
  • 58
  • 395
  • 493
  • Well, looking at the given example output, it seems that each loop produces a 0. – bezmax Dec 14 '15 at 20:53
  • 1
    @PeterLawrey sorry, just noticed that k only increments when printed, not every loop. – bezmax Dec 14 '15 at 20:56
  • Yeah, that's it. I tried setting that if clause to if(++i==500) and you're right: each time I suspended the program, the i was in fact 500. – SlumpA Dec 14 '15 at 21:08
  • the really big factor here is branch prediction. – njzk2 Dec 14 '15 at 23:06
  • 8
    Even on a modern processor, incrementing an integer 4 billion times should take an appreciable fraction of a second to complete. It seems to me that there is something else than just branch prediction in play. – JohannesD Dec 14 '15 at 23:47
  • 6
    Specifically, I suspect the compiler optimizes out both the incrementing and conditional branching as it can prove they have no visible effects other than slowing things down. – JohannesD Dec 15 '15 at 00:03
  • 1
    @JohannesD testing the code out myself (and disassembling the compiled code), it appears that the `javac` compiler does *not* optimize out the variable `i`, but the java runtime probably *does*. – Jonathan Callen Dec 15 '15 at 07:13
  • 1
    "When something happens 4 billion times in a row, it's a pretty good guess it will happen next time." I died laughing. – user2962533 Dec 15 '15 at 10:49
  • 1
    @Jonathan Callen: of course, it’s the **JIT** compiler doing such optimizations, not `javac`… – Holger Dec 15 '15 at 11:41
  • @JonathanCallen Yeah, I wasn't sure whether it was javac or the JIT compiler so left it ambiguous, should have elaborated. – JohannesD Dec 16 '15 at 16:39
  • Why "these wouldn't normally be used in a loop"? When I have a loop with variable which can eventually overflow in case which isn't seen in moment of writing this code I think `Math.addExact` should be used. For the sake of my question let's ignore the fact that in those cases bigger variables should be used. If I have a loop which will normally execute about 200 times I would use int, but I can't be sure that some bad thing won't happen and loop won't execute more that integer max value. Then I think `Math.addExact` is good solution, because it will give me exception in case of any problems. – ctomek Dec 16 '15 at 22:33
  • 1
    @ctomek Idiomatic loops don't overflow: `for (int i = 0; idx < max; ++idx) ...` I guess I'd have to see a sensible loop that might overflow to understand the utility of an `xxxExact()` method there. – erickson Dec 17 '15 at 03:11
  • I was thinking about `while` loop. I think that `while` is not idiomatic (couldn't find any information about that). – ctomek Dec 17 '15 at 08:58
  • 1
    @ctomek Even with a while loop, the key may be whether you are doing a `<` comparison rather than `<=`. The "max minus one" behavior that you get with `<` provides a margin of safety that evaporates when there's a chance your counter might exceed `MAX_VALUE`. – erickson Dec 17 '15 at 18:20
  • I was thinking about example of code from question where there is no range end condition. `while (true) { if (++i==0) System.out.println("loop: " + +k); }` The queston is about such piece of code and naturally I thought that answer refers to code from question and I was surprised when you said that `addExact` would not normally used in a loop. Because in that loop it would be used. I think you meant it would not normally be used in loops with range end condition, but in other loops it should be used. – ctomek Dec 17 '15 at 19:50
  • 1
    @ctomek An `if` test isn't a loop, and I wouldn't use `addExact()` in this particular `if` test. Maybe `incrementExact()` would be okay though. It does only what you need and should be fast. This example is a poor one to illustrate use of the `xxxExact()` methods since the code in question shouldn't have been written this way in the first place. – erickson Dec 19 '15 at 17:24
22

As JohannesD suggested in a comment, it's hardly possible to count from 0 to Integer.MAX_VALUE (and, after the overflow, from -Integer.MAX_VALUE to 0 again) so quickly.

In order to verify the assumption that the JIT does some magic optimization here, I created a slightly modified program, introducing some methods make it easier to identify parts of the code:

class IntOverflowTest
{
    public static void main(String[] args) {
        runLoop();
    }

    public static void runLoop()
    {
        int i = 1;
        int k = 0;
        while (true) {
            if(++i==0) doPrint(++k);
        }
    }

    public static void doPrint(int k)
    {
        System.out.println("loop: " + k);
    }

}

The bytecode emitted and shown with javap -c IntOverflowTest brings no surprises:

class IntOverflowTest {
  IntOverflowTest();
    Code:
       0: aload_0
       1: invokespecial #1                  
       4: return

  public static void main(java.lang.String[]);
    Code:
       0: invokestatic  #2                  
       3: return

  public static void runLoop();
    Code:
       0: iconst_1
       1: istore_0
       2: iconst_0
       3: istore_1
       4: iinc          0, 1
       7: iload_0
       8: ifne          4
      11: iinc          1, 1
      14: iload_1
      15: invokestatic  #3                  
      18: goto          4

  public static void doPrint(int);
    Code:
       0: getstatic     #4                  
       3: new           #5                  
       6: dup
       7: invokespecial #6                  
      10: ldc           #7                  
      12: invokevirtual #8                  
      15: iload_0
      16: invokevirtual #9                  
      19: invokevirtual #10                 
      22: invokevirtual #11                 
      25: return
}

It clearly does increment both local variables (runLoop, offsets 4 and 11).

However, when running the code with -XX:+UnlockDiagnosticVMOptions -XX:+LogCompilation -XX:+PrintAssembly in a Hotspot Disassembler, the machine code eventually ends up to be the following:

Decoding compiled method 0x00000000025c2c50:
Code:
[Entry Point]
[Verified Entry Point]
[Constants]
  # {method} {0x000000001bb40408} 'runLoop' '()V' in 'IntOverflowTest'
  #           [sp+0x20]  (sp of caller)
  0x00000000025c2da0: mov    %eax,-0x6000(%rsp)
  0x00000000025c2da7: push   %rbp
  0x00000000025c2da8: sub    $0x10,%rsp         ;*synchronization entry
                                                ; - IntOverflowTest::runLoop@-1 (line 10)

  0x00000000025c2dac: mov    $0x1,%ebp          ;*iinc
                                                ; - IntOverflowTest::runLoop@11 (line 13)

  0x00000000025c2db1: mov    %ebp,%edx
  0x00000000025c2db3: callq  0x00000000024f6360  ; OopMap{off=24}
                                                ;*invokestatic doPrint
                                                ; - IntOverflowTest::runLoop@15 (line 13)
                                                ;   {static_call}
  0x00000000025c2db8: inc    %ebp               ;*iinc
                                                ; - IntOverflowTest::runLoop@11 (line 13)

  0x00000000025c2dba: jmp    0x00000000025c2db1  ;*invokestatic doPrint
                                                ; - IntOverflowTest::runLoop@15 (line 13)

  0x00000000025c2dbc: mov    %rax,%rdx
  0x00000000025c2dbf: add    $0x10,%rsp
  0x00000000025c2dc3: pop    %rbp
  0x00000000025c2dc4: jmpq   0x00000000025b0d20  ;   {runtime_call}
  0x00000000025c2dc9: hlt

One can clearly see that it does not increment the outer variable i any more. It only calls the doPrint method, increments a single variable (k in the code), and then and immediately jumps back to the point before the doPrint call.

So the JIT indeed seems to detect that there is no real "condition" involved for printing the output, and that the code is equivalent to an infinite loop that only prints and increments a single variable.

This seems like a quite sophisticated optimization for me. I would expect that it's far from trivial to detect a case like this. But obviously, they managed to do so...

Community
  • 1
  • 1
Marco13
  • 53,703
  • 9
  • 80
  • 159
11

Your loop is overflowing i. You have no break, so after a period of time, i wraps back to 0, and this prints the statement and increments k. This also explains why changing the int to a long causes the printing to slow down: it takes much longer for a long value to overflow.

mk.
  • 11,360
  • 6
  • 40
  • 54
10

First let's look at what that loop logically does.

i will repeatedly overflow. Every 232 (about 4 billion) iterations of the loop the output will be printed and k will be incremented.

That is the logical view. However, compilers and runtimes are allowed to optimise and if you are getting more than a value every second or so it is pretty clear that such optimisation must be taking place. Even with modern branch prediction, out of order execution, etc. I find it unlikely that a CPU would get round a tight loop more than once per clock cycle (and even that I would consider unlikely). The fact that you never see anything other than zero in a debugger reinforces the idea that the code is being optimised away.

You mention it takes longer when "long" is used and that you do see other values. If a "long" counter was used in an unoptimised loop you would expect many decades between values. Again clearly optimisation is taking place, but it seems the optimiser is giving up before it has fully optimised away the pointless iterations.

Peter Mortensen
  • 30,738
  • 21
  • 105
  • 131
plugwash
  • 9,724
  • 2
  • 38
  • 51
4

It's not always 0, it becomes 0 after looping over (integer overflow), so it would first become Integer.MAX_VALUE, then Integer.MIN_VALUE, and then work its way up to 0 again. That's why it seems as if it is always 0, but in fact it takes all possible integer values before becoming 0... Over and over again.

Erik Živković
  • 4,867
  • 2
  • 35
  • 53