
I've recently been going back and forth with a coworker about what constitutes good code. Specifically, the issue of allocation within a loop came up. I've seen multiple cases (in other languages) where allocating variables within a loop carries severe penalties, and the advice I've both followed and seen repeated time and again is that such obviously dangerous patterns are best avoided.

However, this does not seem to be widely accepted: many advise relying on HotSpot and javac to fix obvious mistakes like allocating within a loop, and some even encourage the practice.

I'm fine with this in Java, since it seems like it consistently hoists the declaration of the variable for you with no ill effects. But I don't know what other mistakes are encouraged as best practices. Is there a resource that I can use to verify behavior when someone tells me to do something like this?

Knetic
  • Your question sounds very broad to me. In general that "resource" is a run time profiler I think. – markspace May 27 '16 at 18:40
  • I [answered this question](http://stackoverflow.com/a/37455467/3788176) showing that the compiled bytecode is identical for variables declared inside and outside the loop. It's not even at a JVM level: javac removes the difference. – Andy Turner May 27 '16 at 18:40
  • He said "allocation" though, I assumed he meant calling `new` (or in other languages, `malloc`). – markspace May 27 '16 at 18:41
  • @markspace I know it's broad, my problem is mainly that I don't know what I don't know - and I haven't found anything that sheds light on what sort of optimizations I can expect from javac/hotspot. – Knetic May 27 '16 at 18:53
  • @markspace I also meant allocating the pointer. Not `new`, but the declaration. – Knetic May 27 '16 at 18:59

1 Answer


There seems to be a big misunderstanding. When comparing

String str;
while(condition){
    str = calculateStr();
    .....
}

and

while(condition){
    String str = calculateStr();
    .....
}

the actual work, the call to calculateStr() and the allocation of the resulting string, always happens inside the loop. So in both cases, it is up to HotSpot to find out whether the operation inside calculateStr() is loop-invariant and can be moved out of the loop.
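As a minimal sketch of this point, the following class runs both variants and counts how often the work is performed. The class name, the call counter, and the body of `calculateStr()` are hypothetical stand-ins chosen for illustration; the original snippets leave them unspecified.

```java
public class LoopDeclarationDemo {
    // Counts invocations of calculateStr(); illustrative only.
    static int calls = 0;

    // Hypothetical stand-in for the calculateStr() of the snippets above.
    static String calculateStr() {
        calls++;
        return "value" + calls;
    }

    // Variant 1: the variable is declared before the loop.
    static int declaredOutside(int iterations) {
        calls = 0;
        int totalLength = 0;
        String str;
        for (int i = 0; i < iterations; i++) {
            str = calculateStr();          // work happens on every iteration
            totalLength += str.length();
        }
        return totalLength;
    }

    // Variant 2: the variable is declared inside the loop body.
    static int declaredInside(int iterations) {
        calls = 0;
        int totalLength = 0;
        for (int i = 0; i < iterations; i++) {
            String str = calculateStr();   // same work on every iteration
            totalLength += str.length();
        }
        return totalLength;
    }

    public static void main(String[] args) {
        int outside = declaredOutside(5);
        System.out.println("outside: result=" + outside + ", calls=" + calls);
        int inside = declaredInside(5);
        System.out.println("inside:  result=" + inside + ", calls=" + calls);
    }
}
```

Both methods invoke `calculateStr()` once per iteration and return the same result; where the declaration sits does not change how much work the loop does.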

In contrast, the local variable lives in a stack frame, a data structure that is allocated at method entry and freed at method exit (though “allocation” and “freeing” usually just mean “moving the stack pointer”) and that is large enough to hold all local variables that may exist at the same time during the method's execution.

So in either case, there has to be a location inside the stack frame for holding the value of str (at the bytecode level, local variables are accessed by index). But the interesting point is that, if you have more than that single local variable, reducing the scope of the variables may reduce the required memory by enabling reuse of a variable's memory location within the stack frame.

So when you write:

String one;
int two;
while(condition1){
    one = calculateStr();
    .....
}
while(condition2){
    two = calculateInt();
    .....
}

you have two variables within your stack frame, mapping to one and two. In contrast, when you write:

while(condition1){
    String one = calculateStr();
    .....
}
while(condition2){
    int two = calculateInt();
    .....
}

there is only one local variable within the stack frame, mapping to one during the first loop and mapping to two within the second.
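One way to observe this yourself is to compile a small class containing both forms and inspect it with `javap -v`, which prints a `locals=` count for each method. The sketch below is illustrative: the class name, the helper bodies, and the `for` loops with counters are assumptions added so the example runs, and the exact `locals` values depend on the compiler and on the extra variables (`sum`, the loop counters) that the sketch introduces.

```java
public class SlotReuseDemo {
    // Hypothetical helpers standing in for calculateStr()/calculateInt().
    static String calculateStr(int i) { return "s" + i; }
    static int calculateInt(int i) { return i * 2; }

    // 'one' and 'two' are both in scope for the whole method body,
    // so each occupies its own local variable slot.
    static int wideScope() {
        String one;
        int two;
        int sum = 0;
        for (int i = 0; i < 3; i++) {
            one = calculateStr(i);
            sum += one.length();
        }
        for (int j = 0; j < 3; j++) {
            two = calculateInt(j);
            sum += two;
        }
        return sum;
    }

    // 'one' is out of scope before 'two' is declared,
    // so the compiler is free to give both the same slot.
    static int narrowScope() {
        int sum = 0;
        for (int i = 0; i < 3; i++) {
            String one = calculateStr(i);
            sum += one.length();
        }
        for (int j = 0; j < 3; j++) {
            int two = calculateInt(j);
            sum += two;
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(wideScope());
        System.out.println(narrowScope());
    }
}
```

After `javac SlotReuseDemo.java`, running `javap -v SlotReuseDemo` should show a smaller `locals` value for `narrowScope` than for `wideScope` with a typical javac, while both methods return the same result.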


Maybe you first need to understand that neither the declaration of a local variable nor its going out of scope imposes any cost. As said, each local variable has its predetermined place within the stack frame, and the stack frame is allocated at method entry and persists for the entire method execution. The variable comes to life when an actual value is written to it, which happens at the same place in all variants, and when it goes out of scope, nothing happens. When the location is reused later on, the actual writing of a new value, perhaps even of a different type, establishes the new variable, though from the JVM's point of view, it doesn't matter whether it is the same or a different variable.

In other words, the names and scopes, and to some degree even the types, of local variables are a source-level artifact only (leaving debug metadata aside). At the bytecode level, they are never created or destroyed, only written and read. But limiting the source code scope enables reuse of the same location for different variables, potentially reducing the stack frame size.

Holger