4

Is it better to declare a variable inside a loop or outside it in Java? Also, is there any performance cost involved in declaring it inside the loop?

eg.

Option 1: Outside the Loop

List list = new ArrayList();
int value;

//populate list
for(int i = 0 ; i < list.size(); i++) {
  value = list.get(i);
  System.out.println("value is " + value);
}

Option 2: Inside the Loop

List list = new ArrayList();

//populate list
for(int i = 0; i < list.size(); i++) {
  int value = list.get(i);
  System.out.println("value is " + value);
}
tshepang
Harish
  • My understanding is that option 1 will only create & use a single memory address, while option 2 will create memory addresses for however many items are in the list. – OMG Ponies Dec 11 '09 at 05:21
  • I think primitive data types like this go on the VM stack when declared locally. If so, there's no difference in performance -- `int value` is "allocated" by reserving one word in the stack frame either way. – Crashworks Dec 11 '09 at 05:36
  • Your understanding is wrong. Semantically, you can think of it as "allocating" a new local variable at start of each loop iteration and then "releasing" it at the end - it stands to reason, then, that at no point you have more than one local "allocated". In practice, it will be a single memory location on the stack all the time (or maybe even a single register). – Pavel Minaev Dec 11 '09 at 05:36
  • _All_ data types go on the stack when declared locally, primitive or not. Non-primitive types are reference types, and the _reference_ goes on the stack. For anything to go on the heap, someone, somewhere, must `new` it (or use an array initializer or a string literal). – Pavel Minaev Dec 11 '09 at 05:38
  • 2
    Note: the samples as written will give compilation errors because the `get` method on a raw `ArrayList` will return an `Object` not an `int`. – Stephen C Dec 11 '09 at 05:40
  • Oh! Of course. Then there's definitely no difference in performance. The int is unlikely to be only kept on a register, though, as the register would surely have to be spilled to stack on or inside calling println (depending on that VM's particular ABI). – Crashworks Dec 11 '09 at 05:41
  • @Pavel - if we're going to be pedantic, a String literal does not make a new String go onto/into the heap. The (intern'ed) String object that represents the literal is created (if necessary) when the class is loaded. – Stephen C Dec 11 '09 at 05:44
  • Wouldn't it be trivial to benchmark this yourself? You've written the code and everything. – Buhb Dec 11 '09 at 12:37
  • It is worth remembering that the JVM is a virtual machine, not a real machine. A real machine has multiple registers so i, list (reference), and value will all sit in registers no matter which combination you use. – Peter Lawrey Dec 11 '09 at 22:23
  • Assuming the machine has enough registers and println is a leaf function that doesn't need or spill any registers of its own. – Crashworks Dec 13 '09 at 01:01
  • possible duplicate of [Which loop has better performance? Why?](http://stackoverflow.com/questions/110083/which-loop-has-better-performance-why) – erickson May 25 '10 at 17:44

8 Answers

12

In Clean Code, Robert C. Martin advises Java coders to declare variables as close as possible to where they are to be used. Variables should not have greater scope than necessary. Having the declaration of a variable close to where it's used helps give the reader type and initialization information. Don't concern yourself too much with performance because the JVM is pretty good at optimizing these things. Instead focus on readability.

BTW: If you're using Java 5 or greater, you can significantly trim up your code example using the following new-for-Java-5 features:

  • foreach construct
  • generics
  • autoboxing

I've refactored your example to use the aforementioned new features.

List<Integer> list = new ArrayList<Integer>();

// populate list

for (int value : list) {
    System.out.println("value is " + value);
}
Asaph
  • 1
    Also, the behavior is not the same between the two options. In the first option, there is a possibility of the state of the 'value' variable 'leaking' between loop iterations. In the second one, that's not possible. And if the perf difference between the two is your biggest worry, you're in a much better position than any code I've ever seen! – kyoryu Dec 11 '09 at 05:25
  • 3
    i agree, always choose legibility over performance - until you get into the optimisation phase at which point you will be timing everything and you will see that this kind of apparent optimisation has no effect at all on the bottom line. i see the "outside the loop" approach way too often and REALLY HATE IT. – pstanton Dec 11 '09 at 05:44
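The 'leaking' state mentioned in the comments above can be sketched roughly as follows. This is only an illustration: the class name, method names and the even/odd labelling are made up for the example and are not from the question.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ScopeLeakDemo {
    // Option 1 style: 'label' is declared outside the loop, so a value
    // assigned in one iteration is still visible in the next one.
    static List<String> labelsOutside(List<Integer> numbers) {
        List<String> result = new ArrayList<String>();
        String label = "";
        for (int n : numbers) {
            if (n % 2 == 0) {
                label = "even";
            }
            // Bug: for odd n, 'label' still holds the previous iteration's value.
            result.add(n + ":" + label);
        }
        return result;
    }

    // Option 2 style: 'label' is declared inside the loop, so every
    // iteration starts with a fresh variable.
    static List<String> labelsInside(List<Integer> numbers) {
        List<String> result = new ArrayList<String>();
        for (int n : numbers) {
            String label = (n % 2 == 0) ? "even" : "odd";
            result.add(n + ":" + label);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(2, 3, 4);
        System.out.println(labelsOutside(numbers)); // 3 is wrongly labelled "even"
        System.out.println(labelsInside(numbers));
    }
}
```

With the declaration inside the loop, forgetting to assign `label` on some path becomes a compile error (the variable is not definitely assigned) instead of a silently stale value.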
1

It should make no difference which way you implement from a performance perspective.

But more importantly, you should not be wasting your time micro-optimizing your code like this ... UNLESS you've profiled your entire application and determined that this fragment is a performance bottleneck. (And if you've done that, you are in a good position to see if there is really any difference between the two versions of the code. But I would be very surprised if there were ...)

Stephen C
  • It could make a difference because the register allocation could be different; depending on the register allocation algorithm that's being used variables with a longer 'live range' will generally be spilled to memory while variables with a shorter live range will be stored in the registers. – Jasper Bekkers Dec 11 '09 at 05:43
  • 1
    The only way to be absolutely sure is to profile the application. And that's my real point. – Stephen C Dec 11 '09 at 05:48
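For anyone who wants to measure this, here is a rough timing sketch of the two options. It makes several assumptions: the class and method names are invented, `println` is replaced by a running sum so the loop body is cheap, and plain `System.nanoTime()` timing is used. Hand-rolled timings like this are easily distorted by the JIT, so treat the numbers as a sanity check only; a harness such as JMH or a real profiler is more trustworthy.

```java
import java.util.ArrayList;
import java.util.List;

public class LoopDeclarationTiming {
    // Option 1: variable declared outside the loop.
    static long sumOutside(List<Integer> list) {
        long sum = 0;
        int value;
        for (int i = 0; i < list.size(); i++) {
            value = list.get(i);
            sum += value;
        }
        return sum;
    }

    // Option 2: variable declared inside the loop.
    static long sumInside(List<Integer> list) {
        long sum = 0;
        for (int i = 0; i < list.size(); i++) {
            int value = list.get(i);
            sum += value;
        }
        return sum;
    }

    public static void main(String[] args) {
        List<Integer> list = new ArrayList<Integer>();
        for (int i = 0; i < 1000000; i++) {
            list.add(i);
        }

        // Warm-up runs so the JIT has a chance to compile both methods.
        for (int r = 0; r < 20; r++) {
            sumOutside(list);
            sumInside(list);
        }

        long t0 = System.nanoTime();
        long a = sumOutside(list);
        long t1 = System.nanoTime();
        long b = sumInside(list);
        long t2 = System.nanoTime();

        System.out.println("outside: " + (t1 - t0) + " ns, sum=" + a);
        System.out.println("inside:  " + (t2 - t1) + " ns, sum=" + b);
    }
}
```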
1

While your example is hypothetical and most likely not a real-world application, the simple answer is that you don't need a variable at all in this scenario. There is no reason to allocate the memory. Simply put, it's wasted memory that becomes cannon fodder in the JVM. You've already allocated memory to store the value in a List, so why duplicate it in another variable?

The scope of the variable's use will often dictate where it should be declared. For instance:

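Connection conn = null;
try {
    // url is a placeholder JDBC URL; assumes java.sql.* is imported
    conn = DriverManager.getConnection(url);
} catch (SQLException e) {
    e.printStackTrace();
} finally {
    // conn has to be declared outside the try block to be visible here
    if (conn != null) {
        try { conn.close(); } catch (SQLException ignored) { }
    }
}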

The other answers do a good job of exposing the other pitfalls of the example you provided. There are better ways to iterate through a list, whether it's with an actual Iterator, or a foreach loop, and generics will help eliminate the need to create a primitive duplicate.

Droo
  • FYI: Regarding your `Connection` example; Java 7 will include a new language feature that makes it possible to scope `conn` inside the `try{}`. Check out http://code.joejag.com/2009/new-language-features-in-java-7/ and scroll down to "Automatic Resource Management". – Asaph Dec 11 '09 at 22:31
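The feature mentioned in the comment above shipped in Java 7 as the try-with-resources statement. A minimal sketch of how it narrows the variable's scope is below; `DemoConnection` is a hypothetical stand-in for a real connection class, and the `last` field exists only so the result can be inspected afterwards.

```java
public class TryWithResourcesDemo {
    // Hypothetical stand-in for a real connection class.
    static class DemoConnection implements AutoCloseable {
        boolean closed = false;

        void open() {
            System.out.println("opened");
        }

        @Override
        public void close() {
            closed = true;
            System.out.println("closed");
        }
    }

    // Kept only so the demo can check that close() was called.
    static DemoConnection last;

    static void useConnection() {
        // 'conn' is scoped to the try statement and closed automatically
        // when the block exits, normally or via an exception.
        try (DemoConnection conn = new DemoConnection()) {
            last = conn;
            conn.open();
        }
        // 'conn' is not visible here.
    }

    public static void main(String[] args) {
        useConnection();
        System.out.println("closed after block: " + last.closed);
    }
}
```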
0

Well, if you are worried about optimizing that code: I'm not sure how Java evaluates for loops, but calling list.size() inside the loop declaration may be less efficient than calling it once outside the loop (and storing the result in a variable such as listLength). I'm pretty sure that approach is quicker in JavaScript. The reason it might be more efficient is that with the size call inside the for loop, the function has to be called each time the loop condition is tested, instead of comparing against a fixed value.
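The idea can be sketched as follows. The class and method names are made up for illustration, `listLength` is the variable name suggested above, and whether this actually helps in Java is an open question here, since a JIT compiler may well inline a trivial `size()` call.

```java
import java.util.Arrays;
import java.util.List;

public class CachedSizeDemo {
    // Calls list.size() every time the loop condition is tested.
    static int sumCallingSize(List<Integer> list) {
        int sum = 0;
        for (int i = 0; i < list.size(); i++) {
            sum += list.get(i);
        }
        return sum;
    }

    // Reads the size once and compares against the cached value.
    static int sumCachedSize(List<Integer> list) {
        int sum = 0;
        for (int i = 0, listLength = list.size(); i < listLength; i++) {
            sum += list.get(i);
        }
        return sum;
    }

    public static void main(String[] args) {
        List<Integer> list = Arrays.asList(1, 2, 3, 4);
        System.out.println(sumCallingSize(list));
        System.out.println(sumCachedSize(list));
    }
}
```

Caching the size is only safe when the list is not modified inside the loop; otherwise the cached value goes stale.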

Ben Hayden
0

The optimal way to traverse a list is to use an Iterator.

for (Iterator iter=list.iterator(); iter.hasNext();) {
  System.out.println(iter.next());
}
kingsindian
  • You need to cast it to a string - `iter.next()` will return an object – OMG Ponies Dec 11 '09 at 05:27
  • 5
    @OMG Ponies: there's a definition for System.out.println(Object o), so no need to cast to a string - besides, the implementation will call String.valueOf(o), per the 1.5 api documentation. – atk Dec 11 '09 at 05:30
  • If you're iterating from the start to the end of a collection, using the built-in for-each loop is almost certainly just as fast. – Viktor Sehr Jan 04 '10 at 23:50
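For comparison, here is a sketch of the explicit Iterator loop (with generics added) next to the for-each form. For an Iterable such as a List, the language specification defines the for-each loop in terms of an Iterator, so the two traverse the list the same way. The class and method names are illustrative only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class IterationDemo {
    // Explicit, typed Iterator: no casts needed.
    static List<String> withIterator(List<Integer> list) {
        List<String> out = new ArrayList<String>();
        for (Iterator<Integer> iter = list.iterator(); iter.hasNext();) {
            int value = iter.next();
            out.add("value is " + value);
        }
        return out;
    }

    // For-each loop: the compiler obtains and drives the Iterator itself.
    static List<String> withForEach(List<Integer> list) {
        List<String> out = new ArrayList<String>();
        for (int value : list) {
            out.add("value is " + value);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Integer> list = Arrays.asList(10, 20, 30);
        System.out.println(withIterator(list));
        System.out.println(withForEach(list));
    }
}
```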
0

In simple cases like this, there is most likely no difference, and the compiler produces exactly the same code (assuming you do not set an initial value when declaring the variable).

In more complex cases and longer functions, declaring a local variable inside the loop or another block is likely to be more efficient. This shortens the lifetime of the variable, making it easier for the compiler to optimize the code. When a variable does not exist outside the block, the register used to store it can be reused for other purposes.

This, of course, depends on the compiler implementation. I don't know about Java, but at least some C compiler manufacturers have given this recommendation in their documentation.

As for readability, my opinion is that in short functions it is better to declare all variables at the beginning where they can be easily found. In very long functions (which should be avoided anyway), it may be better to declare the variable inside a block (which just happens to be more efficient, too).

PauliL
0

Besides the useful suggestions given by others, please do keep in mind one thing:

Never optimize early. If you really think your code is slow and might need improvement, use a profiler to find where the bottlenecks are, and only then refactor them. Learn where the mistake was, and do not repeat it next time.

In your case I would say that, depending on the Java VM version, your performance (guess what) might vary. From experience, I would not declare a variable within a loop; an int will certainly be optimized by the compiler to reuse the same memory address, and the extra computational cost is probably negligible.

But.

If you were declaring an object inside a loop, then things would be different. What if your object, when created, does an I/O write? A network DNS lookup? You might not know or care. So, best practice is to declare it outside.

Also, do not mix up performance with best practice. They might lead you into dangerous territory.
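A small sketch of the object case described above: `Expensive` is a hypothetical class whose constructor stands in for costly work such as I/O, and a static counter records how often it runs.

```java
public class ConstructionCountDemo {
    // Hypothetical class; the counter stands in for costly constructor work.
    static class Expensive {
        static int created = 0;

        Expensive() {
            created++;
        }

        String describe(int i) {
            return "item " + i;
        }
    }

    // Object created inside the loop: the constructor runs once per iteration.
    static int createInsideLoop(int iterations) {
        Expensive.created = 0;
        for (int i = 0; i < iterations; i++) {
            Expensive e = new Expensive();
            e.describe(i);
        }
        return Expensive.created;
    }

    // Object created once before the loop and reused.
    static int createOutsideLoop(int iterations) {
        Expensive.created = 0;
        Expensive e = new Expensive();
        for (int i = 0; i < iterations; i++) {
            e.describe(i);
        }
        return Expensive.created;
    }

    public static void main(String[] args) {
        System.out.println("inside:  " + createInsideLoop(5) + " constructions");
        System.out.println("outside: " + createOutsideLoop(5) + " constructions");
    }
}
```

Note that the cost here comes from where `new Expensive()` is executed, not from where the variable is declared: declaring `Expensive e;` before the loop but assigning `e = new Expensive()` inside it would still run the constructor on every iteration.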

lorenzog
0

The Java compiler determines how many "slots" it needs in the stack frame based on how you use local variables. This number is determined based on the maximum number of active locals at any given point in the method.

If your code is only

int value;
for (...) {...}

or

for (...) {
   int value;
   ...
}

there's only one slot needed on the runtime stack; the same slot can be reused inside the loop regardless of how many times the loop runs.

However, if you do something after the loop that requires another slot:

int value;
for (...) {...}
int anotherValue;

or

for (...) {
   int value;
   ...
}
int anotherValue;

we'll see a difference - the first version requires two slots, as both variables are active after the loop; in the second example, only one slot is required, as we can reuse the slot from "value" for "anotherValue" after the loop.

Note: the compiler can be smarter about optimizations depending on how the data is actually used, but this simple example is meant to demonstrate that there can be a difference in stack frame allocation.
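One way to look at this yourself is to compile a pair of methods like the ones below and inspect the class file. This is a sketch under assumptions: the class, the method names and the `sink` field are invented, and a `while` loop is used so that no loop counter occupies an extra slot.

```java
public class LocalSlotsDemo {
    // Written to so that 'value' is actually used inside the loops.
    static int sink;

    // 'value' stays in scope after the loop, so 'anotherValue' needs its own slot.
    public static int declaredOutside(int n) {
        int value;
        while (n > 0) {
            value = n * 2;
            sink += value;
            n--;
        }
        int anotherValue = 42;
        return anotherValue;
    }

    // 'value' goes out of scope when the loop body ends, so the compiler
    // is free to reuse its slot for 'anotherValue'.
    public static int declaredInside(int n) {
        while (n > 0) {
            int value = n * 2;
            sink += value;
            n--;
        }
        int anotherValue = 42;
        return anotherValue;
    }

    public static void main(String[] args) {
        System.out.println(declaredOutside(3));
        System.out.println(declaredInside(3));
    }
}
```

After `javac LocalSlotsDemo.java`, running `javap -v LocalSlotsDemo` prints a `locals=` figure in each method's Code section. Comparing the two methods should show whether your compiler reuses the slot; the exact numbers depend on the compiler, so none are quoted here.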

Scott Stanchfield