60

Is there any performance penalty for the following code snippet?

for (int i=0; i<someValue; i++)
{
    Object o = someList.get(i);
    o.doSomething();
}

Or does this code actually make more sense?

Object o;
for (int i=0; i<someValue; i++)
{
    o = someList.get(i);
    o.doSomething();
}

If the two are totally equivalent in bytecode, then obviously the first looks better in terms of style, but I want to make sure this is the case.

Yuval Adam
  • As an aside - in C#, it is possible for the two to have very different meanings: if "o" is captured into an anonymous method/lambda. Not posted as a reply, since this is java ;-p – Marc Gravell Dec 18 '08 at 13:05
  • What if inside the loop we have: Object o = new Object(i) ? – serg Dec 21 '08 at 17:38
  • Duplicate question: http://stackoverflow.com/questions/110083/which-loop-has-better-performance-why#110389 – erickson Jan 01 '09 at 18:58
  • 1
    +1. Thanks for asking; I was just about to ask the same. – radbyx Feb 28 '12 at 22:08
  • If only this post didn't have the tag `java`. To what extent can we apply the answers here to cases of other languages such as `g++` for C++? – ynn Feb 01 '20 at 09:58
  • Does this answer your question? [Difference between declaring variables before or in loop?](https://stackoverflow.com/questions/407255/difference-between-declaring-variables-before-or-in-loop) – user202729 Oct 17 '22 at 14:07
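
Regarding the C# aside in the comments: since Java 8 there is a loosely similar effect in Java, because a lambda may only capture locals that are final or effectively final. The sketch below is only an illustration; the class name `CaptureDemo` and the method `run` are made up for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; not taken from the question.
class CaptureDemo {

    static List<String> run(List<String> items) {
        List<Runnable> tasks = new ArrayList<>();
        List<String> seen = new ArrayList<>();

        for (int i = 0; i < items.size(); i++) {
            // Declared inside the loop: a fresh, effectively final
            // variable on every iteration, so the lambda may capture it.
            String s = items.get(i);
            tasks.add(() -> seen.add(s));
        }

        // If `String s;` were declared before the loop and reassigned
        // inside it, javac would reject the lambda, because captured
        // locals must be final or effectively final.

        for (Runnable task : tasks) {
            task.run();
        }
        return seen;
    }
}
```

So in this one situation the in-loop declaration is not just a style choice: the outer declaration would not compile.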

16 Answers

52

In today's compilers, no. I declare objects in the smallest scope I can, because it's a lot more readable for the next guy.

Dave Markle
  • I have a question. If I create an object inside a for loop over a large data set, the loop may run thousands of times, and the heap would hold thousands of objects before garbage collection takes place. Alternatively, I could create the object once outside the loop and reuse it, clearing and refilling its data on each iteration. Which is better for memory: creating an object on every iteration, or creating it once outside the loop and reusing it? – Naveen Kumar Jan 29 '20 at 11:44
17

To quote Knuth, who may be quoting Hoare:

Premature optimization is the root of all evil.

Whether the compiler will produce marginally faster code by defining the variable outside the loop is debatable, and I imagine it won't. I would guess it'll produce identical bytecode.

Compare this with the number of errors you'll likely prevent by correctly scoping your variable with an in-loop declaration...

Dan Vinton
11

There's no performance penalty for declaring the Object o within the loop. The compiler generates very similar bytecode and makes the correct optimizations.

See the article Myth - Defining loop variables inside the loop is bad for performance for a similar example.

Marco Tolk
7

You can disassemble the code with javap -c and check what the compiler actually emits. On my setup (java 1.5/mac compiled with eclipse), the bytecode for the loop is identical.
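
To try this yourself, a minimal sketch along these lines should do. The class and method names are made up for illustration, and `String` stands in for the question's `Object` so that there is a real method to call.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class for comparing the two loop styles.
class LoopScopeDemo {

    // Variable declared inside the loop body.
    static int lengthInside(List<String> items) {
        int total = 0;
        for (int i = 0; i < items.size(); i++) {
            String s = items.get(i);
            total += s.length();
        }
        return total;
    }

    // Variable declared once, before the loop.
    static int lengthOutside(List<String> items) {
        int total = 0;
        String s;
        for (int i = 0; i < items.size(); i++) {
            s = items.get(i);
            total += s.length();
        }
        return total;
    }

    public static void main(String[] args) {
        List<String> items = Arrays.asList("a", "bb", "ccc");
        System.out.println(lengthInside(items) + " " + lengthOutside(items));
    }
}
```

Compile with `javac LoopScopeDemo.java` and run `javap -c LoopScopeDemo` to compare the two method bodies. I would expect the same sequence of instructions in both, though the local variable slot numbers may differ because the declaration order differs.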

Rolf Rander
  • It won't be identical in a larger method; the valid scope of the variable is recorded in the local variable table. – erickson Jan 01 '09 at 19:00
5

The first version is better because it restricts the scope of the variable o to the for block. From a performance perspective it might not have any effect in Java, but it might with lower-level compilers: they might put the variable in a register if you use the first version.

In fact, some people think that if the compiler is dumb, the second snippet performs better. An instructor at college once told me this, and I laughed at the suggestion! Basically, compilers allocate memory on the stack for the local variables of a method just once, at the start of the method (by adjusting the stack pointer), and release it at the end of the method (again by adjusting the stack pointer, assuming it's not C++ or there are no destructors to call). So all stack-based local variables in a method are allocated at once, no matter where they are declared or how much memory they require.

So if the compiler is dumb, there is no difference in performance; if it's smart enough, the first version can actually be better, because it helps the compiler understand the scope and lifetime of the variable. And if it's really smart, there should be absolutely no difference in performance, because it infers the actual scope itself.

Construction of an object using new is, of course, totally different from just declaring it.
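
A rough sketch of that distinction, using a made-up `Tracked` class that counts its own constructor calls (all names here are hypothetical):

```java
import java.util.List;

// Hypothetical class that counts how often its constructor runs.
class Tracked {
    static int constructed = 0;

    Tracked() {
        constructed++;
    }
}

class DeclareVsNewDemo {

    // Declaring a reference inside the loop creates no objects:
    // each iteration only copies an existing reference.
    static int declareOnly(List<Tracked> items) {
        Tracked.constructed = 0;
        for (int i = 0; i < items.size(); i++) {
            Tracked t = items.get(i);
        }
        return Tracked.constructed;
    }

    // Calling `new` inside the loop really does allocate
    // one object per iteration.
    static int constructEach(int n) {
        Tracked.constructed = 0;
        for (int i = 0; i < n; i++) {
            Tracked t = new Tracked();
        }
        return Tracked.constructed;
    }
}
```

With a list of three existing `Tracked` objects, `declareOnly` reports zero constructor calls, while `constructEach(3)` reports three. The question's snippets are of the first kind.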

I think readability is more important than performance, and from a readability standpoint the first version is definitely better.

Mehrdad Afshari
  • I am not sure about the allocation pattern you describe. Consider this code: if (flag) {char buffer[100];...} else {char buffer[2000];...} In this case it would be really dumb for the compiler to allocate all memory in advance. Doubly so if stack space is at a premium (embedded apps). –  Dec 18 '08 at 14:35
  • Smart compilers usually allocate it by finding the maximum unshared memory in every code path. In your example, it's 2000 bytes which they allocate. If you didn't use it, it's not really costly, since it's just add esp, 2000 instead of add esp, 200. – Mehrdad Afshari Dec 18 '08 at 15:29
4

I've got to admit I don't know Java. But are these two equivalent? Are the object lifetimes the same? In the first example, I assume (not knowing Java) that o will be eligible for garbage collection as soon as the loop terminates.

But in the second example surely o won't be eligible for garbage collection until the outer scope (not shown) is exited?

Paul Mitchell
4

Don't prematurely optimize. Better than either of these is:

for(Object o : someList) {
    o.doSomething();
}

because it eliminates boilerplate and clarifies intent.

Unless you are working on embedded systems, in which case all bets are off. Otherwise, don't try to outsmart the JVM.

SamBeran
1

I've always thought that most compilers these days are smart enough to do the latter option. Assuming that's the case, I would say the first one does look nicer as well. If the loop gets very large, there's no need to look all around for where o is declared.

Will Mc
1

These have different semantics. Which is more meaningful?

Reusing an object for "performance reasons" is often wrong.

The question is what does the object "mean"? Why are you creating it? What does it represent? Objects must parallel real-world things. Things are created, undergo state changes, and report their states for reasons.

What are those reasons? How does your object model and reflect those reasons?

S.Lott
  • There is no reuse of objects going on in those two snippets at all, just reuse of reference variables. – Ilja Preuß Dec 18 '08 at 18:16
  • A reference is -- to my mind -- an object, first class, the real deal. Where it's declared and how it's declared reflects the meaning, purpose and intent behind the object. – S.Lott Dec 18 '08 at 21:42
  • It might be the same conceptually, but it's a very different beast in Java and confusing references and objects has slowed down many, many beginners in their understanding. – Joachim Sauer Dec 21 '08 at 12:30
  • @saua: True. As shown in this case of moving the reference declaration around randomly as an "optimization". – S.Lott Dec 21 '08 at 12:53
1

To get at the heart of this question... [Note that non-JVM implementations may do things differently if allowed by the JLS...]

First, keep in mind that the local variable "o" in the example is a pointer, not an actual object.

All local variables are allocated on the runtime stack in 4-byte slots. doubles and longs require two slots; other primitives and pointers take one. (Even booleans take a full slot)

A fixed runtime-stack size must be created for each method invocation. This size is determined by the maximum local variable "slots" needed at any given spot in the method.

In the above example, both versions of the code require the same maximum number of local variables for the method.

In both cases, the same bytecode will be generated, updating the same slot in the runtime stack.

In other words, no performance penalty at all.

HOWEVER, depending on the rest of the code in the method, the "declaration outside the loop" version might actually require a larger runtime stack allocation. For example, compare

for (...) { Object o = ... }
for (...) { Object o = ... }

with

Object o;
for (...) {  /* loop 1 */ }
for (...) { Object x =...; }

In the first example, both loops require the same runtime stack allocation.

In the second example, because "o" lives past the loop, "x" requires an additional runtime stack slot.

Hope this helps, -- Scott
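
A compilable version of the two shapes above might look like the following sketch. The class and method names are invented for illustration, and `String` replaces `Object` so the loop bodies do something observable.

```java
// Hypothetical demo class for inspecting local variable slots.
class SlotDemo {

    // Both loops declare their variable inside the body, so the
    // second loop can reuse the slots released by the first.
    static int scopedInLoops(String[] a, String[] b) {
        int n = 0;
        for (int i = 0; i < a.length; i++) {
            String o = a[i];
            n += o.length();
        }
        for (int i = 0; i < b.length; i++) {
            String o = b[i];
            n += o.length();
        }
        return n;
    }

    // "o" stays in scope until the method ends, so "x" in the
    // second loop cannot share its slot.
    static int declaredOutside(String[] a, String[] b) {
        int n = 0;
        String o;
        for (int i = 0; i < a.length; i++) {
            o = a[i];
            n += o.length();
        }
        for (int i = 0; i < b.length; i++) {
            String x = b[i];
            n += x.length();
        }
        return n;
    }
}
```

Running `javap -v SlotDemo` prints a `locals=` figure for each method. With javac I would expect the second method to need one more slot than the first, but treat that as something to check on your own compiler rather than a guarantee.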

Scott Stanchfield
0

When using multiple threads (if you're running 50+), I found this to be a very effective way of handling ghost thread problems:

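Object one;
Object two;
Object three;
Object four;
Object five;
try {
    for (int i = 0; i < someValue; i++)
    {
        Object o = someList.get(i);
        o.doSomething();
    }
} catch (Exception e) {
    e.printStackTrace();
} finally {
    // Drop the references so the objects become eligible for collection.
    one = null;
    two = null;
    three = null;
    four = null;
    five = null;
    System.gc(); // only a hint; the JVM may not collect immediately
}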
Petro
  • Are you declaring them null to tell the GC they're explicitly ready for collection? – Harry May 22 '18 at 23:26
  • 1
    @Harry This was a few years ago, but ya it did work as intended. `System.gc()` doesn't always run immediately though. At least you can assign `null` to ensure proper cleanup. – Petro May 24 '18 at 14:48
0

The first makes far more sense. It keeps the variable in the scope in which it is used, and it prevents values assigned in one iteration from being used in a later iteration, which is more defensive.

The latter is sometimes said to be more efficient, but any reasonable compiler should be able to optimise the former to be exactly the same.
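
The "value from an earlier iteration" problem can be sketched like this. The class `StaleValueDemo` and its methods are hypothetical, written only to show the failure mode.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo of a stale value leaking across iterations.
class StaleValueDemo {

    // Outer declaration: "label" survives from one iteration to the next.
    static List<String> labelsOuter(List<Integer> values) {
        List<String> out = new ArrayList<>();
        String label = null;
        for (int i = 0; i < values.size(); i++) {
            if (values.get(i) % 2 == 0) {
                label = "even";
            }
            // Bug: for an odd value, label still holds whatever
            // the previous iteration left behind.
            out.add(label);
        }
        return out;
    }

    // Inner declaration: the compiler insists that "label" is
    // assigned on every path before it is read.
    static List<String> labelsInner(List<Integer> values) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < values.size(); i++) {
            String label;
            if (values.get(i) % 2 == 0) {
                label = "even";
            } else {
                label = "odd";
            }
            out.add(label);
        }
        return out;
    }
}
```

For the input 2, 3 the outer version labels both values "even", while the inner version cannot make that mistake: leaving out the else branch there is a compile error, because the variable would not be definitely assigned.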

Jack Ryan
  • With respect, I think you may have that flipped around. It's the former that keeps the variable in the scope in which it is used. – Will Wagner Dec 18 '08 at 13:10
  • -1: "makes sense" depends on what the program is supposed to "mean". Objects exist for a reason. What is that reason? – S.Lott Dec 18 '08 at 13:11
  • Will you were right I had flipped it around. Fixed now. S. Lott: in this example I think it is obvious that the object exists to hold a value that is got from a list. It serves no purpose outside the loop. – Jack Ryan Dec 18 '08 at 14:31
0

In both cases the type information for o is determined at compile time. In the second version, o is seen as global to the for loop; in the first, the clever Java compiler knows that o must be available for as long as the loop lasts, and so it optimises the code so that o's type is not respecified on each iteration. Hence, in both cases, o's type is specified once, which means the only performance difference would be in the scope of o. Obviously, a narrower scope always enhances performance, so to answer your question: no, there is no performance penalty for the first code snippet; in fact, it is more optimised than the second.

In the second snippet, o is given unnecessary scope, which, besides being a performance issue, can also be a security issue.

Lonzo
0

Speaking as someone who maintains more code than he writes: version 1 is much preferred. Keeping scope as local as possible is essential for understanding. It's also easier to refactor this sort of code.

As discussed above - I doubt this would make any difference in efficiency. In fact I would argue that if the scope is more local a compiler may be able to do more with it!

Fortyrunner
-1

The answer depends partly on what the constructor does and what happens with the object after the loop, since that determines to a large extent how the code is optimized.

If the object is large or complex, absolutely declare it outside the loop. Otherwise, the people telling you not to prematurely optimize are right.

Max
-3

I actually have some code in front of me that looks like this:

for (int i = offset; i < offset + length; i++) {
    char append = (char) (data[i] & 0xFF);
    buffer.append(append);
}
...
for (int i = offset; i < offset + length; i++) {
    char append = (char) (data[i] & 0xFF);
    buffer.append(append);
}
...
for (int i = offset; i < offset + length; i++) {
    char append = (char) (data[i] & 0xFF);
    buffer.append(append);
}

So, relying on compiler abilities, I can assume there would be only one stack allocation for i and one for append. Then everything would be fine except the duplicated code.

As a side note, Java applications are known to be slow. I have never tried profiling in Java, but I guess the performance hit comes mostly from memory allocation management.

annoyed
  • 2
    This isn't an answer to the question. – sharakan Apr 17 '13 at 14:30
  • @annoyed you made two statements with no supporting evidence, neither relevant to the question. Your first regarding stack allocations I don't know whether it's correct or not. Your second regarding java performance is certainly wrong. One could say that **some** java applications are slow, and **sometimes** this is because of memory allocation management, but that is arguably true for any program in any language that allocates and frees memory. – sharakan Apr 19 '13 at 16:55
  • @sharakan, Thanks, but I was just asking Hot Licks to elaborate a little. And you're right, this is not exactly an answer, more like a synthesis of what has been said before, about which I'm still skeptical. – annoyed Apr 23 '13 at 14:38