15

So I find myself doing something like the following pattern often. Instead of:

if (map.containsKey(someKey)) {
    Value someValue = map.get(someKey);
    ...
}

To not traverse the map twice (and since I know my map doesn't store nulls), I'll do:

Value someValue = map.get(someKey);
if (someValue != null) {
    ...
}

This seems to me to be a worthwhile pattern given that the Map operations perform a decent number of operations and the optimizer, I assume, is not smart enough to optimize it away.

But then I find myself doing similar patterns in other situations. For example, should I store the someMethod() result in a temporary variable instead of making the call twice? Clearly I can't call someMethod() twice if there are side-effects but when does it make sense to only call it once from an optimization standpoint?

if (someObject.someMethod() != null) {
    processSomehow(someObject.someMethod());
}

I know this borders on a "not constructive" question so I'm looking for answers that present some facts and references instead of just conjecture.

  • When does it make sense to do this and when does it not?
  • How should I evaluate the "cost" of the someMethod() to determine when a temporary variable should be used?
  • For simple methods such as get methods, might this get in the way of the hotspot compiler or the optimizer and actually produce less efficient code?

For posterity, I am not asking "how can I improve the speed of my existing program". I am trying to figure out "what pattern should I use when I write future code".

Thanks for any information.

Gray
  • 115,027
  • 24
  • 293
  • 354
  • 6
    I think the answer to all these questions involves the word "profiler"... – Oliver Charlesworth Feb 26 '12 at 18:24
  • 4
    Clearly if `someMethod()` is expensive or has side-effects you don't want to call it twice. Java does not (yet) contain an `idempotent` keyword (brings back old memories of `reducible` in PL/1 :-) – Jim Garrison Feb 26 '12 at 18:24
  • Good point @Jim. I've edited my question to make it more clear that I'm not talking about side-effects. But what I'm talking about is how to evaluate the "expense" of the method. – Gray Feb 26 '12 at 18:29
  • 6
    Optimization is a really bad reason for this. Readability can be a much better reason. –  Feb 26 '12 at 18:32
  • 1
    Optimization is a "really bad reason" @delnan? I'd say that's a overstatement. You are correct in terms of readability. – Gray Feb 26 '12 at 18:36
  • 2
    When the operation in question is tiny (like a single hashtable lookup), there is with no indication that it's in a bottleneck and no profiling happened, it *is* a bad reason. –  Feb 26 '12 at 18:40
  • 2
    Sorry but we disagree @delnan. Thanks for the feedback tho. – Gray Feb 26 '12 at 18:43
  • So you're find with worrying, for seconds or even minutes, about saving a millisecond in a method that's called once per hour? –  Feb 26 '12 at 18:45
  • It is seconds of coding and when did I say anything about the savings or call rates? – Gray Feb 26 '12 at 18:49
  • 2
    I think @delnan's point is that we're talking about premature optimization here. You implicitly said that you care about savings by talking about accidentally making the JIT produce less efficient code. Delnan is right to say that changing your code for optimization at the expense of readability is a bad reason *if there is no significant performance benefit*. In my opinion :) – Cameron Skinner Feb 26 '12 at 18:52
  • Thanks @Cameron. I understand "premature optimization". What I was looking for was a way of determining when it is premature and when the extra 5 seconds of coding time is a good investment. – Gray Feb 26 '12 at 19:07
  • Fair enough. I suspect it really comes down to a personal decision about when to trade off readability for speed. In the cases you're talking about the speed benefit is negligible, so I think you just need to decide which code is more readable. – Cameron Skinner Feb 26 '12 at 19:18

6 Answers6

12

How should I evaluate the "cost" of the someMethod() to determine when a temporary variable should be used?

Simply look at the implementation of someMethod(). If it just returns a field, like a typically getter method would do, there's no need to declare a local variable (performance wise). If the method creates objects, calls the database or reads file contents, then is usually wise to cache the return value.

Personally, I don't care of performance issues caused be declaring a temporary variable. I just keep the return value instead of calling a method twice. The code is much easier to read and to understand (unless you name all those variables just temp;-) )

Gray
  • 115,027
  • 24
  • 293
  • 354
Andreas Dolk
  • 113,398
  • 19
  • 180
  • 268
  • Couple good specifics (database, new object) there @Andreas but can you provide some more details about how you evaluate the cost of the a method. It's not as simple as "look"ing at it. – Gray Feb 26 '12 at 18:33
  • 2
    My threshold is pretty low. Everything more expensive then `return mValue;` will be cached ;) - btw, I believe, we can estimate the complexity/cost by just looking at the code. – Andreas Dolk Feb 26 '12 at 18:39
  • I was trying to get more details. "Everything more expensive then return mValue;" is what I was looking for. Thanks. – Gray Feb 26 '12 at 18:41
7

Clearly I can't call someMethod() twice if there are side-effects but when does it make sense to only call it once from an optimization standpoint?

When you've proved that it matters using a profiler. But get used to not calling methods twice in a row anyway, since it will make maintenance easier (what if the check changes and you forget to change it in both places?). Use temporary variables liberally, they're incredibly cheap.

Gray
  • 115,027
  • 24
  • 293
  • 354
Fred Foo
  • 355,277
  • 75
  • 744
  • 836
  • 2
    I think "liberally" may need to be qualified. Littering your methods with one-use variables also has its downsides. – Oliver Charlesworth Feb 26 '12 at 18:34
  • 1
    Actually, I tend to introduce one whenever I'd need to do an operation twice without the variable. – Fred Foo Feb 26 '12 at 18:36
  • I don't have time to profile my application every time I need to evaluate whether or not I need a temporary variable. Are there any patterns you look for @larsmans while you are coding? – Gray Feb 26 '12 at 18:39
  • @Gray: I just introduce the variable practically always. Variables aren't usually going to slow down your program, if they don't speed it up. – Fred Foo Feb 26 '12 at 18:42
  • Yeah, that's what I've been doing. Just curious as to other's patterns. Thanks @larsmans. – Gray Feb 26 '12 at 18:46
  • +1 for "When you've proved that it matters using a profiler". You can arbitrarily optimize (or attempt to optimize) just about any portion of your code and spend a lot of time and effort doing it, but if you're not optimizing the *actual* bottlenecks then it's a waste of time. Code readability is very important when you come back in 6 months time to fix a bug, or do some real optimization. – Cameron Skinner Feb 26 '12 at 18:48
6

As a general rule I use a local var when I need it more than once. You can give that local var a pretty name and everybody knowns what this var is about. From performance point of view: Local vars are cheap and methods maybe complex.

Especially when someMethod() is expensive like a synchronized method then this should perform much better. But when someMethod() is a synchronized method then it is intended to be called by multiple threads in concurrency and then it makes a difference to call the method twice or store its return value and reuse it ...

...Another point to mention is that subsequent method calls do not have to return the same data. So your examples with/without local var are not always valid substitutes for subsequent method calls. But I think you already considered this.

Gray
  • 115,027
  • 24
  • 293
  • 354
Fabian Barney
  • 14,219
  • 5
  • 40
  • 60
3

For me this all boils down to the logic of the program and not to performance. So

if (map.containsKey(someKey)) {
    Value someValue = map.get(someKey);
    ...
}

is not equivalent to that code in all cases:

Value someValue = map.get(someKey);
if (someValue != null) {
    ...
}

Consider a map where the value can be null for a given key. If the ... part is null safe then the code behaves differently.

Also this code with somehow has logical issues. Consider this:

BufferedRead reader = ...;
if (reader.readLine() != null) {
    processSomehow(reader.readLine());
}

Here it is clear that readLine must be called twice. Another problem domain for this kind of code is multithreading: The code expects the value not to change, but another thread might do excatly this.

Summary Keep your code as crisp and correct as possible. This means, that if you expect a value not to change and you need it multiple times, then use a variable to tell anybody this.

Gray
  • 115,027
  • 24
  • 293
  • 354
A.H.
  • 63,967
  • 15
  • 92
  • 126
  • I was not talking about logic @A.H.. I understand when I _need_ to use temporary variables. I was trying to figure out when it was recommended. – Gray Feb 26 '12 at 18:49
  • @Gray: I understood your point. My point is: Logic comes first. When logic dictates, that the same value is expected, then variables must be used. Therefore both of your examples are logically wrong or dubious. So: _not_ using variables is the real exception. – A.H. Feb 26 '12 at 18:55
  • My code is neither logically wrong nor dubious dude. I know that my `Map` either does not accept nulls or I never store them there. No one else seems to have needed that clarification. You answer missed the point of my question entirely. – Gray Feb 26 '12 at 19:00
2

Several people are quoting Knuth (either implicitly or explicitly), but I think this always makes sense as a design pattern (and cases where null is an allowable key, as pointed out by A.H. become that much more clear).

In my mind, it's similar to the case of passing a variable by const reference in C++ instead of by value. There are two reasons to do this:

  1. it communicates that the value won't change
  2. but the main reason it's done is that it's marginally faster.

It doesn't matter on most individual calls, but when every routine passes every parameter by value, you notice it. It's a good habit to be in. The case you mention isn't as common and so it isn't as important, but I'd argue it's still a good habit.

Gray
  • 115,027
  • 24
  • 293
  • 354
Ben Hocking
  • 7,790
  • 5
  • 37
  • 52
1

Reminds me of a discussion regarding similar .NET Dictionary usage style: What is more efficient: Dictionary TryGetValue or ContainsKey+Item?

In general, as pointed out in comments, you don't know the performance hot spots in your program until you run it with real data under a profiler.

Also, in general, compilers in imperative languages can not optimize repeated calls, because of the possible side effects. Every call to a function can return a new value. And by the way, that is also one of the reason why you yourself shouldn't rely on repeated calls. Today maybe you know that a function or a property is side-effect free and repeated calls return the same value, but tomorrow you change it and add some random generator inside and all logic with repeated calls will be broken.

Community
  • 1
  • 1
Massimiliano
  • 16,770
  • 10
  • 69
  • 112