When should I store the result of a function as a variable in python?

Question

Suppose a function my_list(obj) returns a list. I want to write a function that returns the single element of my_list(obj) if this list has length one and False otherwise. My code is

def my_test(obj):
    if len(my_list(obj)) == 1:
        return my_list(obj)[0]
    return False

It just dawned on me that the code

def my_test(obj):
    L = my_list(obj)
    if len(L) == 1:
        return L[0]
    return False

might be more efficient since it only calls my_list() once. Is this true?

The function my_list() could possibly be computationally intensive so I'm curious if there is a difference between these two blocks of code. I'd happily run a test myself but I'm not quite sure how to do so. I'm also curious if it is better practice in general to store the result of a function as a variable if the function is going to be called more than once.

That is correct. If `my_list()` takes a lot of time to execute it would be better to store the results. — Wolph, Aug 17 '16 at 19:53
Wolph's comment is correct. Also, for guidance on how to test it yourself see this question: http://stackoverflow.com/questions/2866380/how-can-i-time-a-code-segment-for-testing-performance-with-pythons-timeit — SnoringFrog, Aug 17 '16 at 19:56
And it also matters if my_list has some side effect or just isn't idempotent. For example if it's a generator which returns lists. Or if obj can change between invocations. — featuredpeow, Aug 17 '16 at 19:56
In almost all cases you don't need to worry about performance and should write whatever code is more readable, only going back to optimise when you need to. If `my_list` is an expensive call then it would be worth saving it in a variable. This is a handy decorator as well: https://docs.python.org/3/library/functools.html#functools.lru_cache — freebie, Aug 17 '16 at 20:01
@freebie This link is incredibly enlightening for me. I've been working on another piece of code where I was essentially trying to lru_cache myself (I'm a mathematician writing code for my research and know practically nothing about computer science). Thanks for the help! — Brian Fitzpatrick, Aug 17 '16 at 21:14
@BrianFitzpatrick Glad it helps; I almost didn't include it. I'm a computer scientist and know practically nothing about mathematics. Nice to meet you. — freebie, Aug 17 '16 at 21:27

score 3 · Accepted Answer · edited May 23 '17 at 12:10

You are correct. The second block would be more efficient since it only calls my_list() once. If my_list() isn't particularly computationally expensive, it's unlikely you will notice a difference at all. If you know it will be expensive, on the other hand, it is a good idea to save the result where you can if it does not hamper readability (however, note the caveat in @Checkmate's answer about memory for a possible exception).

However, if my_list() has side effects, or if its return value may change between those two invocations, you may not want to save it (it depends on whether you want to trigger the side effects twice or need to receive the changed return value).
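
To make the side-effect point concrete, here is a small sketch in which my_list is a hypothetical function that removes and returns the first pending list from obj. This is an assumption for illustration only, not the function from the question. With it, the two versions of my_test return different values.

```python
def my_list(obj):
    """Hypothetical function with a side effect: pops the first pending list."""
    return obj.pop(0) if obj else []


def my_test_twice(obj):
    # Calls my_list twice, so the side effect happens twice.
    if len(my_list(obj)) == 1:
        return my_list(obj)[0]
    return False


def my_test_once(obj):
    # Calls my_list once and reuses the stored result.
    L = my_list(obj)
    if len(L) == 1:
        return L[0]
    return False


twice = my_test_twice([["a"], ["b", "c"]])
once = my_test_once([["a"], ["b", "c"]])
print(twice, once)  # b a
```

Here the two-call version checks the length of one list and then returns an element of a different one, which is rarely what is intended.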

If you wanted to test this yourself, you could use time.time like this:

import time

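t0 = time.time()   # wall-clock time before the call
my_test(obj)       # my_test takes one argument
t1 = time.time()   # wall-clock time after the call

total = t1 - t0    # elapsed time in seconds

to get the time for my_test(obj). Just run both functions and compare their times.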

Don't use `time.time` to benchmark; it's not designed for that and might not be accurate on all platforms. Instead, the [`timeit`](https://docs.python.org/3.5/library/timeit.html) module was designed for the purpose. — RoadieRich, Aug 18 '16 at 15:01
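
A minimal sketch of the timeit approach, assuming a hypothetical my_list whose time.sleep call stands in for real work. The names my_test_twice and my_test_once are labels for the question's two versions, and the sleep duration and repeat count are arbitrary choices.

```python
import time
import timeit


def my_list(obj):
    """Hypothetical slow function: the sleep stands in for real work."""
    time.sleep(0.002)
    return [obj]


def my_test_twice(obj):
    # First version from the question: my_list is called twice.
    if len(my_list(obj)) == 1:
        return my_list(obj)[0]
    return False


def my_test_once(obj):
    # Second version from the question: my_list is called once.
    L = my_list(obj)
    if len(L) == 1:
        return L[0]
    return False


# timeit.timeit runs the callable `number` times and returns total seconds.
t_twice = timeit.timeit(lambda: my_test_twice(1), number=20)
t_once = timeit.timeit(lambda: my_test_once(1), number=20)
print(f"two calls: {t_twice:.4f}s, one call: {t_once:.4f}s")
```

With a function this slow, the two-call version should take roughly twice as long. For a cheap my_list the difference may disappear into measurement noise.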

score 2 · Answer 2 · answered Aug 17 '16 at 20:05

To answer your question on whether it is generally better to store the result of a function as a variable if it is going to be called more than once: it depends. In terms of readability, it's really up to you as a programmer. In terms of speed, storing the result in a variable is generally faster than running the function twice.

Storing the result can, however, use memory, and if you're storing an unusually large value, the memory usage can actually lead to a longer running time than simply calling the function again. Further, as noted above, running a function can do more than just produce a result to store, so running a function a different number of times can give a different result.

answered Aug 17 '16 at 20:05

Checkmate

That large variable is already *in* memory, though (because it's being returned from the function). Adding another reference to it will not substantially increase your memory footprint in a language like python, since you're just assigning the same in-memory object to another name. This isn't a situation where the data is copied. – Ian McLaird Aug 17 '16 at 21:27
@IanMcLaird Not quite. If you store the large value in a variable and then call the function a second time (as in the OP's example), you've used twice as much memory as if you had called the function twice (assuming that the function allocates a new chunk of memory, which will likely occur for cases where this is an issue). – Checkmate Aug 17 '16 at 21:46
The OP's first example doesn't store the value in a local. The OP's second example doesn't call the function twice. Yes, if you store the value and then recalculate it anyway, you have the worst of both worlds. Depending on the behavior of the garbage collection algorithm, calling the function twice may create twice the memory usage. Calling the function once and storing a reference to the returned value will never do that (in python). – Ian McLaird Aug 17 '16 at 21:52
@IanMcLaird Sorry, I was in a hurry and recalled the examples incorrectly. What I meant to say is that storing a variable (as in the OP's second example) can lead to high memory usage in the general case if the variable is kept in scope for a long time (since otherwise the result of calling the function will be freed immediately after it is used). – Checkmate Aug 17 '16 at 21:57
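
The memory discussion above can be sketched as follows. The my_list here is a hypothetical function that builds a list of a given size, which is an assumption for illustration. Binding the returned list to a second name does not copy it, and the list stays alive only as long as some name still refers to it.

```python
def my_list(n):
    """Hypothetical function that builds a list of n elements."""
    return ["x"] * n


big = my_list(1_000_000)
alias = big                # a second name for the same object; nothing is copied
same_object = alias is big
print(same_object)         # True


def my_test(n):
    L = my_list(n)
    result = L[0] if len(L) == 1 else False
    # Optional: drop the reference early if long-running work follows,
    # so that the list can be freed before the function returns.
    del L
    return result


print(my_test(1), my_test(1_000_000))  # x False
```

In CPython the list is released as soon as its last reference goes away, so the stored result only costs extra memory for as long as the variable stays in scope.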
