2

I have a small Python script that calls min(list) multiple times on the same unchanged list. This got me wondering: if I use the result of functions like min(), len(), etc. multiple times on an unchanged list, is it better to store those results in variables? Does it affect memory usage or performance at all?

If I had to guess, I'd say that if a function like min() gets called many times, it would be better for performance to store the result in a variable. But I am not sure of this, since I don't really know how Python gets the value, or whether Python automatically stores this value somewhere as long as the list isn't changed.

Timur Shtatland
ivan
  • How much are you calling it? – CATboardBETA Jan 27 '21 at 02:00
  • Even if I call it just twice there should be a difference between the two cases (or at least I think so), so that doesn't really matter. Let's say I call it one thousand times, as an example where any difference would have a large effect. – ivan Jan 27 '21 at 02:01
  • Typically it is best to store it in a variable. You have to benchmark it to verify how big the performance savings are, though. In my experience, for short lists that are not inside loops it does not matter, compared to other operations, such as I/O. – Timur Shtatland Jan 27 '21 at 02:01
  • @TimurShtatland I see, so does this mean that Python actually recalculates the value every time a function like min/len gets called? – ivan Jan 27 '21 at 02:03
  • It actually makes no sense that Python would store the return value. It does something similar for regexes, but there it's immutable strings, not some mutable thing. It'd have to keep track of all the items of the list to store any return value. – user Jan 27 '21 at 02:03
  • @user I know that wouldn't make much sense. However, since I don't know with certainty whether that's the case, I'd rather not guess. – ivan Jan 27 '21 at 02:05
  • Lists are mutable, and `min` is a regular name, so there's no guarantee that either the function or the list is the same each time you call `min(list)`. The compiler doesn't even try to keep track of whether it would be correct to reuse the last return value. – chepner Jan 27 '21 at 02:30
  • This is a good time to remind folks that [premature optimization is the root of all evil](https://en.wikiquote.org/wiki/Donald_Knuth). It might have been written with caching the results of calling a **cheap** function on a mutable object in mind. Basically what @chepner is saying. Get it working first, profile it next. If you **know** your min() is only firing after the list is somehow frozen, then *maybe* optimize with caching. Most answers advising otherwise are inviting bugs in a complex system. The key point is: does that min() show up in your profiling as a problem? – JL Peyret Jan 27 '21 at 03:50

5 Answers

3

Speed

It is almost always cheaper to store the result and re-use it rather than re-call the function multiple times.

Python does not cache (store and later remember) results from functions like min(), len(), etc.

Here is a quick speed test:

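import timeit

# Call min() twice on the same list
timeit.timeit("c = min(x) + min(x)", "x = [1, 2, 3]")
# 0.24990593400000005

# Call min() once and reuse the stored result
timeit.timeit("a = min(x); b = a + a", "x = [1, 2, 3]")
# 0.1296667110000005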

The second is almost twice as fast, because storing a variable is much cheaper than re-calling the min function.
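To see directly that min() walks the list again on every call, here is a minimal sketch. CountingList is a hypothetical helper written only for this illustration (it is not part of Python); it counts how many times the list is iterated.

```python
class CountingList(list):
    """Hypothetical list subclass that counts how often it is iterated."""

    def __init__(self, *args):
        super().__init__(*args)
        self.iterations = 0

    def __iter__(self):
        # min() obtains an iterator over its argument, so each call lands here.
        self.iterations += 1
        return super().__iter__()


data = CountingList([3, 1, 2])
first = min(data)
second = min(data)

# One full pass per call: nothing was remembered between the two calls.
print(first, second, data.iterations)
```

If results were cached somewhere, the counter would stay at 1 after the second call; under this sketch it is expected to reach 2.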

Memory use

If the result is a single number, as with min() or len(), then memory use is negligible.

If the result is something substantial (e.g. a large table of values), then you can remove it when you're done with it using del:

large_object = expensive_function()
do_something(large_object)
do_something_else(large_object)
del large_object

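Also, large objects are automatically removed from memory once nothing refers to them any more. In CPython this happens as soon as the last reference goes away (e.g. when a function returns and its local variables are discarded), and a separate garbage collector reclaims objects that are caught in reference cycles. For this reason, del is rarely necessary; it is mainly useful when you want to release a large object before its variable would otherwise fall out of scope.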
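As a rough illustration of how names keep an object alive without copying it, the sketch below uses a weak reference to observe when the object is actually freed. LargeObject is a hypothetical stand-in for the result of an expensive function, and the immediate release after the last del is CPython behaviour (reference counting), not something the language guarantees.

```python
import weakref


class LargeObject:
    """Hypothetical stand-in for a large, expensive result."""

    def __init__(self):
        self.values = list(range(100_000))


large_object = LargeObject()
alias = large_object                  # binds a second name; no copy is made
same_object = alias is large_object

tracker = weakref.ref(large_object)   # a weak reference does not keep it alive

del large_object                      # removes one name only
alive_after_first_del = tracker() is not None

del alias                             # last reference gone
alive_after_second_del = tracker() is not None  # assumption: CPython frees it right away

print(same_object, alive_after_first_del, alive_after_second_del)
```

The extra cost of storing a result is one more reference to an object that already exists; the memory is held only for as long as some name still refers to it.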

Pi Marillion
  • I want to upvote this, but your section on memory is misleading. Keeping a variable around *is a constant cost*, in CPython, the cost is that of a PyObject pointer, so a machine word, or 8 bytes usually (on a 64bit system). The memory use is pretty much always negligible, because assigning the result to a variable does not create a copy. – juanpa.arrivillaga Jan 27 '21 at 02:38
  • @juanpa.arrivillaga Assigning the value to a variable does not use much additional memory at the point of assignment, but it then keeps the object in memory (prevents it from being garbage collected) until the variable falls out of scope (usually at the end of a function call or script execution). The possible substantial memory use is in the time between assigning to a variable and when that variable falls out of scope or is `del`-eted. – Pi Marillion Jan 27 '21 at 02:42
  • I know, I'm sure *you* understand, but I worry people who read it may get the wrong impression, and I just speak from experience with the various misunderstandings about python memory management that abound – juanpa.arrivillaga Jan 27 '21 at 02:47
  • This is exactly what I was looking for, thanks. I know that in most cases changing such a thing won't make any difference, but I was still curious about what impact it would actually have. – ivan Jan 27 '21 at 07:02
3

min() is very fast compared to many other operations, such as I/O, so the efficiency gains may be small for short lists and only a few repeated calls. However, if you cache the result of min(), you can realize some time savings. See the code below for examples of the time you can actually save. As you can see, you need many iterations of the loops that contain min() calls to get any substantial time savings.

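import timeit

lst = range(2)  # rerun with range(2000) for the second pair of timings

def test_min():
    # Recompute min(lst) on each of the 10 iterations
    x = [min(lst) for _ in range(10)]

def test_cached_min():
    # Compute min(lst) once and reuse the stored value
    min_lst = min(lst)
    x = [min_lst for _ in range(10)]

print(timeit.timeit("test_min()", globals=globals(), number=1000))
print(timeit.timeit("test_cached_min()", globals=globals(), number=1000))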

# lst = range(2):
# 0.0027015960000000006
# 0.0010772920000000005

# lst = range(2000):
# 0.5262554810000001
# 0.05257684900000004
Timur Shtatland
2

Functions like min or max have to traverse the whole list each time they are called (giving them a complexity of O(n)). So yes, especially if your list is large, it's a better idea to store the result in a variable rather than performing the calculation again.
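The traversal can be sketched with a hand-written equivalent. linear_min is a hypothetical helper for illustration only, not CPython's actual implementation; it also returns the number of comparisons so the O(n) cost is visible.

```python
def linear_min(values):
    """Illustrative re-implementation of min(); returns (minimum, comparisons)."""
    iterator = iter(values)
    try:
        smallest = next(iterator)
    except StopIteration:
        raise ValueError("linear_min() arg is an empty sequence") from None

    comparisons = 0
    for value in iterator:
        # Every remaining element must be inspected once.
        comparisons += 1
        if value < smallest:
            smallest = value
    return smallest, comparisons


result, comparisons = linear_min([5, 3, 8, 1])
print(result, comparisons)
```

For a list of n elements this makes n - 1 comparisons on every call, which is the work that storing the result avoids repeating.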

More details about performance in another question

Fabio Lopez
1

Time Complexity of Python List Operations

[Image: table "Complexity of List Operations"]

Source

The table shows that:

  • function len (to get the length) has complexity O(1) (very fast, because a list stores its own length, so nothing is recomputed)
  • function min (to get the minimum) is O(n) (the cost grows with the size of the list, because it is computed on every call).

This means that:

  • len does not need to be stored for reuse
  • min should be stored for reuse (especially for large lists)
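A small sketch of how this plays out in practice, using made-up sample values. The stored minimum is reused inside the comprehension, which avoids one O(n) scan per element; the last lines show the caveat that a stored value is only a snapshot and goes stale if the list changes.

```python
values = [7, 3, 9, 4]

# min() is O(n): compute it once and reuse the stored result.
lowest = min(values)
shifted = [v - lowest for v in values]

# len() is O(1), so calling it repeatedly is cheap and needs no variable.
count_twice = len(values) + len(values)

# Caveat: the stored minimum does not follow later changes to the list.
values.append(1)
is_stale = lowest != min(values)

print(shifted, count_twice, is_stale)
```

Writing `[v - min(values) for v in values]` instead would rescan the list for every element, turning a linear pass into quadratic work.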
DarrylG
0

If you are only using it 1-5 times, it doesn't really matter. But if you are going to call it more often than that (and really even for fewer calls), it is best to just save it in a variable. Doing so takes next to no memory, and storing and reading the variable takes very little time.

CATboardBETA