
I have a nested loop in one of my scripts that raises a MemoryError when it runs. It looks something like this:

jobRange = range(605)
a = []
for i in jobRange:
    for k in jobRange:
        for j in jobRange:
            if i != k and k != j:
                a.append((i, k, j))

I tried to optimize it by replacing the horrible nested loop with the use of permutations:

from itertools import permutations

a = []
for p in permutations(jobRange, 3):
    i, k, j = p[0], p[1], p[2]
    a.append((i, k, j))

That does not, however, solve the problem. I still get:

Traceback (most recent call last):
  File "C:/Users/Vejdanpa/PycharmProjects/myProject/Models/test.py", line 10, in <module>
    a.append((i, k, j))
MemoryError

I also tried the following (super-slow) code just to find out how much memory this piece of code is using by the time it hits the MemoryError:

from itertools import permutations
import tracemalloc

tracemalloc.start()
jobRange = range(605)

a = []
for p in permutations(jobRange, 3):
    i, k, j = p[0], p[1], p[2]
    a.append((i, k, j))
    current, peak = tracemalloc.get_traced_memory()
    print(f"Current memory usage: {current / 10 ** 6}MB; Peak: {peak / 10 ** 6}MB")

tracemalloc.stop()

The last couple of output lines before it throws the error were:

Current memory usage: 1022.68617MB; Peak: 1022.686298MB
Current memory usage: 1022.68621MB; Peak: 1022.686338MB
Current memory usage: 1022.68625MB; Peak: 1022.686378MB
Current memory usage: 1022.68629MB; Peak: 1022.686418MB
Current memory usage: 1022.68633MB; Peak: 1022.686458MB
Current memory usage: 1022.68637MB; Peak: 1022.686498MB
Current memory usage: 1022.68641MB; Peak: 1022.686538MB
Current memory usage: 1022.68645MB; Peak: 1022.686578MB
Current memory usage: 1022.68649MB; Peak: 1022.686618MB
Current memory usage: 1022.68653MB; Peak: 1022.686658MB
Traceback (most recent call last):
  File "C:/Users/Vejdanpa/PycharmProjects/myProject/Models/test.py", line 10, in <module>
    a.append((i, k, j))
MemoryError

Process finished with exit code 1

To my understanding this suggests that the threshold is somehow set to ~1 GB, and my program runs out of memory because it needs more than 1 GB. I checked my machine's specs: I have 16 GB of RAM, and as far as I know Python doesn't in any way limit memory use; a script should keep running until the machine itself runs out of memory.

I'm currently running the code in Python 3.7 in PyCharm, but a simple try on the command-line gave the same result.

Could someone please help me understand how/why there is a 1 GB limit on the memory that my Python script can consume, and possibly how to increase that limit?

2 Answers


I'd add one remark to @juanpa.arrivillaga's comment.
If you just want to get everything into the list, you can simply go with `list(permutations(jobRange, 3))`, but there's a reason why that may not be a good idea.
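
For illustration, a minimal sketch (reusing the question's jobRange) contrasting the eager one-liner with lazy iteration; the eager version will hit the same MemoryError for n = 605, while the lazy loop never holds more than one tuple at a time:

from itertools import permutations

jobRange = range(605)

# Materialises every 3-permutation up front -- same memory cost as the
# original append loop, so it would still raise MemoryError here:
# a = list(permutations(jobRange, 3))

# Consuming the iterator lazily keeps memory flat:
for i, k, j in permutations(jobRange, 3):
    pass  # process each (i, k, j) here instead of storing it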

Warning: this is a back-of-the-envelope calculation and it may be system dependent!

I looked up the itertools.permutations docs: the number of results returned by the function is n! / (n-r)!, in your case 605!/(605-3)!, which simplifies to 605 x 604 x 603.
I assume that each int is at least 28 bytes; I got this number by running sys.getsizeof(1) on my system -- reason: see here.

You're appending to a list which grows over the run time of the programme. So, the final size of the list, or rather, a lower bound on the size of the list itself (disregarding everything else) would be:

605 x 604 x 603 x 28 x 3 bytes ~= 18,509 MB ~= 18.5 GB

That would likely exceed your memory and that's just the size of the list.

This may not be a totally accurate lower bound; I find estimating memory usage in Python quite difficult.
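
A quick sketch that reproduces this back-of-the-envelope estimate (assuming 28 bytes per int, as above, and ignoring the tuples and the list's own overhead):

import sys
from math import factorial

n, r = 605, 3
count = factorial(n) // factorial(n - r)   # n! / (n-r)! = 605 * 604 * 603
per_int = sys.getsizeof(1)                 # typically 28 bytes on 64-bit CPython

estimate_bytes = count * per_int * r
print(f"{count} tuples, >= {estimate_bytes / 10**9:.1f} GB just for the integers")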

LemurPwned
  • that is a nice analysis, but OP does not seem to cross 1 GB – python_user Feb 08 '21 at 10:13
  • @LemurPwned you forgot to take into account the 3-tuple holding the integers, that's 64 bytes (~14GB). Also the integers cost a bit less than 28 bytes on average because CPython has a *small integer cache* for those below 256 (inclusive). That represents about 43% of the integers being used, so integers cost an average of 16 bytes (28 * .57) for this specific workload. – Masklinn Feb 08 '21 at 10:17
  • And the cost of the integers can easily be obviated by making `jobRange` a `list` instead, then the same integers will be reused every time instead of the `range` object recreating them on each iteration: only ~350 integers would need to be allocated for a basically irrelevant 10kB. – Masklinn Feb 08 '21 at 10:17 (a sketch of this follows the comments)
  • @Masklinn, that's a fair point! I totally missed this aspect – LemurPwned Feb 08 '21 at 10:20
  • Note, the list will use (at least) `len(mylist)*word_length` bytes to store the actual array of PyObject pointers. So, for a length 100 list on a 64-bit system, that's 100*8 bytes – juanpa.arrivillaga Feb 08 '21 at 10:27
  • @juanpa.arrivillaga yes, but here it's 10k for the memoisation of the range and low-hundred megs for the entire thing, which is inconsequential given the payload is in the tens of GB. Though you're right that it *can* be very relevant in other situations. – Masklinn Feb 08 '21 at 10:53
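
A minimal sketch of the change suggested in the comments, applied to the question's original nested loop (building jobRange as a list allocates each int once, so the loops reuse those objects instead of the range recreating the ints above 256 on every pass):

jobRange = list(range(605))  # the same ~605 int objects get reused everywhere

a = []
for i in jobRange:
    for k in jobRange:
        for j in jobRange:
            if i != k and k != j:
                # the tuples still dominate the memory cost, so this change
                # alone does not avoid the MemoryError
                a.append((i, k, j))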

If you run the program under the Fil memory profiler (https://pythonspeed.com/fil), it will use a heuristic to dump out a memory profile report right before you run out of memory, so you can see how much memory was used. See an example here: https://pythonspeed.com/articles/crash-out-of-memory/

Itamar Turner-Trauring