
When loading a huge amount of data, I want to prevent a MemoryError from occurring in the first place, because even with try: ... except MemoryError: (and proper handling in the except block), the process sometimes crashes anyway!

Is there a way of finding the free memory available to my running process (so that I can stop the loading when, say, less than 200 MB of RAM is left for my process)?


Remark: I tried with:

import psutil
psutil.virtual_memory()

... but this doesn't give the amount of memory available for the current running process.

Example: Let's assume psutil.virtual_memory() tells me there are 5 GB available, and that my current process already uses 1.7 GB. The memory really available to my process is 300 MB (Python 2.7, 32-bit, on Windows: the 2 GB per-process limit minus the 1.7 GB used), not 5 GB...
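
Something like this rough sketch is what I'm hoping to achieve (the 2 GB cap and the names PROCESS_LIMIT, MIN_HEADROOM and can_keep_loading are just assumptions for illustration):

import os
import psutil

PROCESS_LIMIT = 2 * 1024 ** 3    # assumed cap: 2 GB per process (32-bit Python on Windows)
MIN_HEADROOM = 200 * 1024 ** 2   # stop loading below roughly 200 MB of headroom

def remaining_memory():
    # Rough headroom left for this process, in bytes.
    used = psutil.Process(os.getpid()).memory_info().rss  # get_memory_info() on very old psutil
    return PROCESS_LIMIT - used

def can_keep_loading():
    return remaining_memory() > MIN_HEADROOM

The loading loop would then check can_keep_loading() before reading each chunk.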

As explained in the answer here: http://stackoverflow.com/questions/3991257/memoryerror-hook-in-python, waiting for a `MemoryError` to occur is not a good idea, because by then I would already be very close to a critical memory error (causing a crash). – Basj Feb 07 '14 at 15:08

1 Answer


I once had to go through an object-oriented DB, deleting objects with redundant information. I was bundling the transactions, but the objects were arbitrarily large, and I frequently ran out of memory.

You can't really handle these exceptions because when you're out of memory, you can't recover.

Here's the pseudocode of my approach to encouraging garbage collection:

import gc

a = arbitrary_size_object_generator()  # placeholder: yields the (large) objects to work through

for obj in a:
    process(obj)   # placeholder for the real work on each object
    del obj        # drop the reference immediately...
    gc.collect()   # ...and force a collection after every single object

This was incredibly slow, and I still wasn't guaranteed not to run out of memory.

I also tried alternating garbage collection:

count = 0
for obj in a:
    count += 1
    process(obj)
    del obj
    if count % 10 == 0:  # only collect after every 10th object
        gc.collect()

This was much faster, but I got less done before running out of memory, and it ran out more frequently.

Nevertheless, I would still run out of memory until I stopped trying to bundle the transactions. With the job done, that's where I left memory management in Python, and I haven't had to deal with it much in this sort of context since.

In the only other context where this came up, I was careful to delete preliminary objects as I processed the data, but it was a different sort of problem. There, I got a warning as the process ran out of memory, stating that the search scope was too broad, but that relied on a special in-house memory reporter.

A bit of further digging based on the module you're using could also give you the info you're looking for:

>>> import os
>>> import psutil
>>> p = psutil.Process(os.getpid())
>>> p.get_memory_info()
meminfo(rss=140165120, vms=125804544)
>>> p.get_memory_info().rss / 1024         # resident set size in KB
136880
>>> p.get_memory_info().rss / 1024 / 1024  # resident set size in MB
133

You could use that information to pause your processing, delete objects, and call garbage collection once you cross a certain threshold, as sketched below. But Python is certainly not advertised as a language in which you want to worry about this sort of memory management.
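
A sketch of that idea, reusing the placeholder a and process() from the loops above, with a made-up RSS_THRESHOLD (memory_info() is the newer spelling of get_memory_info()):

import gc
import os
import psutil

RSS_THRESHOLD = 1.8 * 1024 ** 3    # illustrative threshold, comfortably under a 2 GB per-process cap

proc = psutil.Process(os.getpid())

for obj in a:                      # 'a' as in the earlier loops
    process(obj)
    del obj
    if proc.memory_info().rss > RSS_THRESHOLD:  # get_memory_info() on very old psutil
        gc.collect()               # only pay the collection cost when usage is high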
