
I'm using CuPy in a function that receives a NumPy array, moves it to the GPU, does some operations on it, and returns a cp.asnumpy copy of the result.

The problem: the memory is not freed after the function returns (as seen in nvidia-smi).

I know about the caching and re-use of memory done by CuPy. However, the memory pool seems to work only per process: when multiple users are computing on the same GPU server, each is limited by the memory cached by the other users' processes.

I also tried calling cp._default_memory_pool.free_all_blocks() at the end of the function. This seems to have no effect. Importing cupy in the main code and calling free_all_blocks "manually" there works, but I'd like to encapsulate the GPU stuff in the function, not visible to the user.

Can you fully release GPU memory used inside a function so that it's usable by other users?


Minimal example:

Main module:

# don't import cupy here, only numpy
import numpy as np

# module in which cupy is imported and used
from memory_test_module import test_function

# host array
arr = np.arange(1000000)

# out is also on host, gpu stuff happens in test_function
out = test_function(arr)

# GPU memory is not released here, unless manually:
import cupy as cp
cp._default_memory_pool.free_all_blocks()

Function module:

import cupy as cp

def test_function(arr):
    arr_gpu = cp.array(arr)
    arr_gpu += 1
    out_host = cp.asnumpy(arr_gpu)

    # this has no effect
    cp._default_memory_pool.free_all_blocks()

    return out_host
clemisch

1 Answer


CuPy uses Python's reference counter to track which arrays are in use. In this case, you should `del arr_gpu` before calling `free_all_blocks` in `test_function`; the pool can only release blocks that no live array references.

See here for more details: https://docs.cupy.dev/en/latest/user_guide/memory.html

kmaehashi
  • 879
  • 6
  • 10
  • This does indeed work. I didn't think of `del` because it does not free the memory in the main shell on its own normally. Thanks! – clemisch Nov 29 '18 at 13:32
  • What happens if you don't call del arr_gpu or arr_gpu=None? If you don't call that and instead just call free_all_blocks(), does the memory get freed or will you still have out-of-memory exceptions? The documentation doesn't seem to talk about this. – Goku Jan 29 '22 at 07:03
  • `free_all_blocks` frees all memory blocks for `ndarray`s that have been garbage-collected by Python. When there is a live reference to an ndarray, the corresponding memory block cannot be freed. – kmaehashi Jan 31 '22 at 10:26
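That last comment can be demonstrated directly (a sketch; `demo` is a hypothetical helper, and it returns None when CuPy is not installed so the snippet also runs on CPU-only machines):

```python
try:
    import cupy as cp
except ImportError:
    cp = None  # no GPU available; demo() will just return None

def demo():
    """Show that free_all_blocks() only releases unreferenced blocks."""
    if cp is None:
        return None
    pool = cp.get_default_memory_pool()
    a = cp.arange(1_000_000)   # allocate a block from the pool
    pool.free_all_blocks()     # no effect: `a` is still referenced
    assert pool.used_bytes() > 0
    del a                      # drop the last reference
    pool.free_all_blocks()     # now the block is returned to the driver
    return pool.used_bytes()   # 0 if nothing else is allocated
```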