
I'm using Google Colab's free GPUs for experimentation and wanted to know how much GPU memory is available to play around with. torch.cuda.memory_allocated() returns the GPU memory currently occupied, but how do we determine the total available memory using PyTorch?

Hari Prasad

3 Answers


PyTorch can give you the total, reserved, and allocated memory info (all in bytes):

import torch

t = torch.cuda.get_device_properties(0).total_memory  # total memory of device 0
r = torch.cuda.memory_reserved(0)   # memory reserved by PyTorch's caching allocator
a = torch.cuda.memory_allocated(0)  # memory occupied by tensors
f = r - a                           # free inside reserved
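These counters are in bytes; if you want something human-readable, here is a small sketch building on the variables above (GiB is just an illustrative choice):

print(f'total:     {t / 1024**3:.2f} GiB')
print(f'reserved:  {r / 1024**3:.2f} GiB')
print(f'allocated: {a / 1024**3:.2f} GiB')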

Python bindings to NVML (the pynvml package) can give you the info for the whole GPU (0 here means the first GPU device):

from pynvml import *
nvmlInit()
h = nvmlDeviceGetHandleByIndex(0)
info = nvmlDeviceGetMemoryInfo(h)
print(f'total    : {info.total}')
print(f'free     : {info.free}')
print(f'used     : {info.used}')

pip install pynvml

You can also check memory info with nvidia-smi. nvtop works as well, but at the time of writing it had to be built from source. Another tool for checking memory is gpustat (pip3 install gpustat).
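If you want the nvidia-smi numbers from Python, one option is to shell out to it. A minimal sketch (the query flags are standard nvidia-smi options; the parsing is illustrative):

import subprocess

out = subprocess.check_output(
    ['nvidia-smi', '--query-gpu=memory.total,memory.used,memory.free',
     '--format=csv,noheader,nounits'], encoding='utf-8')
for i, line in enumerate(out.strip().splitlines()):
    total, used, free = map(int, line.split(', '))
    print(f'GPU {i}: total={total} MiB, used={used} MiB, free={free} MiB')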

If you would like to use C++ and CUDA:

#include <iostream>
#include "cuda.h"
#include "cuda_runtime_api.h"

using namespace std;

int main( void ) {
    int num_gpus;
    size_t free, total;
    cudaGetDeviceCount( &num_gpus );
    for ( int gpu_id = 0; gpu_id < num_gpus; gpu_id++ ) {
        cudaSetDevice( gpu_id );          // switch to this GPU
        int id;
        cudaGetDevice( &id );             // confirm the active device
        cudaMemGetInfo( &free, &total );  // free and total memory in bytes
        cout << "GPU " << id << " memory: free=" << free << ", total=" << total << endl;
    }
    return 0;
}
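To build and run it, something like the following should work (assuming the CUDA toolkit is installed; the file name is arbitrary):

nvcc mem_info.cu -o mem_info
./mem_info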
prosti
  • `torch.cuda.memory_cached` has been renamed to `torch.cuda.memory_reserved` – Kallzvx Jan 04 '21 at 14:56
  • Updated @Kallzvx. If something looks wrong let me know. – prosti Jan 04 '21 at 15:02
  • Note: total_memory + reserved/allocated does not work well when memory is allocated by other users/processes. – krassowski May 19 '22 at 22:36
  • use `import pynvml` instead of `from pynvml import *`, else this may cause conflict with other code. For example, modeling_roberta.py throws `TypeError: '_ctypes.UnionType' object is not subscriptable`. `pynvml.nvmlInit()`, `h = pynvml.nvmlDeviceGetHandleByIndex(0)`, `info = pynvml.nvmlDeviceGetMemoryInfo(h)` – user2585501 Feb 27 '23 at 12:50

In recent versions of PyTorch you can also use torch.cuda.mem_get_info:

https://pytorch.org/docs/stable/generated/torch.cuda.mem_get_info.html#torch.cuda.mem_get_info

torch.cuda.mem_get_info()

It returns a tuple where the first element is the free memory and the second is the total memory available on the device, both in bytes.
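For example, a minimal sketch (device 0 and the GiB formatting are illustrative choices):

import torch

free, total = torch.cuda.mem_get_info(0)
print(f'free: {free / 1024**3:.2f} GiB out of {total / 1024**3:.2f} GiB')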

Iman
  • This is better than the accepted answer (using `total_memory` + reserved/allocated) as it provides correct numbers when other processes/users share the GPU and take up memory. – krassowski May 19 '22 at 22:36
  • In older versions of PyTorch this is buggy: it ignores the device parameter and always returns info for the current device. The workaround is to use it inside a context manager: `with torch.cuda.device(device): info = torch.cuda.mem_get_info()`, see [https://github.com/pytorch/pytorch/issues/76224](https://github.com/pytorch/pytorch/issues/76224) – אלימלך שרייבר Dec 28 '22 at 15:50
  • Example usage please – Nathan B Apr 10 '23 at 07:58
  • @NathanB Added example usage. – Iman Apr 14 '23 at 22:42

This is useful for me!

import pynvml

def get_memory_free_MiB(gpu_index):
    # Query free memory for the given GPU via NVML, in MiB
    pynvml.nvmlInit()
    handle = pynvml.nvmlDeviceGetHandleByIndex(int(gpu_index))
    mem_info = pynvml.nvmlDeviceGetMemoryInfo(handle)
    return mem_info.free // 1024 ** 2
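For example (GPU index 0 is an illustrative choice):

print(get_memory_free_MiB(0))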
Peter Pack