I am trying to run inference through a CLI to get predictions from a detection and recognition model. With CUDA 10.2 the inference completes in 15 minutes, but with CUDA 11.3 it takes 3 hours, and I want to reduce this time. Note: my hardware does not support CUDA 10.2.
Hence I have the following packages installed:
- cudatoolkit 11.3.1 h2bc3f7f_2
- pytorch 1.10.0 py3.7_cuda11.3_cudnn8.2.0_0 pytorch
- torchvision 0.11.0 py37_cu113 pytorch
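For reference, this is how I check that the install is what I think it is (the values in the comments are what I expect to see):

```python
import torch

print(torch.__version__)            # expect 1.10.0
print(torch.version.cuda)           # expect 11.3
print(torch.cuda.is_available())    # expect True
print(torch.cuda.get_device_name(0))
```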
I get this error while running the inference CLI:
RuntimeError: CUDA out of memory. Tried to allocate 2.05 GiB (GPU 0; 5.81 GiB total capacity; 2.36 GiB already allocated; 1.61 GiB free; 2.38 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
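Following the error's own hint about `max_split_size_mb`, my understanding is that the allocator option is set through an environment variable before CUDA is initialized. A minimal sketch of what I am considering (the value 128 is an arbitrary starting point I picked, not something from the error):

```python
import os

# Must be set before the first CUDA allocation, so set it before importing torch.
# max_split_size_mb:128 is an arbitrary starting value and likely needs tuning.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"

import torch  # imported only after the env var is in place
```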
Tried:
- Changing the batch_size for both detection and recognition (see the sketch after this list)
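To clarify what I mean by changing the batch size, here is a minimal sketch of the pattern I use; the `nn.Linear` model is a hypothetical stand-in for my actual detection/recognition models, and I run inference under `torch.no_grad()` so no autograd state is kept:

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for the real detection/recognition models.
model = nn.Linear(512, 10).cuda().eval()

batch_size = 4  # reduced from a larger value to lower peak GPU memory
inputs = torch.randn(batch_size, 512, device="cuda")

with torch.no_grad():  # inference only, so gradient buffers are not allocated
    outputs = model(inputs)
```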
Kindly help!
Thank you.