
I have tried almost every option to train the model, including reducing the batch size to 1 and the other steps described in How do I select which GPU to run a job on?, but I still get the error:

RuntimeError: CUDA out of memory. Tried to allocate 238.00 MiB (GPU 3; 15.90 GiB total capacity; 15.20 GiB already allocated; 1.88 MiB free; 9.25 MiB cached)

This is the notebook, configured in an Azure ML workspace with N24-GPU.
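For context, the GPU-selection step from the linked answer is usually done by restricting which devices CUDA can see. This is a hypothetical sketch (not from my notebook); the device index "3" simply matches the GPU reported in the error message, and the `cfg.SOLVER.IMS_PER_BATCH` line shows where detectron2's batch size is set:

```python
import os

# Restrict CUDA to a single device *before* torch/detectron2 create a CUDA
# context, so memory on the other GPUs is never touched.
os.environ["CUDA_VISIBLE_DEVICES"] = "3"

# In the detectron2 config, the batch-size reduction mentioned above is:
# cfg.SOLVER.IMS_PER_BATCH = 1
```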

Thank you.

jaiswati_b

1 Answer


Check your memory usage before you start training; sometimes detectron2 doesn't free VRAM after use, particularly if a training run crashes. If that is what happened, the easiest short-term fix is a reboot.

As for a long-term fix, I can't give any advice other than making sure you are using the latest version of everything.

Fred