I am trying to train the U2PL method (https://github.com/Haochen-Wang409/U2PL) on my custom dataset, and I ran into an OOM error when training with train_sup.py at image size 320x320 and batch size 4. I am using two GPUs.
"Tried to allocate 8.38 GiB (11.91 GiB total capacity; 1.28 GiB already allocated; 8.38 GiB free; 2.74 GiB reserved in total by PyTorch)"
The weird thing is that with either a smaller or a larger batch size there is no OOM error, and with a larger or smaller image size there is no OOM error either; only this exact combination fails (see the probe below).
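To narrow this down, I wrote a rough standalone probe (my own script; `resnet101` with 21 classes is just a stand-in for the actual U2PL backbone and head) that sweeps the batch sizes and crop sizes I tried and records the peak CUDA memory of one forward/backward pass:

```python
import torch
import torchvision

# Rough probe, my own script: sweep (batch size, crop size) combinations
# and record peak CUDA memory for one forward/backward pass.
# resnet101 with 21 classes is only a stand-in for the real U2PL model.
model = torchvision.models.resnet101(num_classes=21).cuda()
criterion = torch.nn.CrossEntropyLoss()

for bs, size in [(2, 320), (4, 320), (8, 320), (4, 256), (4, 384)]:
    torch.cuda.empty_cache()
    torch.cuda.reset_peak_memory_stats()
    x = torch.randn(bs, 3, size, size, device="cuda")
    y = torch.randint(0, 21, (bs,), device="cuda")
    criterion(model(x), y).backward()
    model.zero_grad(set_to_none=True)
    peak_gib = torch.cuda.max_memory_allocated() / 2**30
    print(f"bs={bs} size={size}: peak {peak_gib:.2f} GiB")
```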
I am using the official U2PL code, and mixed precision is not used anywhere in it.
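If mixed precision turns out to be a viable workaround, this is roughly how I would bolt it onto a generic training step. It is a sketch of the standard `torch.cuda.amp` pattern under my own assumptions, not the actual U2PL loop:

```python
import torch
from torch.cuda.amp import autocast, GradScaler

# Sketch only: the standard AMP pattern, NOT the U2PL training loop.
scaler = GradScaler()

def amp_step(model, criterion, optimizer, images, labels):
    optimizer.zero_grad(set_to_none=True)
    with autocast():  # run the forward pass in mixed precision
        loss = criterion(model(images), labels)
    scaler.scale(loss).backward()  # scale loss to avoid fp16 underflow
    scaler.step(optimizer)
    scaler.update()
    return loss.detach()
```

I have not tried this yet, since I would rather understand why full precision OOMs only at this one configuration first.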
I have no idea what's happening here. Would really appreciate some help. Thank you!