I'm trying to run SegFormer MiT-B4 on Google Colab. The model is fairly large, yes, but after some training the first inference works fine. When I then run inference on another image, I get a CUDA out-of-memory error.
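For context, feature_extractor and model come from the training cells earlier in the notebook; the setup looks roughly like this (the checkpoint id here is just illustrative of MiT-B4 weights, since the fine-tuning details don't seem to matter for the error):

import torch
from transformers import SegformerFeatureExtractor, SegformerForSemanticSegmentation

device = torch.device("cuda")
# illustrative checkpoint id; in my notebook the model is fine-tuned before inference
feature_extractor = SegformerFeatureExtractor.from_pretrained(
    "nvidia/segformer-b4-finetuned-ade-512-512")
model = SegformerForSemanticSegmentation.from_pretrained(
    "nvidia/segformer-b4-finetuned-ade-512-512").to(device)

The inference cell is: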
from PIL import Image

image = Image.open('/content/images/image1.jpg')
encoding = feature_extractor(image, return_tensors="pt")
pixel_values = encoding.pixel_values.to(device)
# forward pass
outputs = model(pixel_values=pixel_values)
The only thing I change is "image1" to "image2". When I rerun the same code, the CUDA OOM error occurs at the outputs = model(pixel_values=pixel_values) step. If I restart the kernel and try "image2" first, it works, and then rerunning with "image1" hits the same error.
I tried different sets of images, and the problem is always the same: the first inference runs, the second one hits the error.
torch.cuda.empty_cache() doesn't help.
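For reference, this is roughly the cell I run between the two inferences (the memory printouts are only there so I can watch the allocator; memory_allocated and memory_reserved are the standard torch.cuda counters):

import torch

# watch the CUDA caching allocator between the two runs
print(f"allocated: {torch.cuda.memory_allocated() / 1024**2:.0f} MiB")
print(f"reserved:  {torch.cuda.memory_reserved() / 1024**2:.0f} MiB")
torch.cuda.empty_cache()

Even after this, the second forward pass still runs out of memory.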
Is there any step I'm missing?