I'm trying to run SegFormer MiT-B4 on Google Colab. The model is fairly large, yes, but after some training the first inference works fine. When I then run inference on another image, I get a CUDA out-of-memory error.
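For context, feature_extractor and model come from the training cells earlier in the notebook; the setup looks roughly like this (the checkpoint id here is just illustrative of MiT-B4 weights, since the fine-tuning details don't seem to matter for the error):

import torch
from transformers import SegformerFeatureExtractor, SegformerForSemanticSegmentation

device = torch.device("cuda")
# illustrative checkpoint id; in my notebook the model is fine-tuned before inference
feature_extractor = SegformerFeatureExtractor.from_pretrained(
    "nvidia/segformer-b4-finetuned-ade-512-512")
model = SegformerForSemanticSegmentation.from_pretrained(
    "nvidia/segformer-b4-finetuned-ade-512-512").to(device)

The inference cell is: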
from PIL import Image

image = Image.open('/content/images/image1.jpg')
encoding = feature_extractor(image, return_tensors="pt")
pixel_values = encoding.pixel_values.to(device)
# forward pass
outputs = model(pixel_values=pixel_values)
The only thing I change is "image1" to "image2". When I rerun the same code, the CUDA OOM error occurs at the outputs = model(pixel_values=pixel_values) step. If I restart the kernel and try "image2" first, it works, and then rerunning with "image1" hits the same error.
I tried different sets of images, and the problem is always the same: the first inference runs, the second one hits the error.
torch.cuda.empty_cache() doesn't help.
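For reference, this is roughly the cell I run between the two inferences (the memory printouts are only there so I can watch the allocator; memory_allocated and memory_reserved are the standard torch.cuda counters):

import torch

# watch the CUDA caching allocator between the two runs
print(f"allocated: {torch.cuda.memory_allocated() / 1024**2:.0f} MiB")
print(f"reserved:  {torch.cuda.memory_reserved() / 1024**2:.0f} MiB")
torch.cuda.empty_cache()

Even after this, the second forward pass still runs out of memory.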
Is there any step I'm missing?