I am trying to train a PyTorch model.
These are the batch size settings from my config file. I have tried reducing batch_size to 1, but I get the same error:
local batch_size = 3,
local num_batch_accumulated = 4,
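For context, my understanding is that the effective batch size here is batch_size * num_batch_accumulated = 3 * 4 = 12. Below is a minimal, self-contained sketch of how I picture the accumulation loop; it uses a dummy model and random data, not my real model or config.

```python
import torch
import torch.nn as nn

# Minimal sketch of gradient accumulation as I understand it
# (dummy model and random data; my real model/config are different).
device = "cuda"
model = nn.Linear(10, 2).to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()

batch_size = 3
num_batch_accumulated = 4   # effective batch size = 3 * 4 = 12

optimizer.zero_grad()
for step in range(num_batch_accumulated):
    inputs = torch.randn(batch_size, 10, device=device)
    targets = torch.randint(0, 2, (batch_size,), device=device)
    loss = loss_fn(model(inputs), targets) / num_batch_accumulated
    loss.backward()            # gradients accumulate across the small batches
optimizer.step()               # one optimizer step per 4 accumulated batches
```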
This is the output of nvidia-smi:
As we can see, only 451 MiB out of 6144 MiB is allocated.
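To cross-check what nvidia-smi shows, this is the kind of query I can run from PyTorch itself. Device index 0 is an assumption (single-GPU machine), and torch.cuda.mem_get_info needs a reasonably recent PyTorch.

```python
import torch

# Report what PyTorch sees on GPU 0 (single-GPU machine assumed).
print("CUDA available:", torch.cuda.is_available())
print("Device:", torch.cuda.get_device_name(0))

free, total = torch.cuda.mem_get_info(0)   # bytes, as reported by the driver
print(f"free:      {free / 1024**2:.0f} MiB")
print(f"total:     {total / 1024**2:.0f} MiB")
print(f"allocated: {torch.cuda.memory_allocated(0) / 1024**2:.0f} MiB")  # live tensors
print(f"reserved:  {torch.cuda.memory_reserved(0) / 1024**2:.0f} MiB")   # caching allocator
```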
I have tried following the different solutions mentioned in the Stack Overflow post here, but I have not been able to solve this. The things I tried were roughly along the lines of the sketch below.
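This is a rough, simplified sketch of the kinds of suggestions I tried, not my actual training script.

```python
import gc
import os

# Rough sketch of the kinds of suggestions I tried (not my actual script).
os.environ["CUDA_VISIBLE_DEVICES"] = "0"   # set before CUDA is initialized

import torch

gc.collect()                # drop Python references to stale tensors
torch.cuda.empty_cache()    # return cached blocks from PyTorch's allocator to the driver

# run validation/inference without building the autograd graph
with torch.no_grad():
    pass  # evaluation loop would go here
```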
How can I fix this strange error: "RuntimeError: CUDA error: out of memory"?