CUDA_OUT_OF_MEMORY in PyTorch head2head model

Question

I am running the head2head model from the GitHub repo here. When I run the code with the following command:

./scripts/train/train_on_target.sh Obama head2headDataset

where the train_on_target.sh file contains:

target_name=$1
dataset_name=$2

python train.py --checkpoints_dir checkpoints/$dataset_name \
                --target_name $target_name \
                --name head2head_$target_name \
                --dataroot datasets/$dataset_name/dataset \
                --serial_batches

I get the following error:

Traceback (most recent call last):
  File "train.py", line 108, in <module>
    flow_ref, conf_ref, t_scales, n_frames_D)
  File "/home/nitin/head2head/util/util.py", line 48, in get_skipped_flows
    flow_ref_skipped[s], conf_ref_skipped[s] = flowNet(real_B[s][:,1:], real_B[s][:,:-1])
  File "/home/nitin/anaconda3/envs/head2head/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/nitin/anaconda3/envs/head2head/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 150, in forward
    return self.module(*inputs[0], **kwargs[0])
  File "/home/nitin/anaconda3/envs/head2head/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/nitin/head2head/models/flownet.py", line 38, in forward
    flow, conf = self.compute_flow_and_conf(input_A, input_B)
  File "/home/nitin/head2head/models/flownet.py", line 55, in compute_flow_and_conf
    flow1 = self.flowNet(data1)
  File "/home/nitin/anaconda3/envs/head2head/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/nitin/head2head/models/flownet2_pytorch/models.py", line 156, in forward
    flownetfusion_flow = self.flownetfusion(concat3)
  File "/home/nitin/anaconda3/envs/head2head/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/nitin/head2head/models/flownet2_pytorch/networks/FlowNetFusion.py", line 62, in forward
    concat0 = torch.cat((out_conv0,out_deconv0,flow1_up),1)
RuntimeError: CUDA out of memory. Tried to allocate 82.00 MiB (GPU 0; 5.80 GiB total capacity; 4.77 GiB already allocated; 73.56 MiB free; 4.88 GiB reserved in total by PyTorch)

I have checked the batch size in the file options/base_options.py, and it is already set to 1. How can I solve the above exception? My system has a 6 GB NVIDIA GTX 1660 Super GPU.

score 1 · Accepted Answer · answered Mar 05 '21 at 12:17

Data management:

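You can try training on a smaller subset of the dataset to check whether this is a hardware limitation. Moreover, if it is an image dataset, you can reduce the dimensions of the images, that is, downscale them to a lower pixel resolution (changing only the dpi metadata does not change the pixel count).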
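To get a rough feel for why smaller images help, here is a minimal sketch that estimates the size of a single float32 feature map at two resolutions. The tensor shape is a made-up example, not taken from the head2head code, and a real network holds many such tensors at once, so treat the numbers as illustrative only.

```python
def tensor_mib(batch, channels, height, width, bytes_per_elem=4):
    """Size in MiB of one dense tensor of shape (batch, channels, height, width).

    bytes_per_elem=4 assumes float32 storage.
    """
    return batch * channels * height * width * bytes_per_elem / 2**20


# Hypothetical feature-map shape, chosen only for illustration.
full_res = tensor_mib(1, 64, 256, 256)
half_res = tensor_mib(1, 64, 128, 128)

print(f"256x256: {full_res:.1f} MiB")  # 16.0 MiB
print(f"128x128: {half_res:.1f} MiB")  # 4.0 MiB
```

Halving both the height and the width cuts the size of each such tensor to a quarter, which is why downscaling the frames is often the quickest thing to try on a 6 GB card.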

Model parameters management:

Another approach is to reduce the number of parameters in your model. Start by reducing the size of the Dense (fully connected) layers, then adjust the other neural network hyperparameters.
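As a rough guide to how much a smaller layer saves, the sketch below counts the parameters of a fully connected layer and of a 2D convolution using the standard formulas. The layer sizes are hypothetical and are not the ones used in head2head.

```python
def dense_params(in_features, out_features, bias=True):
    """Parameter count of a fully connected layer: weight matrix plus optional bias."""
    return in_features * out_features + (out_features if bias else 0)


def conv2d_params(in_channels, out_channels, kernel_size, bias=True):
    """Parameter count of a 2D convolution with a square kernel."""
    weights = in_channels * out_channels * kernel_size * kernel_size
    return weights + (out_channels if bias else 0)


# Hypothetical layer sizes, chosen only for illustration.
big = dense_params(4096, 4096)
small = dense_params(4096, 1024)
conv = conv2d_params(64, 128, 3)

print(big)    # 16781312
print(small)  # 4195328
print(conv)   # 73856
```

Each float32 parameter takes 4 bytes for the weight alone; during training the gradients and any optimizer state are stored on top of that, so shrinking a large layer saves more than its raw weight size suggests.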
