I am using YOLOv3 by Ultralytics (PyTorch) to detect the behaviour of cows in a video. The YOLOv3 model was trained to detect each individual cow in the frame. Each detected cow is cropped from the frame using the X and Y coordinates of its bounding box, and the crop is then passed to a second model that determines whether the cow is sitting or standing. The second model, a very simple InceptionV3 classifier built with TensorFlow, was also trained on our own dataset.
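For context, the two-stage pipeline looks roughly like the sketch below. This is a simplified illustration, not the actual code in detectv2.py; the model path, classify_crops, and CLASS_NAMES are placeholders for my own code.

import cv2
import numpy as np
import tensorflow as tf

# Hypothetical path to my trained sitting/standing classifier
behaviour_model = tf.keras.models.load_model("inceptionv3_behaviour.h5")
CLASS_NAMES = ["sitting", "standing"]

def classify_crops(frame, detections):
    """detections: iterable of (x1, y1, x2, y2) boxes from YOLOv3."""
    labels = []
    for x1, y1, x2, y2 in detections:
        crop = frame[int(y1):int(y2), int(x1):int(x2)]          # crop the cow
        crop = cv2.resize(crop, (299, 299)).astype(np.float32)  # InceptionV3 input size
        crop = tf.keras.applications.inception_v3.preprocess_input(crop)
        probs = behaviour_model.predict(crop[None, ...], verbose=0)
        labels.append(CLASS_NAMES[int(np.argmax(probs))])
    return labels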
However, whenever I try to load both models, I get the following error:
RuntimeError: CUDA out of memory. Tried to allocate 50.00 MiB (GPU 0; 16.00 GiB total capacity; 427.42 MiB already allocated; 7.50 MiB free; 448.00 MiB reserved in total by PyTorch)
If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
If the second model is not loaded, YOLOv3 (PyTorch) runs without any issue and does not even use the full 16 GB of VRAM. Is YOLOv3 reserving the whole VRAM and leaving nothing for the TensorFlow-based InceptionV3? If so, is there any way of forcing torch to keep 2 GB of VRAM aside?
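One thing I was considering is capping the GPU memory on the TensorFlow side before any GPU op runs, along the lines of the sketch below, but I am not sure whether this is the right approach or whether the limit should instead be set on the PyTorch side (e.g. with torch.cuda.set_per_process_memory_fraction). This is untested for my setup.

import tensorflow as tf

gpus = tf.config.list_physical_devices("GPU")
if gpus:
    # Option A: let TF grow its allocation on demand instead of pre-allocating
    tf.config.experimental.set_memory_growth(gpus[0], True)
    # Option B: hard-cap TF to ~2 GB (must be set before any GPU op runs)
    # tf.config.set_logical_device_configuration(
    #     gpus[0], [tf.config.LogicalDeviceConfiguration(memory_limit=2048)])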
Full console output:
>> python detectv2.py --weights best.pt --source outch06_20181022073801_0_10.avi
2022-06-01 16:02:40.975544: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-06-01 16:02:41.342394: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1532] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 14123 MB memory: -> device: 0, name: Quadro RTX 5000, pci bus id: 0000:65:00.0, compute capability: 7.5
detectv2: weights=['best.pt'], source=outch06_20181022073801_0_10.avi, imgsz=[640, 640], conf_thres=0.25, iou_thres=0.45, max_det=1000, device=, view_img=False, save_txt=False, save_conf=False, save_crop=False, nosave=False, classes=None, agnostic_nms=False, augment=False, visualize=False, update=False, project=runs\detect, name=exp, exist_ok=False, line_thickness=3, hide_labels=False, hide_conf=False, half=False, dnn=False
Empty DataFrame
Columns: []
Index: []
YOLOv3 2022-5-16 torch 1.11.0 CUDA:0 (Quadro RTX 5000, 16384MiB)
Fusing layers...
Model Summary: 269 layers, 62546518 parameters, 0 gradients
Traceback (most recent call last):
File "detectv2.py", line 462, in <module>
main(opt)
File "detectv2.py", line 457, in main
run(**vars(opt))
File "C:\Users\sourav\Anaconda3\envs\yl37\lib\site-packages\torch\autograd\grad_mode.py", line 27, in decorate_context
return func(*args, **kwargs)
File "detectv2.py", line 221, in run
model(torch.zeros(1, 3, *imgsz).to(device).type_as(next(model.model.parameters()))) # warmup
File "C:\Users\sourav\Anaconda3\envs\yl37\lib\site-packages\torch\nn\modules\module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "C:\Users\sourav\yolov3-master\models\common.py", line 357, in forward
y = self.model(im) if self.jit else self.model(im, augment=augment, visualize=visualize)
File "C:\Users\sourav\Anaconda3\envs\yl37\lib\site-packages\torch\nn\modules\module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "C:\Users\sourav\yolov3-master\models\yolo.py", line 127, in forward
return self._forward_once(x, profile, visualize) # single-scale inference, train
File "C:\Users\sourav\yolov3-master\models\yolo.py", line 150, in _forward_once
x = m(x) # run
File "C:\Users\sourav\Anaconda3\envs\yl37\lib\site-packages\torch\nn\modules\module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "C:\Users\sourav\yolov3-master\models\common.py", line 48, in forward_fuse
return self.act(self.conv(x))
File "C:\Users\sourav\Anaconda3\envs\yl37\lib\site-packages\torch\nn\modules\module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "C:\Users\sourav\Anaconda3\envs\yl37\lib\site-packages\torch\nn\modules\conv.py", line 447, in forward
return self._conv_forward(input, self.weight, self.bias)
File "C:\Users\sourav\Anaconda3\envs\yl37\lib\site-packages\torch\nn\modules\conv.py", line 444, in _conv_forward
self.padding, self.dilation, self.groups)
RuntimeError: CUDA out of memory. Tried to allocate 50.00 MiB (GPU 0; 16.00 GiB total capacity; 427.42 MiB already allocated; 7.50 MiB free; 448.00 MiB reserved in total by PyTorch)
If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF