I have one GPU. Assume an image `img` arrives from a camera in real time, and I have two trained PyTorch models performing different tasks, both taking the same `img` as input. Is there a way to ensure that GPU inference for model 1 and model 2 runs concurrently rather than sequentially?
Note: I am *not* asking about running one model across multiple GPUs in parallel.
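To make the setup concrete, here is a minimal sketch of one common approach: launching each model on its own CUDA stream. The two `nn.Conv2d` modules are hypothetical stand-ins for the trained models, and the input size is made up. Note that on a single GPU, streams only allow kernels to overlap when the hardware has spare resources; they do not guarantee true simultaneous execution. The sketch falls back to plain sequential CPU inference when no GPU is available.

```python
import torch
import torch.nn as nn

# Hypothetical stand-ins for the two trained models.
model1 = nn.Conv2d(3, 8, kernel_size=3, padding=1)
model2 = nn.Conv2d(3, 16, kernel_size=3, padding=1)

device = "cuda" if torch.cuda.is_available() else "cpu"
model1 = model1.to(device).eval()
model2 = model2.to(device).eval()

# Stand-in for a frame from the camera (assumed 224x224 RGB).
img = torch.randn(1, 3, 224, 224, device=device)

with torch.no_grad():
    if device == "cuda":
        # One stream per model; the GPU scheduler may overlap their
        # kernels if enough SMs and memory bandwidth are free.
        s1, s2 = torch.cuda.Stream(), torch.cuda.Stream()
        torch.cuda.synchronize()  # ensure img is ready before both streams read it
        with torch.cuda.stream(s1):
            out1 = model1(img)
        with torch.cuda.stream(s2):
            out2 = model2(img)
        torch.cuda.synchronize()  # wait for both streams to finish
    else:
        # CPU fallback: no streams, just run the models one after the other.
        out1 = model1(img)
        out2 = model2(img)

print(tuple(out1.shape), tuple(out2.shape))
```

Whether the kernels actually overlap depends on each model's kernel sizes and occupancy; small models are more likely to run concurrently than ones that already saturate the GPU.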