I have some code which goes something like this:
1) Host: Launch graphics kernels
2) Host: Launch CUDA kernels (all async calls)
3) Host: Do a bunch of number crunching on the host
4) Back to step 1
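To make this concrete, here is a stripped-down sketch of the per-frame loop (`renderScene`, `myKernel`, and `crunchNumbersOnHost` are just placeholder names; the CUDA launches go to the default stream):

```cpp
#include <cuda_runtime.h>

// Placeholder kernel standing in for the CUDA work launched in step 2.
__global__ void myKernel(float* data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= 2.0f;
}

void renderScene();          // placeholder for the graphics work issued in step 1
void crunchNumbersOnHost();  // placeholder for the host-side number crunching in step 3

void frameLoop(float* d_data, int n, volatile bool& running) {
    while (running) {
        renderScene();                                  // 1) issue rendering work to the GPU
        myKernel<<<(n + 255) / 256, 256>>>(d_data, n);  // 2) async launch on the default stream
        crunchNumbersOnHost();                          // 3) CPU number crunching
    }                                                   // 4) back to step 1
}
```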
My question is this: the CUDA API guarantees that CUDA kernels, even when launched asynchronously, are executed in the order they were launched. Does this apply to the rendering as well? Let's say I have some rendering-related calculations in progress on the GPU. If I then launch async CUDA calls, will they only be executed once the rendering is complete, or will the two operations overlap?
Also, if I call cudaDeviceSynchronize() after step 2, it certainly forces the host to wait until the device completes the CUDA-related calls. What about rendering? Does it stall the host until the rendering-related operations are complete as well?
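Concretely, I mean inserting the synchronize right after the launches in step 2 of the sketch above:

```cpp
        renderScene();                                  // 1) rendering work in flight on the GPU
        myKernel<<<(n + 255) / 256, 256>>>(d_data, n);  // 2) async CUDA launch
        cudaDeviceSynchronize();                        // blocks the host until the CUDA work finishes;
                                                        //    does it also wait for the rendering above?
        crunchNumbersOnHost();                          // 3) host-side number crunching
```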