If I am launching multiple CUDA kernels in the same context, and there are dependencies between the kernels (the output of the first one is an input to the second one, etc.), does control go back to the host after each kernel finishes its execution? If not, can you please briefly describe how the "kernel enqueue" mechanism works on CUDA cards?
Yes, it does. Unless you call kernels asynchronously (with CUDA streams), it will launch the first kernel, wait until it is finished, and then launch the second, etc. I am not sure what you mean by "control goes back to host", since the host always has control (as far as I understand; I am not an expert). – Mikhail Genkin Feb 18 '15 at 00:44
1 Answer
http://on-demand.gputechconf.com/gtc-express/2011/presentations/StreamsAndConcurrencyWebinar.pdf
Look at slides 9 and 10.
With audio: https://developer.nvidia.com/gpu-computing-webinars
Look for "CUDA Concurrency & Streams".

Christian Sarofeen
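
To make the enqueue behaviour concrete, here is a minimal sketch (my illustration, not taken from the linked slides; the produce/consume kernels and sizes are made up). Both launches return to the host immediately, they are only enqueued; because they are issued into the same (default) stream, the GPU runs them in order, so the second kernel sees the output of the first without the host doing anything in between.

    #include <cstdio>
    #include <cuda_runtime.h>

    __global__ void produce(float *out, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) out[i] = static_cast<float>(i);      // first kernel writes data
    }

    __global__ void consume(const float *in, float *out, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) out[i] = in[i] * 2.0f;               // second kernel reads it
    }

    int main() {
        const int n = 1 << 20;
        float *a, *b;
        cudaMalloc(&a, n * sizeof(float));
        cudaMalloc(&b, n * sizeof(float));

        dim3 block(256), grid((n + 255) / 256);

        // Both launches return to the host immediately; they are only enqueued.
        // Because they go to the same (default) stream, the GPU guarantees that
        // consume does not start until produce has finished.
        produce<<<grid, block>>>(a, n);
        consume<<<grid, block>>>(a, b, n);

        // The host only blocks here, when it explicitly synchronizes.
        cudaDeviceSynchronize();

        float result;
        cudaMemcpy(&result, b + 10, sizeof(float), cudaMemcpyDeviceToHost);
        printf("b[10] = %f\n", result);   // expected: 20.0

        cudaFree(a);
        cudaFree(b);
        return 0;
    }

If the kernels were independent, you could instead create separate streams with cudaStreamCreate and pass them as the fourth launch parameter, which is what the linked webinar covers for overlapping work.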