I'm looking to collect a trace of the events that take place at the device level on a GPU.
Background / Analogy on CPU:
On a CPU, when process A is running, it might be interrupted by another user-level process B, by system/kernel processes, or by various kinds of interrupts: hardware interrupts, network interrupts, hypervisor-related interrupts, and so on. To measure these, I would ideally have to patch the kernel so that the scheduler and the interrupt handlers record the start and end times of every process and interrupt, make those kernel data structures visible at user level, and then read them repeatedly from a user-level program.
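As a rough sketch of the user-level reading half of that idea (assuming a Linux machine with tracefs mounted at /sys/kernel/tracing and root privileges; the paths and event names here are the stock ftrace ones, nothing GPU-specific), this is the kind of timestamped scheduler/IRQ stream I have in mind:

```cpp
// Minimal sketch: turn on scheduler and IRQ trace events and stream them.
// Assumes tracefs is mounted at /sys/kernel/tracing (older kernels use
// /sys/kernel/debug/tracing) and that we run with root privileges.
#include <fstream>
#include <iostream>
#include <string>

static void enable_event(const std::string& path) {
    std::ofstream f(path);
    f << "1";  // writing "1" to the "enable" file turns the event source on
}

int main() {
    const std::string base = "/sys/kernel/tracing/";
    enable_event(base + "events/sched/sched_switch/enable");     // context switches
    enable_event(base + "events/irq/irq_handler_entry/enable");  // hardware IRQ entry
    enable_event(base + "events/irq/irq_handler_exit/enable");   // hardware IRQ exit

    std::ifstream pipe(base + "trace_pipe");  // blocking stream of trace records
    std::string line;
    while (std::getline(pipe, line)) {
        std::cout << line << "\n";  // each record carries a timestamp plus the event
    }
    return 0;
}
```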
I want to do something similar for the GPU: how do I capture the timestamps of the corresponding interrupts and background processes on the device? In the literature I have seen that nvidia-smi can be used to gather timestamps, but I'm very unclear on how to actually instrument the GPU to get what I need.
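To make the granularity I'm after concrete, the closest thing I currently know how to do is bracket my own kernels with CUDA events from the host (a minimal sketch below; myKernel is just a placeholder kernel). This measures my own work on the device, but says nothing about interrupts, preemptions, or other processes' activity in between:

```cpp
// Minimal sketch: host-side timestamping of a single kernel with CUDA events.
// "myKernel" is only a placeholder; this brackets my own work on the device
// and does not reveal anything else the device schedules in between.
#include <cstdio>
#include <cuda_runtime.h>

__global__ void myKernel() { /* placeholder work */ }

int main() {
    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    cudaEventRecord(start, 0);     // timestamp taken when the stream reaches this point
    myKernel<<<1, 1>>>();
    cudaEventRecord(stop, 0);
    cudaEventSynchronize(stop);    // wait until the stop event has actually occurred

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);  // elapsed device time between the two events
    printf("kernel took %.3f ms on the device\n", ms);

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    return 0;
}
```

Similarly, as far as I can tell, nvprof --print-gpu-trace ./app prints per-kernel start timestamps and durations, but again only for my own application's activity.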
Can anybody point me to references or explain how to instrument the GPU to get these timestamps? Or, specifically, can nvprof or cuda-memcheck be used for this purpose?