Currently I have a C++ source file, src.cpp:
#include <Library.h> // This is a GPU library based on CUDA
// All three functions invoke CUDA kernels,
// but I don't know the exact mapping between functions and CUDA kernels.
Library::func1();
Library::func2();
Library::func3();
Now I want to profile only the CUDA kernels invoked by func3() and ignore the kernels invoked by the other functions. How can I do that?
It should be noted that src.cpp is purely C++ and is compiled using gcc, not nvcc.
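To make the goal concrete, here is a minimal sketch of one possible approach: bracketing the func3() call with the CUDA runtime's cudaProfilerStart() and cudaProfilerStop(), which are plain C functions and so should be callable from gcc-compiled code. This is an assumption, not something confirmed for this library. The USE_CUDA_PROFILER macro, run_with_func3_profiled(), call_log() and the stand-in definitions are hypothetical names added only so the sketch builds without CUDA installed.

```cpp
#include <cassert>
#include <string>
#include <vector>

#ifdef USE_CUDA_PROFILER
// Real build: these CUDA toolkit headers are plain C/C++, no nvcc needed.
#include <cuda_profiler_api.h>  // cudaProfilerStart / cudaProfilerStop
#include <cuda_runtime_api.h>   // cudaDeviceSynchronize
#include <Library.h>            // the GPU library from the question
#else
// Hypothetical stand-ins so the sketch compiles without CUDA or Library.h.
// They only record the order in which calls happen.
using cudaError_t = int;
constexpr cudaError_t cudaSuccess = 0;

inline std::vector<std::string>& call_log() {
    static std::vector<std::string> log;
    return log;
}
inline cudaError_t cudaProfilerStart() { call_log().push_back("profiler_start"); return cudaSuccess; }
inline cudaError_t cudaProfilerStop() { call_log().push_back("profiler_stop"); return cudaSuccess; }
inline cudaError_t cudaDeviceSynchronize() { call_log().push_back("sync"); return cudaSuccess; }

namespace Library {
inline void func1() { call_log().push_back("func1"); }
inline void func2() { call_log().push_back("func2"); }
inline void func3() { call_log().push_back("func3"); }
}  // namespace Library
#endif

// Runs all three functions, but enables profiling only around func3().
void run_with_func3_profiled() {
    Library::func1();
    Library::func2();

    // Kernel launches are asynchronous, so wait for the kernels from
    // func1/func2 to finish before the profiled range begins.
    cudaError_t rc = cudaDeviceSynchronize();
    assert(rc == cudaSuccess);

    rc = cudaProfilerStart();
    assert(rc == cudaSuccess);

    Library::func3();

    // Wait for func3's kernels to complete before closing the range.
    rc = cudaDeviceSynchronize();
    assert(rc == cudaSuccess);

    rc = cudaProfilerStop();
    assert(rc == cudaSuccess);
    (void)rc;  // avoid an unused-variable warning when NDEBUG is defined
}
```

For a real build, the file would be compiled with -DUSE_CUDA_PROFILER, called from main(), and linked against the CUDA runtime (-lcudart) in addition to the library itself. The profiler also has to be told to honor the start/stop calls; as far as I know this is `nvprof --profile-from-start off ./app` or `nsys profile --capture-range=cudaProfilerApi ./app`, but that should be checked against the documentation of the profiler version in use.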