
I have been profiling an application with nvprof and nvvp (5.5) in order to optimize it. However, I get completely different results for some metrics and events, such as inst_replay_overhead, ipc, and branch_efficiency, depending on whether I profile the debug (-G) or the release version of the code.

So my question is: which version should I profile, the release or the debug version? Or does the choice depend on what I'm looking for?

I found CUDA - Visual Profiler and Control Flow Divergence, where it is stated that a debug (-G) version is needed to properly measure the divergent branches metric, but I am not sure about other metrics.

ScHuMi
  • I don't see anything in the link you provided that says that -G is needed to properly measure the divergent branches metric. That specific profiler feature being referred to (back-referencing to source) can be accomplished with either a release or debug version, as spelled out in the answer provided there. – Robert Crovella Jan 13 '15 at 22:32
  • Robert Crovella, you are correct. The source in the link gives two options, and I did not mention that. Thank you. – ScHuMi Jan 14 '15 at 20:23

1 Answer


Profiling usually implies that you care about performance.

If you care about performance, you should profile the release version of a CUDA code.

The debug version (-G) will generate different code, which usually runs slower. For this reason, in my opinion, there's little point in doing performance analysis (including execution time measurement, benchmarking, profiling, etc.) on a debug version of a CUDA code.

The -G switch turns off most optimizations that the device code compiler might ordinarily make, which has a large effect on code generation and often a large effect on performance as well. Optimizations are disabled to make the code easier to debug, which is the primary purpose of the -G switch and of a debug version of your code.
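As a rough illustration of the two builds being compared, the sketch below only prints the nvcc command lines rather than running them. The file names (app.cu, app_release, app_debug) are hypothetical placeholders, and the -O3 and -g host flags are assumptions about a typical setup, not something stated above.

```shell
#!/bin/sh
# Hypothetical source file name; substitute your own.
SRC=app.cu

# Release build: no -G, so the device code compiler keeps its optimizations.
# -O3 here sets the host-code optimization level (an assumed, typical choice).
RELEASE_CMD="nvcc -O3 -o app_release $SRC"

# Debug build: -G generates device debug information and disables most
# device code optimizations; -g adds host debug information.
DEBUG_CMD="nvcc -g -G -o app_debug $SRC"

# Print the commands instead of executing them, so the sketch runs
# even on a machine without the CUDA toolkit installed.
echo "$RELEASE_CMD"
echo "$DEBUG_CMD"
```

Profiling app_release rather than app_debug is what the answer above recommends when performance is the concern.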

Robert Crovella
  • In general you want to do a full release build. If you want to use source-correlated experiments, add -lineinfo. If you need to look at the logical control flow of the application, -G can sometimes be more useful than -lineinfo. When using -G, avoid looking at any other metrics. – Greg Smith Jan 14 '15 at 14:12
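A minimal sketch of the release-plus-lineinfo workflow described in this comment might look like the following. It only prints the commands; the file names are hypothetical, the -O3 host flag is an assumption, and the metric list is simply the three metrics named in the question.

```shell
#!/bin/sh
# Hypothetical source and binary names; substitute your own.
SRC=app.cu
BIN=app_lineinfo

# Optimized build with line-number information for device code, so the
# profiler can correlate results with source lines without using -G.
LINEINFO_CMD="nvcc -O3 -lineinfo -o $BIN $SRC"

# Collect the metrics mentioned in the question on the optimized binary.
PROFILE_CMD="nvprof --metrics ipc,branch_efficiency,inst_replay_overhead ./$BIN"

# Print the commands instead of executing them.
echo "$LINEINFO_CMD"
echo "$PROFILE_CMD"
```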