According to "CUDA streams not overlapping", "the profiler will serialize streaming to get accurate timing data". Is there any way to avoid this serialization when profiling CUDA code (say, with nvvp)? I am using a Fermi M2090 and CUDA 4.0.

Hailiang Zhang
  • You could always check the NVIDIA site for the latest version of CUDA and its documentation, as well as the new features it provides. – kangshiyin Jan 23 '13 at 02:33

1 Answer

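Visual Profiler 5.0 (along with nvprof and CUPTI) and Nsight Visual Studio Edition 2.0 and later support concurrent kernel trace on Fermi and Kepler devices. These releases have been available for more than two years, so upgrading from CUDA 4.0 should let you profile without the streams being serialized.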
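
To check whether a given profiler version still serializes your streams, a small test program can help. The following is only a minimal sketch, not code from the question: the `spin` kernel, the stream count, and the cycle count are all hypothetical choices made for illustration. It launches one tiny, long-running kernel per stream so that any overlap is easy to see on a timeline.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical busy-wait kernel: spins for roughly `cycles` clock ticks so
// each launch is long enough to be clearly visible in a profiler timeline.
__global__ void spin(long long cycles)
{
    long long start = clock64();
    while (clock64() - start < cycles) { }
}

int main()
{
    const int kStreams = 4;  // assumed value, chosen for illustration
    cudaStream_t streams[kStreams];

    for (int i = 0; i < kStreams; ++i)
        cudaStreamCreate(&streams[i]);

    // A single thread per launch leaves plenty of free resources, so the
    // device is able to run the kernels side by side if nothing serializes them.
    for (int i = 0; i < kStreams; ++i)
        spin<<<1, 1, 0, streams[i]>>>(100000000LL);

    // Wait for all streams and report any launch or execution error.
    cudaError_t err = cudaDeviceSynchronize();
    printf("status: %s\n", cudaGetErrorString(err));

    for (int i = 0; i < kStreams; ++i)
        cudaStreamDestroy(streams[i]);

    return err == cudaSuccess ? 0 : 1;
}
```

Assuming a toolkit that ships nvprof, building this with nvcc and running it under `nvprof --print-gpu-trace` lists a start time and duration for each kernel. If the start times are nearly identical, the kernels overlapped; if each one starts only after the previous one finishes, the profiler (or something else) is still serializing them.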

Greg Smith
Eugene