My GPU kernel reads data from different input buffers. I want to check whether I manage to get cache hits for the reads from one of these buffers. Is it possible to limit the counting of cache hit/miss metrics to a particular range of memory addresses (of course, while running the kernel as-is with all of the other reads we don't care about)?

einpoklum
  • Does checking, e.g., the "L2 Theoretical Sectors Global Excessive" source counter work for this? I'm not sure about L1, because I think "L1 Wavefronts Shared Excessive" only covers the shared-memory part of L1. – paleonix Mar 01 '23 at 14:45
  • No, it's not possible. You can certainly get more authoritative responses by posting on the NVIDIA Nsight Compute forum. – Robert Crovella Mar 01 '23 at 17:52
  • @RobertCrovella: I'm actually always wondering whether to post there or here. I suppose - although I'm not even sure - that more NVIDIA employees see posts over there. On the other hand, the traffic on that forum is very low, and I'm not sure it's indexed very well by web searches so when there _is_ an answer available, I'd much rather it be available here. – einpoklum Mar 01 '23 at 22:43
  • @paleonix: Hmm, let me have another look at those. – einpoklum Mar 01 '23 at 22:44

0 Answers