9

I understand that Fermi GPUs support prefetching to the L1 or L2 cache. However, I cannot find anything about it in the CUDA reference manual.

Does CUDA allow my kernel code to prefetch specific data to a specific level of cache?

einpoklum
dalibocai

1 Answer

6

Well, not at the instruction level, but detailed information about prefetching on GPUs is available here:

Many-Thread Aware Prefetching Mechanisms for GPGPU Applications
(paper from the ACM Symposium on Microarchitecture, 2010)

You can find the instruction reference in NVIDIA's PTX ISA reference document; the relevant instructions are prefetch and prefetchu.
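As a rough illustration of how those PTX instructions might be reached from CUDA C++, here is a minimal sketch using inline PTX. The helper names prefetch_l1 and prefetch_l2, the kernel scaled_copy, and the host function run_demo are hypothetical names made up for this example; they are not part of any CUDA API. The sketch assumes a 64-bit build for compute capability 2.0 or later, and it uses the form of prefetch without a state space, which operates on generic addresses. A prefetch is only a hint, so whether it helps at all depends on the hardware and the access pattern.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical helpers: issue a PTX prefetch hint for the cache line holding p.
// With no state space given, prefetch uses generic addressing.
__device__ __forceinline__ void prefetch_l1(const void* p) {
    asm volatile("prefetch.L1 [%0];" :: "l"(p));
}

__device__ __forceinline__ void prefetch_l2(const void* p) {
    asm volatile("prefetch.L2 [%0];" :: "l"(p));
}

// Grid-stride loop that hints at the element this thread will read next.
__global__ void scaled_copy(const float* in, float* out, int n) {
    int stride = gridDim.x * blockDim.x;
    for (int i = blockIdx.x * blockDim.x + threadIdx.x; i < n; i += stride) {
        if (i + stride < n) {
            prefetch_l2(in + i + stride);  // hint only; never past the end
        }
        out[i] = 2.0f * in[i];
    }
}

// Runs the kernel and returns the number of mismatching elements,
// or -1 if a CUDA call failed (for example, no device available).
int run_demo() {
    const int n = 1 << 16;
    std::vector<float> h_in(n), h_out(n, 0.0f);
    for (int i = 0; i < n; ++i) h_in[i] = static_cast<float>(i);

    float *d_in = nullptr, *d_out = nullptr;
    if (cudaMalloc(&d_in, n * sizeof(float)) != cudaSuccess) return -1;
    if (cudaMalloc(&d_out, n * sizeof(float)) != cudaSuccess) {
        cudaFree(d_in);
        return -1;
    }
    cudaMemcpy(d_in, h_in.data(), n * sizeof(float), cudaMemcpyHostToDevice);

    scaled_copy<<<32, 128>>>(d_in, d_out, n);
    cudaError_t err = cudaDeviceSynchronize();

    cudaMemcpy(h_out.data(), d_out, n * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_out);
    if (err != cudaSuccess) return -1;

    int mismatches = 0;
    for (int i = 0; i < n; ++i) {
        if (h_out[i] != 2.0f * h_in[i]) ++mismatches;
    }
    return mismatches;
}

int main() {
    std::printf("mismatches: %d\n", run_demo());
    return 0;
}
```

The prefetch does not change the result of the kernel; it only asks the hardware to start fetching the line early. Comparing timings with and without the hint on the target GPU is the only reliable way to tell whether it is worth keeping.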

einpoklum
kerem
  • I appreciate the information. It is a pity that CUDA does not provide prefetching instructions. – dalibocai Feb 14 '11 at 02:05
  • Updated the links, but is that paper still relevant these days (i.e., for the Maxwell and Pascal microarchitectures)? – einpoklum Mar 19 '17 at 22:26