25

I'm asking this because I know there's a way to use binary files instead of source files.

Also, I'm guessing that with an assembly language, it would be easier to simulate function pointers, unless the assembly on a GPU is totally different from the one on a CPU.

widgg
  • 1,358
  • 2
  • 16
  • 35
  • what's wrong with C for CUDA? http://developer.download.nvidia.com/compute/cuda/2_1/toolkit/docs/NVIDIA_CUDA_Programming_Guide_2.1.pdf –  Sep 08 '11 at 19:05
  • 5
    Answered a few weeks ago [in an answer to "Is it possible to put instructions into CUDA code?"](http://stackoverflow.com/questions/3677220/is-it-possible-to-put-instructions-into-cuda-code/7072079#7072079). *Note:* not the accepted answer, but one that came later. – dmckee --- ex-moderator kitten Sep 08 '11 at 19:09
  • @dmckee +1. Also, since CUDA 3.2 (and 2.0 devices) function pointers are supported without meddling with PTX. And older devices simply have no such thing as a device function call - all calls to `__device__` functions from a kernel were inlined. – aland Sep 08 '11 at 19:21
  • Possible duplicate of [How to create or manipulate GPU assembler?](http://stackoverflow.com/questions/4660974/how-to-create-or-manipulate-gpu-assembler) – Ciro Santilli OurBigBook.com Mar 16 '16 at 20:14

3 Answers

35

You might want to take a look at PTX. NVIDIA provides a document describing it in the CUDA 4.0 documentation.

http://developer.nvidia.com/nvidia-gpu-computing-documentation

NVIDIA describes PTX as "a low-level parallel thread execution virtual machine and instruction set architecture (ISA). PTX exposes the GPU as a data-parallel computing device." It's not exactly like x86 assembly, but you might find it interesting to read.
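PTX can even be written inline from CUDA C with the `asm()` construct, much like x86 inline assembly. Here's a minimal sketch of that, with kernel and variable names made up for illustration (`add.s32` is the PTX 32-bit signed integer add):

```
__global__ void add_with_ptx(const int *a, const int *b, int *c, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        int result;
        // Inline a single PTX instruction; the "=r"/"r" constraints bind
        // 32-bit registers, much like GCC-style inline asm constraints.
        asm("add.s32 %0, %1, %2;" : "=r"(result) : "r"(a[i]), "r"(b[i]));
        c[i] = result;
    }
}
```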

Patrick87
  • 27,682
  • 3
  • 38
  • 73
  • 17
    It's worth pointing out that PTX is a virtual instruction set. Each distinct NVIDIA architecture has its own physical ISA which PTX targets. One can inline PTX instructions into CUDA code similarly to inline x86 asm. – Jared Hoberock Sep 08 '11 at 20:53
21

There are in fact two different CUDA assembly languages.

PTX is a machine-independent assembly language that is compiled down to SASS, the actual opcodes executed on a particular GPU family. If you build .cubins, you're dealing with SASS. Most CUDA runtime applications use PTX instead, since this lets them run on GPUs released after the application was built: the driver JIT-compiles the embedded PTX for the newer architecture.
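If you want to look at both levels yourself, the toolchain can emit each one. A quick sketch (the file and kernel names are hypothetical; the `nvcc`/`cuobjdump` flags are the standard ones):

```
// saxpy.cu -- trivial kernel, only here to illustrate the two assembly levels.
//
//   nvcc -ptx saxpy.cu -o saxpy.ptx        -> machine-independent PTX
//   nvcc -cubin -arch=sm_20 saxpy.cu       -> device binary (.cubin) for Fermi
//   cuobjdump --dump-sass saxpy.cubin      -> the actual SASS opcodes
__global__ void saxpy(float a, const float *x, float *y, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        y[i] = a * x[i] + y[i];
}
```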

Also, function pointers have been in CUDA for a while if you're targeting sm_20 (Fermi/GTX 400 series).
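As a rough illustration of those device-side function pointers (compile with `-arch=sm_20` or newer; the function and kernel names below are made up):

```
typedef float (*unary_op_t)(float);

__device__ float negate(float x) { return -x; }
__device__ float square(float x) { return x * x; }

__global__ void apply(const float *in, float *out, int n, int which)
{
    // Taking the address of a __device__ function inside device code is
    // supported on sm_20+; on older devices every such call was inlined.
    unary_op_t op = (which == 0) ? negate : square;

    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        out[i] = op(in[i]);
}
```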

ChrisV
  • 3,363
  • 16
  • 19
17

Yes, the assembly on a GPU is totally different from that of a CPU. One of the differences is that the instruction set for a GPU is not standardized. NVidia (and AMD and other GPU vendors) can and do change their instruction set from one GPU model to the next.

So CUDA does not expose a hardware assembly language; there'd be no point. (And the limitations in CUDA's C dialect, and in whatever other languages they support, exist because of limitations in the GPU hardware, not just because Nvidia hates you and wants to annoy you. So even if you had direct access to the underlying instruction set and assembly language, you wouldn't magically be able to do things you can't do now.)

(Note that NVidia does define a "virtual" instruction set, PTX, that you can use and embed in your code. But it's not the hardware instruction set, and it doesn't map directly to the hardware instructions. It's little more than a simpler programming language which "looks like" a dialect of assembly.)

Lie Ryan
  • 62,238
  • 13
  • 100
  • 144
jalf
  • 243,077
  • 51
  • 345
  • 550
  • 1
    oh! good to know... if I can't even expect that code to work on a different GPU, it's definitely the wrong approach! Thanks – widgg Sep 09 '11 at 01:16
  • 1
    Your CUDA code will work fine across different GPUs. CUDA just compiles it to a suitable target for each GPU. – jalf Sep 09 '11 at 08:36
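For completeness, that portability comes from fat binaries: `nvcc` can embed SASS for specific architectures plus PTX that the driver JIT-compiles on newer GPUs. A sketch of such a build line (`kernel.cu` is a hypothetical file name; the `-gencode` flags are standard):

```
nvcc kernel.cu \
    -gencode arch=compute_13,code=sm_13 \
    -gencode arch=compute_20,code=sm_20 \
    -gencode arch=compute_20,code=compute_20
```

The first two `-gencode` options embed SASS for GT200 and Fermi; the last embeds PTX, which the driver can JIT-compile for GPUs newer than either.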