When compiling CUDA code with NVCC, you typically (AFAIK) need to specify the target architecture, e.g. -arch=compute_35 (a virtual, PTX-level architecture) or -arch=sm_35 (a real GPU architecture), but there are many architectures to choose from. I would like my code to be as portable as possible, so that a configure script can detect which architecture to use and pass the matching argument to NVCC.

This similar question explains where I could look the architecture up online, but I would prefer a script, so that a downstream user can install my program without needing to know the specific architecture of their GPU.
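To make the goal concrete, here is a minimal sketch of what such a detection step might look like in Python. It assumes that `nvidia-smi` is on the PATH and that the installed driver supports the `compute_cap` query field, which older drivers may not; the helper names `arch_flag` and `detect_compute_caps` are hypothetical and only serve the illustration.

```python
import re
import shutil
import subprocess


def arch_flag(compute_cap):
    """Turn a compute capability such as '3.5' into '-arch=sm_35'."""
    major, minor = compute_cap.strip().split(".")
    return f"-arch=sm_{int(major)}{int(minor)}"


def detect_compute_caps():
    """Return one compute capability string per visible GPU, or [] on failure."""
    if shutil.which("nvidia-smi") is None:
        return []
    try:
        # Assumption: this driver's nvidia-smi knows the compute_cap field.
        result = subprocess.run(
            ["nvidia-smi", "--query-gpu=compute_cap", "--format=csv,noheader"],
            capture_output=True, text=True, check=True,
        )
    except (subprocess.CalledProcessError, OSError):
        return []
    # Keep only lines that look like "major.minor"; ignore anything else.
    return [s.strip() for s in result.stdout.splitlines()
            if re.fullmatch(r"\d+\.\d+", s.strip())]


if __name__ == "__main__":
    caps = detect_compute_caps()
    print(arch_flag(caps[0]) if caps else "# no GPU detected; choose -arch manually")
```

A configure script could capture the printed flag and append it to the NVCC command line. Note that this only describes the build machine's GPU, which may differ from the GPU the program eventually runs on.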

cdeterman
  • When do you want to detect which CUDA architecture? Do you want to use the detected architecture for a build which is customized for the hardware it is currently executing on? Are you aware that you can compile for multiple architectures? – m.s. Jun 08 '15 at 16:08
  • I was not aware you could compile for multiple architectures. Would it be efficient to just compile for all possible architectures? I'm trying to find a balance between absolute performance and portability. – cdeterman Jun 08 '15 at 16:10
  • See [here](http://stackoverflow.com/questions/17599189/what-is-the-purpose-of-using-multiple-arch-flags-in-nvidias-nvcc-compiler): "An executable can contain multiple versions of SASS and/or PTX, and there is a runtime loader mechanism that will pick appropriate versions based on the GPU actually being used." – m.s. Jun 08 '15 at 16:11
  • Thanks, that question does appear to answer my own. It appears the only real drawback is some compile time. I will mark my question as a duplicate as it asks a slightly different question than the other (although the answer is provided). Thank you. – cdeterman Jun 08 '15 at 16:18
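
The multi-architecture approach from the comments can be sketched as a small Python helper that builds the NVCC flags for a fat binary. This is an illustration under assumptions: the function name `gencode_flags` and the capability list in the usage line are hypothetical, and which architectures are accepted depends on the installed CUDA toolkit version.

```python
def gencode_flags(capabilities):
    """Build nvcc -gencode flags for a binary covering several architectures.

    Emits SASS for every listed compute capability and also embeds PTX for
    the highest one, so that newer GPUs can JIT-compile it at load time.
    """
    # "3.5" -> "35"; de-duplicate and sort numerically.
    archs = sorted({cap.strip().replace(".", "") for cap in capabilities}, key=int)
    flags = [f"-gencode=arch=compute_{a},code=sm_{a}" for a in archs]
    if archs:
        newest = archs[-1]
        # code=compute_XX keeps the PTX itself in the binary.
        flags.append(f"-gencode=arch=compute_{newest},code=compute_{newest}")
    return flags


if __name__ == "__main__":
    # Hypothetical target list; adjust to what the toolkit supports.
    print(" ".join(gencode_flags(["3.5", "5.0"])))
```

Each extra target adds compile time and binary size, which matches the drawback noted above.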
