Which files need to be distributed with an application containing CUDA Driver API?

Question

What are the differences between the files that need to be distributed with an application (exe) built using the CUDA Driver API and those needed for one built using the CUDA Runtime API?

Why is this question being marked for closure? How is it opinion-based? This is purely a factual question about the files required to distribute an application, depending on whether it uses the CUDA driver API or the runtime API. Kindly at least let me know the reason so that I can provide more information. And if this question needs to be closed, how come the following similar (NOT same) question exists? https://stackoverflow.com/questions/27014480/why-should-i-use-the-cuda-driver-api-instead-of-cuda-runtime-api — skm, Nov 10 '21 at 14:59

Robert Crovella · Accepted Answer · 2021-11-10T16:29:21.843

Apart from the files that are actually part of your application (i.e. things that you built), there are potentially no differences.

Any machine that you intend your code to run on must have a supported GPU and a proper driver install with a version new enough to support whatever build environment (e.g. CUDA version) you used to build your application.

There is nothing you can do by way of choosing particular redistributable files to affect that.
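As a rough illustration of the "new enough driver" requirement, here is a minimal sketch in C of the check an application could perform at startup. It assumes the installed driver version comes from `cuDriverGetVersion` and the build-time version from the `CUDA_VERSION` macro in `cuda.h`, both of which encode a version as 1000 * major + 10 * minor. The names `driver_version_fn`, `driver_is_new_enough` and `fake_driver_11_4` are made up for this example, and the plain comparison ignores the finer points of CUDA's compatibility rules.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the signature of cuDriverGetVersion(int *), which
   returns 0 (CUDA_SUCCESS) when the query succeeds. */
typedef int (*driver_version_fn)(int *version);

/* CUDA encodes versions as 1000 * major + 10 * minor (11040 is 11.4). */
int cuda_major(int v) { return v / 1000; }
int cuda_minor(int v) { return (v % 1000) / 10; }

/* Returns 1 if the installed driver supports at least the CUDA version
   the application was built against, 0 if it is too old, and -1 if the
   version could not be queried at all. */
int driver_is_new_enough(driver_version_fn query, int built_against)
{
    int installed = 0;

    assert(built_against > 0);
    if (query == NULL || query(&installed) != 0)
        return -1;
    return installed >= built_against ? 1 : 0;
}

/* Hypothetical stub so the check can be tried on a machine without a
   GPU: it pretends that a CUDA 11.4 capable driver is installed. */
int fake_driver_11_4(int *version)
{
    *version = 11040;
    return 0;
}
```

In a real application you would pass a thin wrapper around `cuDriverGetVersion` as `query` and `CUDA_VERSION` as `built_against`. If the check fails, the remedy is a driver update on the target machine, not an extra file shipped with the application.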

Therefore the only thing we are left with is your application itself.

A properly built driver API application only requires the above setup I described, apart from whatever files constitute your application itself.

A properly built runtime API application only requires the above setup I described, apart from whatever files constitute your application itself.

A runtime API application built with nvcc links statically to the CUDA runtime API library by default (on linux that means libcudart_static.a, not libcudart.so). If you build your CUDA runtime application without linking to the static CUDA runtime API library (this is possible by passing a non-default switch to nvcc, e.g. `--cudart shared`, or via linking options with e.g. g++ on linux), then you would also have to redistribute the shared CUDA runtime API library (e.g. libcudart.so and anything it symlinks to). These are valid redistributables.

This discussion doesn't take into account other libraries like CUBLAS, CURAND, and potentially many other libraries. You didn't ask about them, and conceptually it is no different than asking about how to redistribute any other library, like fftw3. Furthermore, with respect to these libraries, there is no difference whatsoever between distributing an application built with the runtime API, and distributing an application built with the driver API. Whether or not you need to redistribute any other libraries will depend primarily on how those libraries are linked to your application (statically or dynamically).

So far I've avoided any discussion of the files that actually constitute your application. There is almost infinite variety here: you could create an application that uses any number of dynamically linked libraries of your own creation, which obviously would have to be distributed with your application.

At the other end of the spectrum, it is certainly possible to create runtime applications that require only one file, the "executable".

I personally don't know how to do that with the driver API (except via silly tricks), so it might be reasonable to say that with respect to a driver API application, considering the application files themselves, you might in most cases be distributing at least 2 files. Obviously you could come up with self-extracting archive style wrappers or other means that hide the file structure and make it appear as if your driver API application actually consists of only 1 file.
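To make the "at least 2 files" case concrete, here is a minimal sketch in C of the host side of such a layout: the executable reads a separately shipped device-code file into memory at startup. The file name `kernels.cubin` used below and the helper `read_device_image` are made up for this example. The resulting buffer is the kind of image the driver API's `cuModuleLoadData` accepts; the CUDA calls themselves are left out so that the sketch compiles without the toolkit.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Reads a whole device-code file (cubin, fatbin or PTX) into a heap
   buffer and stores its length in *out_size. One extra NUL byte is
   appended because a PTX image is passed to the driver as a
   NUL-terminated string; binary images are not affected by it.
   Returns NULL on failure. The caller frees the buffer. */
void *read_device_image(const char *path, size_t *out_size)
{
    char *buf = NULL;
    long len = -1;
    FILE *f;

    assert(path != NULL && out_size != NULL);
    f = fopen(path, "rb");
    if (f == NULL)
        return NULL;

    /* Determine the file size, then read everything in one call. */
    if (fseek(f, 0, SEEK_END) == 0 && (len = ftell(f)) >= 0 &&
        fseek(f, 0, SEEK_SET) == 0) {
        buf = (char *)malloc((size_t)len + 1);
        if (buf != NULL) {
            if (fread(buf, 1, (size_t)len, f) == (size_t)len) {
                buf[len] = '\0';
                *out_size = (size_t)len;
            } else {
                free(buf);
                buf = NULL;
            }
        }
    }
    fclose(f);
    return buf;
}
```

After `cuInit` and context creation, the application would hand the buffer returned for something like `kernels.cubin` to `cuModuleLoadData` and could free it once the module is loaded. Shipped this way, the distribution consists of the executable plus that one device-code file.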

Thanks for the detailed answer. In the last paragraph, you have mentioned that "usually" at least 2 files need to be distributed for an application built with CUDA Driver APIs. By 2 files, do you mean the `.exe` file and the `.ptx` file? — skm, Nov 11 '21 at 09:31
@skm: It is never necessary (or advisable) to distribute PTX code in any form other than embedded in a library or executable as part of a cubin image — talonmies, Nov 12 '21 at 02:42
@talonmies: Thanks, I am wondering which 2 files did @robert mean that need to be distributed for an application with Driver APIs? Furthermore, in this answer (link below), it has been mentioned that we don't need to distribute `cubin` files for an application created with runtime APIs so, doesn't that mean that we must distribute cuBin files separately for an application created using Driver APIs? — skm, Nov 12 '21 at 08:35
Yes, typically there would be at least one file that is the executable (host code) and at least one file that contained the device code, in one form or another. — Robert Crovella, Nov 12 '21 at 14:27
