Suppose I have a laptop with nvcc and the CUDA Toolkit installed, and a network of 16 PCs with NVIDIA GPUs and MPI. The PCs aren't aware of CUDA; they just have the regular NVIDIA drivers and supporting software.
I'd like to develop an MPI application for this network. The PCs will acquire tasks via MPI and use their GPUs to run them. My plan is to develop the CUDA part on my laptop, compile it into a static library, and later link that static library on each PC using the mpicxx compiler.
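Concretely, the workflow I have in mind looks something like this (file and library names are just placeholders):

```shell
# On the laptop (nvcc + CUDA Toolkit available):
# compile the CUDA part and archive it into a static library
nvcc -c gpu_tasks.cu -o gpu_tasks.o
ar rcs libgpu.a gpu_tasks.o

# On a PC (MPI + NVIDIA driver, but no CUDA Toolkit) -- this is the
# step I'm unsure about, since every example I've seen still links
# against the toolkit's libcudart at this stage:
mpicxx main.cpp -L. -lgpu -o main
```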
However, I can't find any evidence that such a deployment is possible. On the contrary, most examples of so-called separate compilation require a CUDA installation for the final step, i.e. linking the CUDA-aware static library into the MPI-aware main program:
$ g++ main.cpp -L. -lgpu -o main -L/usr/local/cuda/lib64 -lcudart
So, is it possible to compile a program or library that uses CUDA but has no dependencies such as installed CUDA libraries on the target machine?