
I have an existing WinForms application targeting the .NET 4.0 Client Profile, and I want it to run computations on CUDA devices. For this I decided to use CUDAfy.NET, a wrapper around the C/C++ CUDA Toolkit, because it is (as far as I know) the only one that is up to date with the CUDA SDK. Development is going without any major problems on my machine, but I've run into trouble when deploying to another machine.

More specifically, when I build the project in VS and run it on my machine, it runs fine. The odd thing is that it runs nvcc.exe when initializing the CUDAfy modules; nvcc.exe is part of the CUDA SDK and shouldn't be required at run time. And when I try to run the binary on any target machine, it throws this exception:

Cannot find compiler cl.exe in path.

This error indicates that the Visual Studio C++ compiler is missing, and it shouldn't arise on target machines. And now comes the weirdest thing: when I build the sample project that comes with CUDAfy.NET and try to run it on the target machine, it throws the same exception.

There is nothing wrong with the target machine: according to the CUDAfy.NET test app, Cudafy Viewer, it is compatible and CUDA-capable. Besides, I've tested it on several different machines, always with the same result. I've traced the origin of the exception and, as I indicated, it is thrown when initializing CUDAfy.NET:

CudafyModule module = CudafyTranslator.Cudafy();
GPGPU _gpu = CudafyHost.GetDevice(eGPUType.Cuda);
_gpu.LoadModule(module);

According to CUDAfy.NET User Manual it should run perfectly fine on devices that meet these requirements:

  • Windows 64-bit
  • .NET 4.0
  • NVIDIA GPU with compute capability 2.0 or higher
  • Up to date NVIDIA drivers
  • CURAND, CUSPARSE, CUFFT and CUBLAS dlls if using these math libraries
  • Precompiled CUDAfy modules

All of these are satisfied, but it still doesn't run. So the problem must be on my side, and I'm pretty stuck.

One possibility is that the code intended to be cudafied is being compiled incorrectly. According to the manual, and I quote, "You generally would not cudafy your .NET code in a deployment situation as this requires the full CUDA SDK and Visual Studio. CUDAfy modules can be loose as .cdfy files or embedded in your application assembly (.exe or .dll) through use of the cudafycl command line tool.". This should be done automatically; nonetheless, I've tried using cudafycl, unfortunately with no improvement. But since the exception occurs when initializing CUDAfy, I think the source of the problem is elsewhere.

Another possible cause is that I am building the binary for one specific architecture (e.g. compute capability 2.0) and then deploying it to another (e.g. compute capability 3.0). The CUDA Toolkit Documentation mentions something about this in the section on the nvcc compiler: "Binary code is architecture-specific. A cubin object is generated using the compiler option -code that specifies the targeted architecture: For example, compiling with -code=sm_35 produces binary code for devices of compute capability 3.5.".

One way or another, I can't make it work right now. I would appreciate any help and suggestions you have. By the way, I'm using the latest CUDAfy.NET, v1.29, and CUDA Toolkit 7.0 (the latest toolkit is not yet supported by CUDAfy.NET).

talonmies
SysGen
  • From the look of it `CudafyTranslator.Cudafy()` requires a working toolchain by default. You probably need to find a way of initialising an empty module which doesn't invoke their JIT compilation system. Is there a straight driver API wrapper in CUDAfy.NET? – talonmies Oct 02 '15 at 07:00
  • @talonmies I'm not aware of any straight driver API wrapper, I don't think there is one in CUDAfy. But it has to be resolved somehow, they mention running their cudafied .NET programs on target machines. – SysGen Oct 02 '15 at 18:47
  • also, this is a duplicate of http://stackoverflow.com/questions/29430655/avoiding-nvcc-compilation-when-using-cudafy – Val Nov 11 '15 at 11:13

1 Answer


In the CUDAfy_User_Manual_1_22.pdf there is a chapter dedicated specifically to that. It's "5.2 Caching Modules to Improve Performance".

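public class ArrayBasicIndexing
{
  public static void Execute()
  {
    // Try to load a previously serialized module from disk.
    CudafyModule km = CudafyModule.TryDeserialize();
    // Recompile only if no cached module exists or it is out of date.
    if (km == null || !km.TryVerifyChecksums())
    {
      km = CudafyTranslator.Cudafy();  // needs the CUDA SDK and cl.exe
      km.Serialize();                  // cache the module for later runs
    }
  }
}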

The code checks whether a compiled CUDAfy module already exists and compiles a new one ONLY if there is none (or the existing one is out of date). So your application will generate the modules on your dev machine, and you can then distribute the app together with the modules to other machines. Those client machines will not try to generate new modules, because the app has not changed.

If you've changed the app, you will have to run it (so it can re-generate the modules) and then redistribute modules with the new version of the app.
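As a rough sketch of how the caching pattern above might be combined with the initialization code from the question, the helper below loads the cached module if one is available and falls back to compiling otherwise. The class name `MyKernels` and the method name `LoadCachedOrCompile` are hypothetical. The sketch also assumes that the parameterless `TryDeserialize()` and `Cudafy()` calls resolve the module from the calling class, so the helper is assumed to live in the same class as the `[Cudafy]` kernels.

```csharp
using Cudafy;
using Cudafy.Host;
using Cudafy.Translator;

// Hypothetical class; assumed to also contain the [Cudafy]-marked kernels.
public class MyKernels
{
    // Hypothetical helper: returns a GPU device with the module loaded.
    public static GPGPU LoadCachedOrCompile()
    {
        // Look for a previously serialized module shipped with the app.
        CudafyModule km = CudafyModule.TryDeserialize();

        if (km == null || !km.TryVerifyChecksums())
        {
            // Reached only when no valid cached module is found.
            // This step invokes nvcc and therefore needs the CUDA SDK
            // and cl.exe, so it is expected to fail on a bare target machine.
            km = CudafyTranslator.Cudafy();
            km.Serialize();
        }

        // Same device selection and module loading as in the question.
        GPGPU gpu = CudafyHost.GetDevice(eGPUType.Cuda);
        gpu.LoadModule(km);
        return gpu;
    }
}
```

With this arrangement, running the app once on the dev machine should produce the serialized module, which then has to be shipped alongside the executable for the target machines to skip compilation.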

Val