After searching through many different solutions and collecting and testing the available options, I finally arrived at this simple method.
The original C++ library can be built with gcc in one step, as in this answer:
gcc -shared -o dll.so -fPIC dllmain.cpp
Make sure to add extern "C" before each function that needs to be exported from the .cpp file, so that its name is not mangled, like this:
#include <stdio.h>
extern "C" void func()
{
// code
}
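As a slightly more concrete illustration of the extern "C" pattern above, here is a minimal sketch of what dllmain.cpp could contain. The function names add_ints and scale_array are hypothetical placeholders, not part of the original library. The sketch assumes only C-compatible types (plain ints, floats, and a pointer plus a length) cross the boundary, since that is what a C-style export can describe.

```cpp
// Hypothetical contents for dllmain.cpp; function names are placeholders.

// Exported under the plain C name "add_ints" (no C++ name mangling).
extern "C" int add_ints(int a, int b)
{
    return a + b;
}

// Arrays cross the boundary as a pointer plus an element count.
// Multiplies every element of data in place by factor.
extern "C" void scale_array(float* data, int count, float factor)
{
    if (data == nullptr || count <= 0) {
        return; // nothing to do for an empty or missing buffer
    }
    for (int i = 0; i < count; ++i) {
        data[i] *= factor;
    }
}
```

Compiled with the gcc command above, these functions should then be visible to callers under their unmangled names.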
For CUDA C++, nvcc can be used in the same way, combining the approaches from this answer and this answer. Make sure to use .so instead of .dll and to specify the proper device architecture. I used sm_60 here because my GPU is a "Tesla P100-PCIE-16GB".
nvcc -arch=sm_60 --compiler-options '-fPIC' -o dll.so --shared kernel.cu
The .cu file will look similar to this:
#include "cuda_runtime.h"
#include "device_launch_parameters.h"
#include <stdio.h>
extern "C" void myfunc(int a, int b, ...);
__global__ void kernel(int a, int b, ...);
__global__ void kernel(int a, int b, ...)
{
int i = threadIdx.x;
// kernel code
}
void myfunc(int a, int b, ...)
{
// code
}
The shared library (.so) is now created and can be called from C# code like this:
using System;
using System.Runtime.InteropServices;
class Program
{
[DllImport("dll.so")]
static extern void myfunc(int a, int b, ...);
private void Method()
{
int a, b;
// code
myfunc(a, b, ...);
}
}
The C# code is then compiled with mcs and run with Mono:
mcs Program.cs
mono Program.exe
You will probably also need to add the library's directory to the loader search path, like this:
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/path/to/library/
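If the call from C# still fails, it can help to check the library outside of Mono first. The following is a rough sketch, not part of the original method: a small C++ helper (the name exports_symbol is hypothetical) that uses the POSIX dlopen and dlsym calls to test whether a library can be loaded and whether it exports a given unmangled symbol. When the library is given as a bare file name, dlopen consults LD_LIBRARY_PATH, so this exercises the same setting as the export line above.

```cpp
#include <dlfcn.h>
#include <cstdio>

// Hypothetical helper: returns true if `library` can be loaded and
// exports `symbol` under its plain (unmangled) C name.
bool exports_symbol(const char* library, const char* symbol)
{
    // RTLD_NOW resolves all symbols immediately, so missing
    // dependencies show up here rather than at call time.
    void* handle = dlopen(library, RTLD_NOW);
    if (handle == nullptr) {
        std::fprintf(stderr, "dlopen failed: %s\n", dlerror());
        return false;
    }

    dlerror(); // clear any stale error before calling dlsym
    void* address = dlsym(handle, symbol);
    const char* error = dlerror();
    bool found = (error == nullptr && address != nullptr);
    if (!found) {
        std::fprintf(stderr, "dlsym failed: %s\n",
                     error != nullptr ? error : "symbol resolved to null");
    }

    dlclose(handle);
    return found;
}
```

Calling exports_symbol("dll.so", "myfunc") from a small main function would be one way to use it. On older glibc versions the program may need to be linked with -ldl. A false result with a dlopen message points to a path or dependency problem, while a dlsym message suggests a missing extern "C".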
This worked for a simple CUDA C++ program. It will likely work for others as well, but some problems may arise depending on their complexity.