23

Eigen is a C++ linear algebra library: http://eigen.tuxfamily.org.

It's easy to work with basic data types, like plain float arrays: just copy them to device memory and pass the pointer to CUDA kernels. But an Eigen matrix is a complex type, so how can I copy it to device memory and let CUDA kernels read/write it?
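(For reference, the "easy" plain-array path described above looks roughly like this; a hedged sketch with a hypothetical `scale` kernel and no error checking:)

```cuda
// Hypothetical kernel: scale N floats in place.
__global__ void scale(float *data, float factor, size_t N)
{
    size_t idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx < N)
        data[idx] *= factor;
}

// Host side: copy a plain float array to the device, run the kernel, copy back.
float host[256];
float *dev;
cudaMalloc((void **)&dev, sizeof(host));
cudaMemcpy(dev, host, sizeof(host), cudaMemcpyHostToDevice);
scale<<<1, 256>>>(dev, 2.0f, 256);
cudaMemcpy(host, dev, sizeof(host), cudaMemcpyDeviceToHost);
cudaFree(dev);
```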

Mickey Shine
  • What about https://developer.nvidia.com/cuBLAS ? – Marco A. May 22 '14 at 09:02
  • It is a legacy project that heavily relies on Eigen, so it's better not to replace it – Mickey Shine May 22 '14 at 09:06
  • The easier way would probably be to switch to CUBLAS right before going on the device; if Eigen isn't designed to work on a GPU, you can't use it (or you'd get horrible errors/performance). Also take a look at unified memory; it might save you some hassle copying stuff (or, if you want total control, do it yourself) – Marco A. May 22 '14 at 09:10
  • Is there a way to get a raw data pointer, like float *, from Eigen? – Mickey Shine May 22 '14 at 09:12
  • What do you mean by "Eigen matrix are complex type"? Beware that "complex type" can mean `std::complex` in this context; you can have real matrices in Eigen... Your question is chaotic: "*It's easy to work with basic data types, like basic float arrays, and just copy it to device memory and pass the pointer to cuda kernels.*" Do you mean Eigen is easy to work with for plain types, or CUDA? – luk32 May 22 '14 at 09:20
  • I have to agree with luk32, if you don't have any idea on how to get the raw data from eigen (http://eigen.tuxfamily.org/dox/classEigen_1_1PlainObjectBase.html#a4663159a1450fa89214b1ab71f7ef5bf probably, be careful with row-column major), then this is likely not going to work. CUDA is a different programming model and will likely need some work to get it right if you don't want to use CUBLAS or code your own kernel routines – Marco A. May 22 '14 at 09:22
  • @MarcoA. Well there is a way, you can make Eigen work on plain arrays. Getting raw data would not help much, because the memory layout is probably different from built-in types which, I think, CUDA is suited to work with. It would probably be a hassle to deal with it inside of a CUDA kernel. – luk32 May 22 '14 at 09:33
  • @luk32 it shouldn't be hard, but if the user has no experience with Eigen and CUDA, it will require some time to get it right. I'm just trying to assess their current skills to suggest a way – Marco A. May 22 '14 at 09:35
  • I'm not an Eigen user, so take what I'm saying with a grain of salt. But if you can [Convert Eigen Matrix to C array](http://stackoverflow.com/questions/8443102/convert-eigen-matrix-to-c-array), and in particular to a `complex` array, then it should be possible to copy the raw data to a `cuComplex` vector. What prevents you from doing this? – Vitality May 22 '14 at 09:35

4 Answers

26

Since November 2016 (the release of Eigen 3.3), a new option exists: using Eigen directly inside CUDA kernels - see this question.

Example from the linked question:

__global__ void cu_dot(Eigen::Vector3f *v1, Eigen::Vector3f *v2, double *out, size_t N)
{
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if(idx < N)
    {
        out[idx] = v1[idx].dot(v2[idx]);
    }
    return;
}

Copying an array of Eigen::Vector3f to device:

Eigen::Vector3f *host_vectors = new Eigen::Vector3f[N];
Eigen::Vector3f *dev_vectors;
cudaMalloc((void **)&dev_vectors, sizeof(Eigen::Vector3f)*N);
cudaMemcpy(dev_vectors, host_vectors, sizeof(Eigen::Vector3f)*N, cudaMemcpyHostToDevice);
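For completeness, a rough sketch (hypothetical names such as `dev_vectors2`, and no error checking) of how the kernel above could be launched and the results read back; it assumes a second array of vectors was copied to the device the same way:

```cuda
// Allocate device storage for the N dot products.
double *dev_out;
cudaMalloc((void **)&dev_out, sizeof(double)*N);

// One thread per vector pair; round the grid size up to cover all N.
const int threads = 256;
cu_dot<<<(N + threads - 1) / threads, threads>>>(dev_vectors, dev_vectors2, dev_out, N);

// The results come back as a plain double array.
double *host_out = new double[N];
cudaMemcpy(host_out, dev_out, sizeof(double)*N, cudaMemcpyDeviceToHost);
cudaFree(dev_out);
```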
GPMueller
11

If all you want is to access the data of an Eigen::Matrix via a raw C pointer, then you can use the .data() function. Coefficients are stored sequentially in memory in column-major order by default, or row-major if you asked for it:

MatrixXd A(10,10);
double *A_data = A.data();
ggael
  • @MickeyShine: note that this is similar to what you can do when you're using CUDA with STL vectors containing POD structures. As ggael said, just be careful with the default storage order. – BenC May 22 '14 at 11:14
5

Apart from rewriting and refitting the code, there is an Eigen-compatible library, written as a byproduct of a research project, that performs matrix calculations on the GPU and can use multiple backends: https://github.com/rudaoshi/gpumatrix

I cannot vouch for it, but if it works it is probably exactly what you are looking for.

If you'd like a more general-purpose solution, this thread seems to contain very useful info.

Janne Karila
rubenvb
3

There are two ways.

The first is to make Eigen work on the GPU, which is probably hard and won't perform well, at least if "work on the GPU" means only getting it to compile and produce results. Eigen is practically hand-optimized for modern CPUs. Internally, Eigen uses its own allocators and memory layouts, which are most probably not going to work well with CUDA.

The second way is easier to do, should not break legacy Eigen code, and is probably the only one suitable in your case. Switch your underlying matrices to plain contiguous arrays (i.e. double*) and use Eigen::Map. This way you will have an Eigen interface to a plain data type, so the code should not break, and you can send the matrix to the GPU as a normal C array, as is usually done. The drawback is that you probably won't utilize Eigen to its full potential; however, if you offload most of the work to the GPU, that's fine.

It's actually reversing things a bit. Instead of making Eigen arrays work on CUDA, you can make Eigen work on normal arrays.

luk32