I wrote a CUDA kernel, assign(), that initializes a vector held in a thrust device_vector. The kernel is launched from a class member function, following the solution to the question:

CUDA kernel as member function of a class

and according to

https://devtalk.nvidia.com/default/topic/573289/mixing-c-and-cuda/.

I'm using a GTX650Ti GPU, Windows 8.1, Visual Studio 2013 Community and CUDA Toolkit 7.5.

The code in initTest.cu compiles, but at run time an exception is thrown that references the file trivial_copy.inl:

"First-chance exception at 0x775B5B68 in initTest.exe: Microsoft C++ exception: thrust::system::system_error at memory location 0x0116F3C8. If there is a handler for this exception, the program may be safely continued."

Does anyone know why this problem occurs?

The header file foo.cuh is:

#ifndef FOO_CUH
#define FOO_CUH
#include "cuda_runtime.h"
#include "device_launch_parameters.h"
#include <thrust/device_vector.h>
#include <vector>
using namespace thrust;
using namespace std;

__global__ void assign(float *x, const float &constant, const unsigned int &n)
{
    int i = blockDim.x * blockIdx.x + threadIdx.x;
    if (i < n)
        x[i] = constant;
}
class foo
{
    public:
    foo(const unsigned int &);
    void init(const float &);
    vector<float> domain;
private:
    unsigned int samples;
};
foo::foo(const unsigned int &n)
{
    vector<float> result(n);
    domain = result;
    samples = n;
}
void foo::init(const float &value)
{
    device_vector<float> result(samples);
    assign <<< 1, domain.size() >>>(raw_pointer_cast(result.data()), value, samples);
    thrust::copy(result.begin(), result.end(), domain.begin());
}
#endif

and the main function defined in initTest.cu is:

#include "foo.cuh"
#include <iostream>

int main()
{
    foo a(10);
    a.init(0.5);
    for (unsigned int i = 0; i < a.domain.size(); i++)
    {
        if (i == 0)
            cout << "{ " << a.domain[i] << ", ";  // also print element 0, which was skipped before
        else if (i == a.domain.size() - 1)
            cout << a.domain[i] << " }";
        else
            cout << a.domain[i] << ", ";
    }
    cin.get();
    return 0;
}
Vitrion
  • "but when I integrate it in a longer code" Sorry, but you're going to have to give some details about what that means exactly. There is nothing wrong with the code you have shown here from a compile perspective. How **exactly** are you integrating it in a longer code? Are you trying to `#include` this file in a `.cpp` file, perhaps? (That question that you refer to as "unsolved" **is** solved, by the way. The solution presented there is correct.) – Robert Crovella Nov 04 '15 at 01:56
  • Yes, you're right. I haven't tried separate compilation. When I say "longer code" I mean that this code is a small part of a larger program. As I said, this code works perfectly on its own; my question arises because this error appears when I integrate it into that program. So this is the same topic: a class with a method that launches a kernel like this. I don't understand why it compiles when it is separate but not when it is integrated. – Vitrion Nov 04 '15 at 14:22
  • Sorry I'm not following you. I would suggest providing a short, complete example of something that **doesn't work**. Then it's likely that someone could offer suggestions. Right now your question simply contains an example that works fine. I don't think that's useful. – Robert Crovella Nov 04 '15 at 14:42
  • I hope this example is clearer. I modified the program and now the problem is different. I'll be honest with you: I changed the code expecting to find the same errors, but I always find different errors, which is very confusing to me. Sorry for my English. – Vitrion Nov 04 '15 at 16:18
  • This is a completely different question now. The title of the question now does not correspond to what you are asking. – Robert Crovella Nov 04 '15 at 16:30

1 Answer

This is illegal:

__global__ void assign(float *x, const float &constant, const unsigned int &n)
                                             ^                             ^

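Kernel parameters cannot be passed by reference. The references here bind to host variables, so the kernel would end up dereferencing host addresses on the device.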

When I remove the ampersands:

__global__ void assign(float *x, const float constant, const unsigned int n)

your code runs correctly for me.
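
For reference, here is a minimal, self-contained sketch of the by-value version. It drops the foo class and uses a fixed size n = 10 and a fill value of 0.5f as in the question; the try/catch around the thrust calls is my own addition, shown only to illustrate where the system_error from the question would surface.

```cuda
#include <iostream>
#include <vector>
#include <thrust/copy.h>
#include <thrust/device_vector.h>
#include <thrust/system_error.h>

// Scalars are passed by value, so the kernel receives copies of them
// rather than addresses of host variables.
__global__ void assign(float *x, const float constant, const unsigned int n)
{
    unsigned int i = blockDim.x * blockIdx.x + threadIdx.x;
    if (i < n)
        x[i] = constant;
}

int main()
{
    const unsigned int n = 10;          // same size as in the question
    std::vector<float> domain(n);

    try {
        thrust::device_vector<float> result(n);
        assign<<<1, n>>>(thrust::raw_pointer_cast(result.data()), 0.5f, n);
        // The device-to-host copy is where thrust reported the earlier
        // kernel failure in the original code.
        thrust::copy(result.begin(), result.end(), domain.begin());
    } catch (const thrust::system_error &e) {
        std::cerr << "thrust error: " << e.what() << std::endl;
        return 1;
    }

    for (unsigned int i = 0; i < n; ++i)
        std::cout << domain[i] << ' ';
    std::cout << std::endl;
    return 0;
}
```

Note that launching a single block, as the question does, only works while n stays within the device's maximum number of threads per block.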

I would suggest you use proper CUDA error checking. Doing so would have focused your attention on the kernel. Instead, the error went uncaught until thrust detected it and threw a system_error, which doesn't help to identify the source of the error.
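
One common way to do this is sketched below. The CUDA_CHECK macro name is hypothetical (it is not part of the CUDA toolkit or of the question's code), and the example uses plain cudaMalloc/cudaMemcpy instead of thrust so that every runtime call can be checked explicitly. The two checks after the launch are what would have pointed at the kernel.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper macro (not from the original post): print the
// error string and abort if a CUDA runtime call fails.
#define CUDA_CHECK(call)                                                  \
    do {                                                                  \
        cudaError_t err_ = (call);                                        \
        if (err_ != cudaSuccess) {                                        \
            std::fprintf(stderr, "CUDA error: %s at %s:%d\n",             \
                         cudaGetErrorString(err_), __FILE__, __LINE__);   \
            std::exit(EXIT_FAILURE);                                      \
        }                                                                 \
    } while (0)

__global__ void assign(float *x, const float constant, const unsigned int n)
{
    unsigned int i = blockDim.x * blockIdx.x + threadIdx.x;
    if (i < n)
        x[i] = constant;
}

int main()
{
    const unsigned int n = 10;
    float *d_x = NULL;
    CUDA_CHECK(cudaMalloc(&d_x, n * sizeof(float)));

    assign<<<1, n>>>(d_x, 0.5f, n);
    CUDA_CHECK(cudaGetLastError());        // launch and configuration errors
    CUDA_CHECK(cudaDeviceSynchronize());   // errors raised while the kernel ran

    float h_x[n];
    CUDA_CHECK(cudaMemcpy(h_x, d_x, n * sizeof(float), cudaMemcpyDeviceToHost));
    CUDA_CHECK(cudaFree(d_x));

    for (unsigned int i = 0; i < n; ++i)
        std::printf("%g ", h_x[i]);
    std::printf("\n");
    return 0;
}
```

With the pass-by-reference kernel from the question, the failure would be expected to show up at the cudaDeviceSynchronize() check, right after the launch, rather than later inside a thrust call.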

Robert Crovella
  • Thank you very much. I always use CUDA error checking, but I removed those instructions to keep the example short. It does work, and this is probably the solution for the rest of my code. I'll try it. – Vitrion Nov 04 '15 at 17:01