Segmentation Fault C++ Cuda

Question

I am a Java programmer and new to C++ and CUDA. I am getting a segmentation fault with the code below:

Input.h

class Input {

public:
    const LucyDecimal * sellingPrice; //Ri
    const LucyDecimal qc;

public:
    Input(
            const LucyDecimal * _sellingPrice, 
            const LucyDecimal _qc);

    virtual ~Input();

};

Input.cpp

Input::Input(
        const LucyDecimal * _sellingPrice, //Ri
        const LucyDecimal _qc):sellingPrice(_sellingPrice),qc(_qc)
{};

Input::~Input() {
}

Now in my CudaRun.cu

void mainRun(Input in) {
    Input *deviceIn;
    deviceIn = new Input(NULL, NULL, NULL, NULL, 0.0, NULL,0.0,0.0,NULL,0.0,NULL,0.0);
    //line-a

    printf("Started. Just abt to call cuda \n");
    int size = sizeof(Input);
    cudaMalloc((void**) &deviceIn, size);
    cudaMemcpy(deviceIn, &in, size, cudaMemcpyHostToDevice);

    cudaMalloc((void**) deviceIn->sellingPrice, 4 * sizeof(LucyDecimal));
    //line-b
        ....
}

I get a segmentation fault at line-b. Has it got to do with line-a initialization?

score 3 · Accepted Answer · edited May 23 '17 at 12:02

You don't get to create a pointer to device memory:

cudaMalloc((void**) &deviceIn, size);

And then dereference that pointer in host code:

cudaMalloc((void**) deviceIn->sellingPrice, 4 * sizeof(LucyDecimal));

To actually set the value of the sellingPrice pointer within the deviceIn structure, the compiler must dereference a pointer computed from the base pointer (deviceIn) in order to write the allocated pointer value, and this dereferencing is illegal in host code.

Copying structures containing pointers is called a "deep copy" and it's somewhat tedious.

Instead, allocate the embedded array through a separate host-side pointer variable:

LucyDecimal * sellingPrice_temp;
cudaMalloc((void**) &sellingPrice_temp, 4 * sizeof(LucyDecimal));

And then copy that allocated pointer from host to device, in the appropriate location:

cudaMemcpy(&(deviceIn->sellingPrice), &sellingPrice_temp, sizeof(LucyDecimal *), cudaMemcpyHostToDevice);

Note that finding the address of a particular location in a structure (&(deviceIn->sellingPrice)) is something the compiler can compute, without dereferencing the base pointer (deviceIn).
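
To illustrate that point, here is a minimal host-only sketch. The struct `Pod` and the helper `memberAddress` are hypothetical names, not part of the question's code. A plain struct is used because `offsetof` is only guaranteed for standard-layout types, which the `Input` class with its virtual destructor is not. The sketch shows that a member's address is the base address plus a fixed offset, so nothing behind the base pointer has to be read:

```cuda
#include <cstddef>   // offsetof, NULL

// Hypothetical plain struct used only for this illustration.
struct Pod {
    double        qc;
    const double *sellingPrice;
};

// Computes the address of the sellingPrice member from the base pointer
// using arithmetic only; the memory at 'base' is never read or written.
const void *memberAddress(const Pod *base) {
    return reinterpret_cast<const char *>(base) + offsetof(Pod, sellingPrice);
}
```

For a host object `p`, `memberAddress(&p)` gives the same address as `&p.sellingPrice`. The same arithmetic applies when the base pointer holds a device address, which is why the `cudaMemcpy` destination above is legal to compute in host code.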

You will need to use sellingPrice_temp again if you want to copy the data from that embedded pointer region back to the host at some point.
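
Putting the steps together, a minimal sketch of the whole deep copy might look like the following. It rests on several assumptions: `LucyDecimal` is taken to be `double` (the real type is not shown in the question), `InputData` is a hypothetical plain stand-in for `Input` without the constructor and virtual destructor, and `sumPrices` and `deepCopyAndSum` are illustrative names only.

```cuda
#include <cstddef>
#include <cuda_runtime.h>

typedef double LucyDecimal;   // assumption: the real type is not shown in the question

// Hypothetical plain stand-in for Input (no constructor, no virtual destructor).
struct InputData {
    const LucyDecimal *sellingPrice;
    LucyDecimal        qc;
};

// Reads through the embedded pointer on the device:
// *out = qc + sellingPrice[0] + ... + sellingPrice[n-1].
__global__ void sumPrices(const InputData *in, int n, LucyDecimal *out) {
    LucyDecimal total = in->qc;
    for (int i = 0; i < n; ++i) total += in->sellingPrice[i];
    *out = total;
}

// Deep-copies hostIn (with n prices) to the device, runs the kernel, and
// stores the sum in *result. Returns true only if every CUDA call succeeded.
bool deepCopyAndSum(const InputData &hostIn, int n, LucyDecimal *result) {
    InputData   *deviceIn = NULL;           // device address of the structure
    LucyDecimal *sellingPrice_temp = NULL;  // device address of the array, kept on the host
    LucyDecimal *deviceOut = NULL;
    const size_t bytes = n * sizeof(LucyDecimal);

    bool ok = cudaMalloc((void **)&deviceIn, sizeof(InputData)) == cudaSuccess
           && cudaMalloc((void **)&sellingPrice_temp, bytes) == cudaSuccess
           && cudaMalloc((void **)&deviceOut, sizeof(LucyDecimal)) == cudaSuccess;

    // 1. Shallow copy of the structure. Its sellingPrice member still holds
    //    a host address at this point and must not be used on the device yet.
    ok = ok && cudaMemcpy(deviceIn, &hostIn, sizeof(InputData),
                          cudaMemcpyHostToDevice) == cudaSuccess;
    // 2. Copy the array contents into the separately allocated device buffer.
    ok = ok && cudaMemcpy(sellingPrice_temp, hostIn.sellingPrice, bytes,
                          cudaMemcpyHostToDevice) == cudaSuccess;
    // 3. Patch the device structure: overwrite its sellingPrice member with
    //    the device address. This comes after step 1, otherwise the
    //    structure copy would overwrite the patched pointer.
    ok = ok && cudaMemcpy(&(deviceIn->sellingPrice), &sellingPrice_temp,
                          sizeof(LucyDecimal *), cudaMemcpyHostToDevice) == cudaSuccess;

    if (ok) {
        sumPrices<<<1, 1>>>(deviceIn, n, deviceOut);
        ok = cudaGetLastError() == cudaSuccess
          && cudaMemcpy(result, deviceOut, sizeof(LucyDecimal),
                        cudaMemcpyDeviceToHost) == cudaSuccess;
    }

    // Copying the array back to the host would also go through
    // sellingPrice_temp, never through deviceIn->sellingPrice.
    cudaFree(deviceOut);
    cudaFree(sellingPrice_temp);
    cudaFree(deviceIn);
    return ok;
}
```

Called with four prices `{1.0, 2.0, 3.0, 4.0}` and `qc = 0.5`, the function should return `true` and store `10.5` in `*result`.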

This topic comes up with some frequency; you can find many other examples if you search on e.g. "CUDA copy structure embedded pointer". The methodology is analogous to copying doubly-subscripted (**) dynamically allocated matrices from host to device.

I'd also recommend proper CUDA error checking, although it won't be very instructive in this case, since the segmentation fault occurs in host code rather than being reported as a CUDA error.
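
One common way to do such checking is to wrap every runtime API call in a small macro. The sketch below is only one possible form: `CUDA_CHECK`, `cudaCheckImpl` and `allocateChecked` are hypothetical names, while `cudaGetErrorString` and `cudaMalloc` are the standard runtime calls.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: reports a failed CUDA runtime call on stderr and
// returns false; returns true when the call succeeded.
inline bool cudaCheckImpl(cudaError_t err, const char *expr,
                          const char *file, int line) {
    if (err == cudaSuccess) return true;
    fprintf(stderr, "CUDA error: %s (%s) at %s:%d\n",
            cudaGetErrorString(err), expr, file, line);
    return false;
}

// Wraps a runtime call so the failing expression and its location are logged.
#define CUDA_CHECK(call) cudaCheckImpl((call), #call, __FILE__, __LINE__)

// Example use: an allocation whose result is checked instead of ignored.
bool allocateChecked(void **devPtr, size_t bytes) {
    return CUDA_CHECK(cudaMalloc(devPtr, bytes));
}
```

After a kernel launch, the same macro can wrap `cudaGetLastError()` and `cudaDeviceSynchronize()` to catch launch and execution errors.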

The pointer value of `sellingPrice_temp` [is set by](http://stackoverflow.com/questions/12936986/why-does-cudamalloc-use-pointer-to-pointer/12937162#12937162) the `cudaMalloc` operation that I listed in my answer, right after the declaration of `sellingPrice_temp`, just as in host code, a `malloc` operation returns a pointer value. — Robert Crovella, Apr 01 '14 at 11:36
This answer of yours helped: http://stackoverflow.com/questions/22156536/cudamalloc-of-a-structure-and-an-element-of-same-structure But in that answer, you are not doing a `cudaMemcpy` for the structure. So how does it get the `foo` value? — Jatin, Apr 01 '14 at 11:55
The structure is allocated on the device. If you want to copy the structure contents from the host to device, that is fine. It was not necessary to demonstrate that, in that answer. In order to get the `foo` value from host to device, you could copy the structure. However, if you do that, copy the structure, and *then* set up your `sellingPrice` pointer as I describe above, otherwise the structure copy will overwrite the `sellingPrice` pointer. — Robert Crovella, Apr 01 '14 at 12:01
