I have a host matrix declared as:
    float **HostMatrix;
I need to copy the contents of a device matrix, pointed to by devicePointer, into the two-dimensional host matrix HostMatrix.
I tried this:
    for (int i = 0; i < numberOfRows; i++) {
        cudaMemcpy(HostMatrix[i], devicePointer, numberOfColumns * sizeof(float),
                   cudaMemcpyDeviceToHost);
        devicePointer += numberOfColumns; // advance to the next row
    }
But I think this is wrong, since I am doing it inside a host function and devicePointer cannot be manipulated directly in a host function the way I do in the last line.
So what is the correct way to achieve this?
Edit
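Oh, actually this works correctly! Pointer arithmetic on a device pointer is fine in host code as long as the pointer is not dereferenced there. The problem comes when de-allocating the memory, as discussed in my earlier question: CUDA: Invalid Device Pointer error when reallocating memory. After the loop, devicePointer no longer holds the address that cudaMalloc returned, so the following is incorrect: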
    for (int i = 0; i < numberOfRows; i++) {
        cudaMemcpy(HostMatrix[i], devicePointer, numberOfColumns * sizeof(float),
                   cudaMemcpyDeviceToHost);
        devicePointer += numberOfColumns; // advance to the next row
    }
    cudaFree(devicePointer); // invalid device pointer
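One way to avoid this would be to leave devicePointer untouched and compute each row's address from it instead. Below is a minimal sketch of that idea. It assumes the device matrix is one contiguous, row-major allocation obtained from a single cudaMalloc call, and that each HostMatrix[i] already points to at least numberOfColumns floats. The helper name copyDeviceMatrixToHost and the local rowStart are made up for illustration and are not part of my original code.

```cuda
#include <cassert>
#include <cstddef>
#include <cuda_runtime.h>

// Hypothetical helper (name is an assumption, not from the original code).
// Copies a contiguous row-major device matrix into separately allocated host rows.
// devicePointer is taken as const and never modified, so the caller can still
// pass the same address to cudaFree afterwards.
cudaError_t copyDeviceMatrixToHost(float **HostMatrix, const float *devicePointer,
                                   int numberOfRows, int numberOfColumns)
{
    assert(HostMatrix != nullptr && devicePointer != nullptr);

    const size_t rowBytes = (size_t)numberOfColumns * sizeof(float);
    for (int i = 0; i < numberOfRows; i++) {
        // Address of row i, computed on the host without dereferencing it.
        const float *rowStart = devicePointer + (size_t)i * numberOfColumns;

        cudaError_t status = cudaMemcpy(HostMatrix[i], rowStart, rowBytes,
                                        cudaMemcpyDeviceToHost);
        if (status != cudaSuccess) {
            return status; // stop at the first failed copy
        }
    }
    return cudaSuccess;
}
```

With this, calling copyDeviceMatrixToHost(HostMatrix, devicePointer, numberOfRows, numberOfColumns) and then cudaFree(devicePointer) should hand cudaFree the address originally returned by cudaMalloc. Keeping the loop as it was and saving the original pointer in a second variable before the loop would presumably work just as well.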