I want to have a 3d float array in CUDA, here is my code:
#define SIZE_X 128 //numbers in elements
#define SIZE_Y 128
#define SIZE_Z 128
typedef float VolumeType;
cudaExtent volumeSize = make_cudaExtent(SIZE_X, SIZE_Y, SIZE_Z); //The first argument should be SIZE_X*sizeof(VolumeType)??
float *d_volumeMem;
cutilSafeCall(cudaMalloc((void**)&d_volumeMem, SIZE_X*SIZE_Y*SIZE_Z*sizeof(float)));
.....//assign value to d_volumeMem in GPU
cudaArray *d_volumeArray = 0;
cudaChannelFormatDesc channelDesc = cudaCreateChannelDesc<VolumeType>();
cutilSafeCall( cudaMalloc3DArray(&d_volumeArray, &channelDesc, volumeSize) );
cudaMemcpy3DParms copyParams = {0};
copyParams.srcPtr = make_cudaPitchedPtr((void*)d_volumeMem, SIZE_X*sizeof(VolumeType), SIZE_X, SIZE_Y); //
copyParams.dstArray = d_volumeArray;
copyParams.extent = volumeSize;
copyParams.kin = cudaMemcpyDeviceToDevice;
cutilSafeCall( cudaMemcpy3D(©Params) );
Actually, my program runs well. But I'm not sure the result is right. Here is my problem, in the CUDA liberay, it said that the first parameter of make_cudaExtent is "Width in bytes" and the other two is height and depth in elements. So I think in my code above, the fifth line should be
cudaExtent volumeSize = make_cudaExtent(SIZE_X*sizeof(VolumeType), SIZE_Y, SIZE_Z);
But in this way, there would be error "invalid argument" in cutilSafeCall( cudaMemcpy3D(©Params) ); Why?
And another puzzle is the strcut cudaExtent, as CUDA library stated,its component width stands for "Width in elements when referring to array memory, in bytes when referring to linear memory". So I think in my code when I refer volumeSize.width it should be number in elements. However, if I use
cudaExtent volumeSize = make_cudaExtent(SIZE_X*sizeof(VolumeType), SIZE_Y, SIZE_Z);
The volumeSize.width would be SIZE_X*sizeof(VolumeType)(128*4), that is number in bytes instead of number in elements.
In many CUDA SDK, they use char as the VolumeType, so they just use SIZE_X as the first argument in make_cudaExtent. But mine is float, so, anyone could tell me which is the right way to create a cudaExtent if I need to use this to create a 3D array?? Thanks a lot!