
On the CPU, I often use 'sub-images' of 2-D (pitch-linear) images, which simply point to a certain ROI of the 'master' image. So all modifications to the sub-image in fact also change the 'master' image.

Are there any problems in CUDA with sub-images of 2-D (pitch-linear) images in device memory? E.g., can I bind a texture or a texture object to one? Do the NPP routines work properly? I ask because certain routines might require a particular alignment of the buffer's start address.

Note that I am mainly interested in stability issues. I suppose there might be minor performance penalties for these sub-images, but that is not my main concern.

In particular, I would like to know whether the alignment restriction for the buffer base address mentioned in the 'cudaBindTexture2D' documentation here:

"Since the hardware enforces an alignment requirement on texture base addresses, cudaBindTexture2D() returns in *offset a byte offset that must be applied to texture fetches in order to read from the desired memory."

also applies to texture objects (for CC >= 3.0 GPUs)?

Mohsen
user2454869

1 Answer


Any bound texture (whether via Texture Reference or Texture Object API) should satisfy the alignment requirement(s) provided by cudaGetDeviceProperties, in order to have a direct mapping between data coordinates and texture coordinates:

  1. Any bound texture should satisfy the alignment returned via textureAlignment (in bytes). Allocations provided by cudaMalloc and similar will satisfy this (for the starting address of the allocation).
  2. A 2D bound texture should (for each row in the texture) satisfy the alignment returned via texturePitchAlignment. Allocations provided by (for example) cudaMallocPitch will satisfy this.
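The two rules above can be checked on the host before a sub-image is bound. The following is only a sketch: `checkSubImage` and `SubImageAlignment` are hypothetical names (not part of CUDA), addresses are passed as integers, and the two alignment values are assumed to have been read from the `textureAlignment` and `texturePitchAlignment` fields filled in by cudaGetDeviceProperties.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Result of checking one sub-image against the two texture alignment rules.
struct SubImageAlignment {
    bool baseAligned;   // rule 1: start address vs. textureAlignment
    bool pitchAligned;  // rule 2: row pitch vs. texturePitchAlignment
};

// Hypothetical helper (not a CUDA API). The alignment arguments are assumed
// to come from cudaDeviceProp; masterBase is the device pointer cast to an
// integer; (x0, y0) is the top-left texel of the sub-image.
inline SubImageAlignment checkSubImage(std::uintptr_t masterBase,
                                       std::size_t pitchBytes,
                                       std::size_t x0, std::size_t y0,
                                       std::size_t texelBytes,
                                       std::size_t textureAlignment,
                                       std::size_t texturePitchAlignment)
{
    assert(textureAlignment != 0 && texturePitchAlignment != 0);
    // Start address of the sub-image inside the pitch-linear master image.
    const std::uintptr_t subBase =
        masterBase + y0 * pitchBytes + x0 * texelBytes;
    return { subBase % textureAlignment == 0,
             pitchBytes % texturePitchAlignment == 0 };
}
```

A sub-image shares the pitch of its master image, so only the first flag is expected to fail for an arbitrary ROI origin.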

NPP should work properly with any properly specified ROI.

Note that your document link is quite old. Current docs can be found here.

This question/answer may be of interest as well.

Community
Robert Crovella
  • Restriction 2 (regarding pitch) is no problem because my sub-image of course has the same pitch as the master image, and my master images are always created via cudaMallocPitch. But restriction 1 (regarding base address) will not be fulfilled by a sub-image. So as a workaround, when creating the texture object I will pass it a properly aligned base address and store the offset (to the 'real' base address of the sub-image), divided by the texel size, which I then add in the 'tex2D' calls to correct for this adjustment of the base address. In fact, this seems to be more or less – user2454869 Sep 13 '14 at 08:36
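The offset bookkeeping described in this comment might look roughly like the sketch below. `alignSubImageBase` and `AlignedTextureBase` are hypothetical names, the address is handled as an integer, and `textureAlignment` is assumed to be the value reported by cudaGetDeviceProperties. The sketch also assumes the byte offset is a whole number of texels, which it asserts rather than guarantees.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Aligned address to hand to the texture, plus the x shift for fetches.
struct AlignedTextureBase {
    std::uintptr_t alignedBase;  // subBase rounded down to textureAlignment
    std::size_t texelOffsetX;    // texels to add to x in every fetch
};

// Hypothetical helper (not a CUDA API): rounds the sub-image start address
// down to the required alignment and reports the shift in texels.
inline AlignedTextureBase alignSubImageBase(std::uintptr_t subBase,
                                            std::size_t texelBytes,
                                            std::size_t textureAlignment)
{
    assert(texelBytes != 0 && textureAlignment != 0);
    const std::size_t byteOffset = subBase % textureAlignment;
    // The shift must be a whole number of texels to be expressible in x.
    assert(byteOffset % texelBytes == 0);
    return { subBase - byteOffset, byteOffset / texelBytes };
}
```

With this approach the texture would presumably have to be created with a width of at least the sub-image width plus `texelOffsetX`, and each fetch would add `texelOffsetX` to its x coordinate; whether the widened row still fits within the pitch would need to be checked separately.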