Modifying CUDA sample causes unsafe access to global memory

Question

I've modified the Sobel Filter sample to implement non-maximum suppression for a Canny filter. However, the following code generates an exception:

    unsigned char pix00 = pCannyOriginal[ i-1 + (blockIdx.x-1) * blockDim.x];
    unsigned char pix01 = pCannyOriginal[ i+0 + (blockIdx.x-1) * blockDim.x];
    unsigned char pix02 = pCannyOriginal[ i+1 + (blockIdx.x-1) * blockDim.x];
    unsigned char pix10 = pCannyOriginal[ i-1 + (blockIdx.x+0) * blockDim.x];
    unsigned char pix11 = pCannyOriginal[ i+0 + (blockIdx.x+0) * blockDim.x];
    unsigned char pix12 = pCannyOriginal[ i+1 + (blockIdx.x+0) * blockDim.x];
    unsigned char pix20 = pCannyOriginal[ i-1 + (blockIdx.x+1) * blockDim.x];
    unsigned char pix21 = pCannyOriginal[ i+0 + (blockIdx.x+1) * blockDim.x];
    unsigned char pix22 = pCannyOriginal[ i+1 + (blockIdx.x+1) * blockDim.x];

I understand that this causes invalid memory accesses, but the same set of reads on the original texture doesn't generate an exception. So, does the tex2D function have a mechanism for handling out-of-bounds accesses? And how am I supposed to fix this?

Also, as a note, using the original lena.pgm doesn't generate any exception, but replacing it with another image does. Does the original lena.pgm contain some extra rows and columns, or am I missing something here?

Could you provide more details on the modifications that you made to the sample? Have you tested the original program with the other image that you used with your code? — BenC, May 09 '13 at 09:15
OK, decrementing both width and height variables before calling the function does solve my problem for now. @BenC I'll upload the whole code somewhere soon, and yes I've tested the original image. — Soroosh Bateni, May 09 '13 at 09:17
Moreover, you may want to read SO's [CUDA wiki](http://stackoverflow.com/tags/cuda/info). Use and abuse error checking and `cuda-memcheck` to get some more information on invalid memory accesses and the other errors that you may encounter. Here, if `i = 0` / `blockIdx.x = 0`, you do `pCannyOriginal[k]` where `k < 0`. — BenC, May 09 '13 at 09:24
The code is available [here](https://sourceforge.net/p/gpuedgedetector/wiki/Home/) — Soroosh Bateni, May 09 '13 at 20:02

score 1 · Accepted Answer · edited May 23 '17 at 12:21

The original code relies on 2D textures:

    unsigned char pix00 = tex2D(tex, (float) i-1, (float) blockIdx.x-1);
    unsigned char pix01 = tex2D(tex, (float) i+0, (float) blockIdx.x-1);
    unsigned char pix02 = tex2D(tex, (float) i+1, (float) blockIdx.x-1);
    unsigned char pix10 = tex2D(tex, (float) i-1, (float) blockIdx.x+0);
    unsigned char pix11 = tex2D(tex, (float) i+0, (float) blockIdx.x+0);
    unsigned char pix12 = tex2D(tex, (float) i+1, (float) blockIdx.x+0);
    unsigned char pix20 = tex2D(tex, (float) i-1, (float) blockIdx.x+1);
    unsigned char pix21 = tex2D(tex, (float) i+0, (float) blockIdx.x+1);
    unsigned char pix22 = tex2D(tex, (float) i+1, (float) blockIdx.x+1);

However, textures are not simple arrays: they support interpolation (see this post) and other options such as the `cudaAddressModeClamp` addressing mode, which clamps out-of-range coordinates to the valid range (a negative index reads element 0, and an index past the end reads the last element; cf. this other post).

In your code, if you use the same (x, y) indices with a plain linearized array, nothing clamps them: at the image borders you end up reading invalid addresses (x < 0 and/or y < 0, or past the last row or column) unless you take proper precautions.
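One possible precaution is to reproduce the clamping by hand before indexing the linear array. The following is only a minimal sketch: `clampIndex`, `readClamped`, `neighbourhoodKernel`, and the `pitch`, `w`, `h` parameters are hypothetical names that do not come from the sample. It assumes, as in the `tex2D` calls above, that `blockIdx.x` selects the row and that threads stride across the columns. The helpers are written so that they also compile as plain C++, which makes them easy to check on the host.

```cpp
// Builds as CUDA under nvcc; the macro lets the helpers also compile as plain C++.
#ifdef __CUDACC__
#define HOST_DEVICE __host__ __device__
#else
#define HOST_DEVICE
#endif

// Hypothetical helper: clamp v into [0, n - 1], mimicking what
// cudaAddressModeClamp does to texture coordinates.
HOST_DEVICE inline int clampIndex(int v, int n)
{
    return v < 0 ? 0 : (v >= n ? n - 1 : v);
}

// Hypothetical helper: read pixel (x, y) from a linear image whose rows are
// `pitch` bytes apart, clamping out-of-range coordinates to the nearest edge.
HOST_DEVICE inline unsigned char readClamped(const unsigned char *img, int pitch,
                                             int w, int h, int x, int y)
{
    return img[clampIndex(y, h) * pitch + clampIndex(x, w)];
}

#ifdef __CUDACC__
// Assumed launch layout: one block per image row, threads striding over columns.
__global__ void neighbourhoodKernel(const unsigned char *in, unsigned char *out,
                                    int pitch, int w, int h)
{
    int y = blockIdx.x;
    if (y >= h) return;

    for (int x = threadIdx.x; x < w; x += blockDim.x) {
        unsigned char pix[3][3];
        for (int dy = -1; dy <= 1; ++dy)
            for (int dx = -1; dx <= 1; ++dx)
                pix[dy + 1][dx + 1] = readClamped(in, pitch, w, h, x + dx, y + dy);

        // Placeholder: the actual non-max suppression would use pix[][] here.
        out[y * pitch + x] = pix[1][1];
    }
}
#endif
```

With this, a read such as `pix00` would become `readClamped(pCannyOriginal, pitch, w, h, i - 1, blockIdx.x - 1)`. Note that the row stride is passed explicitly here; using `blockDim.x` as the stride only addresses the right row when each row is exactly `blockDim.x` bytes wide.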

Do you know any way to take proper precautions? — Soroosh Bateni, May 09 '13 at 20:01
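Another option, close in spirit to the width/height workaround mentioned in the comments on the question, is to process only pixels that have a complete 3x3 neighbourhood and leave the border untouched or set to a fixed value. This is a sketch under assumptions: `hasFullNeighbourhood`, `interiorOnlyKernel`, and the `pitch`, `w`, `h` parameters are hypothetical, and writing 0 to border pixels is just one possible choice.

```cpp
// Builds as CUDA under nvcc; the macro lets the helper also compile as plain C++.
#ifdef __CUDACC__
#define HOST_DEVICE __host__ __device__
#else
#define HOST_DEVICE
#endif

// Hypothetical helper: true when (x, y) has all eight neighbours
// inside a w x h image, so none of the nine reads can go out of range.
HOST_DEVICE inline bool hasFullNeighbourhood(int x, int y, int w, int h)
{
    return x >= 1 && x < w - 1 && y >= 1 && y < h - 1;
}

#ifdef __CUDACC__
// Assumed launch layout: one block per image row, threads striding over columns.
__global__ void interiorOnlyKernel(const unsigned char *in, unsigned char *out,
                                   int pitch, int w, int h)
{
    int y = blockIdx.x;
    if (y >= h) return;

    for (int x = threadIdx.x; x < w; x += blockDim.x) {
        if (!hasFullNeighbourhood(x, y, w, h)) {
            out[y * pitch + x] = 0;  // border value: an arbitrary choice
            continue;
        }
        // All neighbour reads are in range here, so no clamping is needed.
        unsigned char pix00 = in[(y - 1) * pitch + (x - 1)];
        unsigned char pix22 = in[(y + 1) * pitch + (x + 1)];
        // Placeholder result; the real filter would combine all nine neighbours.
        out[y * pitch + x] = pix00 > pix22 ? pix00 : pix22;
    }
}
#endif
```

Compared with clamping, this avoids extra index arithmetic for interior pixels, at the cost of not computing a filtered value on the one-pixel border.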
