
I've modified the Sobel Filter sample to implement non-maximum suppression for a Canny filter. However, the following code generates an exception:

    unsigned char pix00 = pCannyOriginal[ i-1 + (blockIdx.x-1) * blockDim.x];
    unsigned char pix01 = pCannyOriginal[ i+0 + (blockIdx.x-1) * blockDim.x];
    unsigned char pix02 = pCannyOriginal[ i+1 + (blockIdx.x-1) * blockDim.x];
    unsigned char pix10 = pCannyOriginal[ i-1 + (blockIdx.x+0) * blockDim.x];
    unsigned char pix11 = pCannyOriginal[ i+0 + (blockIdx.x+0) * blockDim.x];
    unsigned char pix12 = pCannyOriginal[ i+1 + (blockIdx.x+0) * blockDim.x];
    unsigned char pix20 = pCannyOriginal[ i-1 + (blockIdx.x+1) * blockDim.x];
    unsigned char pix21 = pCannyOriginal[ i+0 + (blockIdx.x+1) * blockDim.x];
    unsigned char pix22 = pCannyOriginal[ i+1 + (blockIdx.x+1) * blockDim.x];

I understand that this causes invalid memory accesses, but the same set of reads on the original texture doesn't generate one. So, does the tex2D function have a mechanism for handling out-of-bounds accesses? And how am I supposed to fix this?

As a side note, using the original lena.pgm doesn't generate any exception, but replacing it with another image does. Does the original lena.pgm contain some extra rows and columns, or am I missing something here?

talonmies
Soroosh Bateni
  • Could you provide more details on the modifications that you made to the sample? Have you tested the original program with the other image that you used with your code? – BenC May 09 '13 at 09:15
  • OK, decrementing both width and height variables before calling the function does solve my problem for now. @BenC I'll upload the whole code somewhere soon, and yes I've tested the original image. – Soroosh Bateni May 09 '13 at 09:17
  • Moreover, you may want to read SO's [CUDA wiki](http://stackoverflow.com/tags/cuda/info). Use and abuse error checking and `cuda-memcheck` to get some more information on invalid memory accesses and the other errors that you may encounter. Here, if `i = 0` / `blockIdx.x = 0`, you do `pCannyOriginal[k]` where `k < 0`. – BenC May 09 '13 at 09:24
  • The code is available [here](https://sourceforge.net/p/gpuedgedetector/wiki/Home/) – Soroosh Bateni May 09 '13 at 20:02

1 Answer

The original code relies on 2D textures:

    unsigned char pix00 = tex2D(tex, (float) i-1, (float) blockIdx.x-1);
    unsigned char pix01 = tex2D(tex, (float) i+0, (float) blockIdx.x-1);
    unsigned char pix02 = tex2D(tex, (float) i+1, (float) blockIdx.x-1);
    unsigned char pix10 = tex2D(tex, (float) i-1, (float) blockIdx.x+0);
    unsigned char pix11 = tex2D(tex, (float) i+0, (float) blockIdx.x+0);
    unsigned char pix12 = tex2D(tex, (float) i+1, (float) blockIdx.x+0);
    unsigned char pix20 = tex2D(tex, (float) i-1, (float) blockIdx.x+1);
    unsigned char pix21 = tex2D(tex, (float) i+0, (float) blockIdx.x+1);
    unsigned char pix22 = tex2D(tex, (float) i+1, (float) blockIdx.x+1);

However, textures are not simple arrays: they support interpolation (see this post) and addressing options such as `cudaAddressModeClamp`, which clamps out-of-range coordinates to the valid range (a negative index reads index 0, and an index past the end reads the last valid index; cf. this other post).

In your code, if you use the same (x, y) indices with a plain linearized array, nothing clamps them: at the image borders you end up reading invalid addresses (x < 0 and/or y < 0, and likewise past the last column or row) unless you handle the borders explicitly.
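One possible precaution is to reproduce the clamping by hand before indexing the linear array. The following is only a minimal sketch: `clampInt` and `readClamped` are hypothetical helper names, and `width`, `height` and `pitch` (the number of bytes between the starts of consecutive rows) are assumed to be passed to your kernel; none of them come from the sample. The `HOST_DEVICE` macro just lets the same code compile with `nvcc` or with a plain C++ compiler.

```cpp
#include <cassert>

// Expands to __host__ __device__ under nvcc and to nothing otherwise,
// so the helpers below can be used in a kernel or tested on the CPU.
#ifdef __CUDACC__
#define HOST_DEVICE __host__ __device__
#else
#define HOST_DEVICE
#endif

// Hypothetical helper: clamp v into the inclusive range [lo, hi].
HOST_DEVICE inline int clampInt(int v, int lo, int hi)
{
    return v < lo ? lo : (v > hi ? hi : v);
}

// Hypothetical helper: read pixel (x, y) from a linear 8-bit image,
// clamping the coordinates to the image borders first.
// Assumption: pitch is the row stride in bytes and pitch >= width.
HOST_DEVICE inline unsigned char readClamped(const unsigned char *img,
                                             int x, int y,
                                             int width, int height, int pitch)
{
    x = clampInt(x, 0, width - 1);
    y = clampInt(y, 0, height - 1);
    return img[y * pitch + x];
}
```

In the kernel, a read would then look like `readClamped(pCannyOriginal, i - 1, (int) blockIdx.x - 1, width, height, pitch)`. The cast matters because `blockIdx.x` is unsigned, so `blockIdx.x - 1` wraps around to a very large value when `blockIdx.x` is 0 instead of becoming -1. Note also that this sketch uses the image pitch as the row stride, not `blockDim.x`; whether that matches your data layout is an assumption you should check against how the buffer was allocated.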

Community
BenC