3

I am attempting to parallelize a program that does some image processing with openACC. As a part of this processing I have a custom structure defined similar to:

typedef struct {
  RGB *image;
  double property;
} Deep;

Which I am accessing within an array Deep *structPointer.

I have come across some documents for manually copying the entire contents of structPointer to the GPU, which has left me with the following code.

  Deep *structPointer = (Deep*)
    malloc(total_size*sizeof(Deep));
  assert(structPointer);

  int i;

  for (i = 0; i < total_size; i++)
  {
    structPointer[i].image = randomImage(width, height, max);
  }

    dP = acc_copyin( stuctPointer, sizeof( Deep )*total_size ); 

  for ( i=0; i < total_size; i++ ) {
   dA = acc_copyin( structPointer[i].image, sizeof(RGB)*width*height );     //device address in dA
   acc_memcpy_to_device( &dP[i].image, &dA,  sizeof(RGB*) );
  }

This all runs fine, until I try to run a parallel for loop that accesses structPointer and modifies the property attributes of the members of the array based on the contents of RGB *image.

Pseudo code:

#pragma acc parallel loop copyin(inputImage[0:width*height], width, height)
for (i = 0; i < total_size; i++) {
  computeProperty(input_image, structPointer+i, width, height)
}

inline void compProperty (const RGB *A, Deep *B, int width, int height)
{
   B->property = 10;
}

I get:

call to cuStreamSynchronize returned error 700: Illegal address during kernel execution

The output of cuda-memcheck is:

> ========= CUDA-MEMCHECK image2.ppm is a PPM file 256 x 256 image, max value= 255
> ========= Program hit CUDA_ERROR_INVALID_CONTEXT (error 201) due to "invalid device context" on CUDA API call to cuCtxAttach.
> =========     Saved host backtrace up to driver entry point at error
> =========     Host Frame:/usr/lib64/libcuda.so (cuCtxAttach + 0x156) [0x13fc36]
> =========     Host Frame:./genimg_acc [0x13639]
> =========
> ========= Program hit CUDA_ERROR_ILLEGAL_ADDRESS (error 700) due to "an illegal memory access was encountered" on CUDA API call to
> cuStreamSynchronize. call to cuStreamSynchronize returned error 700:
> Illegal address during kernel execution
> =========     Saved host backtrace up to driver entry point at error
> =========     Host Frame:/usr/lib64/libcuda.so (cuStreamSynchronize + 0x13d) [0x149a9d]
> =========     Host Frame:./genimg_acc [0x15856]
> =========
> ========= Program hit CUDA_ERROR_ILLEGAL_ADDRESS (error 700) due to "an illegal memory access was encountered" on CUDA API call to
> cuCtxSynchronize.
> =========     Saved host backtrace up to driver entry point at error
> =========     Host Frame:/usr/lib64/libcuda.so (cuCtxSynchronize + 0x127) [0x13ee37]

Note that the program runs when compiled without openACC and will process correctly when run in a single thread.

challett
  • 906
  • 6
  • 16

1 Answers1

1

OK I found a reference for OpenACC Deep Copying which may be what you're looking at already based on the Deep name. Looking at Figure 9 on page 7, they give you an example of doing a deep copy on a structure containing both scalars and pointers.

One must use the pointer returned by acc_copyin to access the array of structures within the parallelized code--namely dP instead of structPointer. The following code should fix the problem.

#pragma acc parallel loop copyin(inputImage[0:width*height], width, height)
for (i = 0; i < total_size; i++) {
  computeProperty(input_image, dP+i, width, height)
}
Kurt Stutsman
  • 3,994
  • 17
  • 23
  • Nevermind, it looks like there is just an `acc_copyin()` and `acc_copyout()`. I'm looking over the spec for OpenACC to see if I can give you any other ideas. – Kurt Stutsman Feb 27 '16 at 23:21
  • I appreciate the effort. Thank you! – challett Feb 27 '16 at 23:26
  • Oh I just realized the problem... well at least a big one. You need to be using `dP` not `structPointer` in your parallelized code. – Kurt Stutsman Feb 28 '16 at 00:08
  • That has fixed it. Thank you! The source I was using had listed that `structPointer` was "now available on the device". Update your answer and I will accept it. – challett Feb 28 '16 at 00:13