
All, I have the following lines of code for setting up a 3D image in OpenCL:

```cpp
const size_t NPOLYORDERS = 16;
const size_t NPOLYBINS = 1024;

cl::Image3D my3DImage;

cl::ImageFormat imFormat(CL_R, CL_FLOAT);

my3DImage = cl::Image3D(clContext, CL_MEM_READ_ONLY, imFormat, NPOLYORDERS, NPOLYORDERS, NPOLYBINS);
```

The code runs fine when I use the Intel OpenCL CPU driver (by creating a context with CL_DEVICE_TYPE_CPU), but fails with a segfault when I use the nVidia driver with a TITAN Black (by creating a context with CL_DEVICE_TYPE_GPU).
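For context, the two setups differ only in the device type passed when creating the context, roughly like this (a sketch; the `platform` object comes from my own setup and error handling is omitted):

```cpp
// Sketch of the context creation; "platform" is a cl::Platform selected earlier
// (Intel or NVIDIA). Only the device type differs between the two runs.
cl_context_properties props[] = {
    CL_CONTEXT_PLATFORM, (cl_context_properties)platform(), 0
};

cl::Context clContext(CL_DEVICE_TYPE_GPU, props);   // NVIDIA TITAN Black: later segfault
// cl::Context clContext(CL_DEVICE_TYPE_CPU, props); // Intel CPU driver: works
```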

All of this is on RHEL 6.4 with a 2.6.32-358 kernel, using the latest nVidia driver available, the Intel OpenCL runtime 14.1_x64_4.4.0.118, and the 2014_4.4.0.134_x64 Intel OpenCL SDK.

All of the other code appears to be working on the nVidia device. I can compile the kernel, create contexts, buffers, etc., but this one constructor seems to fail. I checked the maximum sizes allowed for an Image3D using cl::Device::getInfo, and it reports HxWxD limits of 4096x4096x4096, so I'm well below the limit with my 16x16x1024 image.

I also checked that CL_R / CL_FLOAT is a supported image format on the device, which it appears to be.
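Roughly, the checks look like this (a sketch; `clContext` is the context created above, and the exact plumbing in my code is a little different):

```cpp
// Sketch of the limit/format checks; clContext is the context created above.
std::vector<cl::Device> devices = clContext.getInfo<CL_CONTEXT_DEVICES>();
cl::Device device = devices[0];

size_t maxW = device.getInfo<CL_DEVICE_IMAGE3D_MAX_WIDTH>();   // reports 4096
size_t maxH = device.getInfo<CL_DEVICE_IMAGE3D_MAX_HEIGHT>();  // reports 4096
size_t maxD = device.getInfo<CL_DEVICE_IMAGE3D_MAX_DEPTH>();   // reports 4096

// List the 3D image formats supported for CL_MEM_READ_ONLY and look for CL_R / CL_FLOAT.
std::vector<cl::ImageFormat> formats;
clContext.getSupportedImageFormats(CL_MEM_READ_ONLY, CL_MEM_OBJECT_IMAGE3D, &formats);
bool haveRFloat = false;
for (size_t i = 0; i < formats.size(); ++i) {
    if (formats[i].image_channel_order == CL_R &&
        formats[i].image_channel_data_type == CL_FLOAT) {
        haveRFloat = true;
    }
}
```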

At first I thought it was failing because of trying to copy the host memory, but the segfault is occurring before I even enqueue the image read.

The best I've been able to determine from my gdb backtrace is that the problem appears to be in line 4074 of CL/cl.hpp:

```
#0 0x000000000000 in ?? ()
#1 0x00000000004274fe in cl::Image3D::Image3D (this=0x7fffffffffdcb0, context=...,
   flags=140737488345384, format=..., width=0, height=140737488345392, depth=1024, row_pitch=0,
   slice_pitch=0, host_ptr=0x0, err=0x0) at /usr/include/CL/cl.hpp:4074
#2 0x0000000000421986 in clCorrelationMatrixGenerator::initializeOpenCL (
   this=0x7fffffffffdfa8) at ./libs/matrix_generator/OpenCLMatrixGenerator.cc:194
```

As you can see, the width and height arguments to Image3D's constructor look wonky, but I'm not sure whether those are the real values or garbage from values the compiler optimized out.

My questions are thus:

Is there something I'm doing wrong with regard to nVidia cards that doesn't apply to the Intel CPU OpenCL driver? Is there a known binary incompatibility between the Intel SDK and the nVidia OpenCL ICD?

  • Does it work using the C API? – Austin Aug 29 '14 at 23:28
  • There's a few different `cl.hpp`s out there, can you show us line 4074 of yours? My guess is that it's trying to use `clCreateImage`, which won't be implemented on NVIDIA platforms. – jprice Aug 30 '14 at 09:54
  • If it's not implemented, how does one create an image on nVidia? O_o It seems to break the idea of ABI if you can't use clCreateImage, doesn't it? – stix Aug 30 '14 at 18:08
  • Does the C API work? – Austin Aug 30 '14 at 22:35
  • @stix, `clCreateImage` is new to OpenCL 1.2, which NVIDIA don't yet support. In 1.0 and 1.1 there is `clCreateImage2D` and `clCreateImage3D`, which NVIDIA do implement. The C++ wrapper might be using the 1.2 version, rather than the legacy versions. – jprice Aug 31 '14 at 08:25

1 Answer


As some of the commenters have pointed out, the nVidia OpenCL implementation doesn't support clCreateImage, which is what the underlying cl::Image3D constructor ends up calling. This is because nVidia only supports up to OpenCL 1.1, and clCreateImage is an OpenCL 1.2 function.
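Simplified, the dispatch inside the wrapper looks roughly like this (an illustration, not the exact Intel SDK cl.hpp source): with CL_VERSION_1_2 defined at compile time, the constructor calls clCreateImage, whose entry in a 1.1-only ICD dispatch table is NULL, which matches the jump to address 0 in the backtrace.

```cpp
// Simplified illustration of the Image3D constructor dispatch in a 1.2-era cl.hpp
// (not the exact Intel SDK source). With CL_VERSION_1_2 defined, the 1.2 path is taken.
#if defined(CL_VERSION_1_2)
    cl_image_desc desc = {};
    desc.image_type   = CL_MEM_OBJECT_IMAGE3D;
    desc.image_width  = width;
    desc.image_height = height;
    desc.image_depth  = depth;
    // NULL in NVIDIA's 1.1 dispatch table -> call through a null pointer -> segfault
    object_ = ::clCreateImage(context(), flags, &format, &desc, host_ptr, &error);
#else
    object_ = ::clCreateImage3D(context(), flags, &format, width, height, depth,
                                row_pitch, slice_pitch, host_ptr, &error);
#endif
```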

There is, however, a way around this without major refactoring of the code. The cl.hpp shipped with the Intel SDK can fall back to the OpenCL 1.1 entry points for the wrapped C++ functionality; this is enabled by defining CL_USE_DEPRECATED_OPENCL_1_1_APIS before cl.hpp is included.
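For example (a sketch; the macro can equally be passed on the compiler command line, e.g. -DCL_USE_DEPRECATED_OPENCL_1_1_APIS):

```cpp
// Define the 1.1 compatibility macro before the first include of cl.hpp so the
// wrapper can fall back to clCreateImage3D (available on NVIDIA) instead of clCreateImage.
#define CL_USE_DEPRECATED_OPENCL_1_1_APIS
#include <CL/cl.hpp>

// The cl::Image3D construction from the question then works unchanged:
// my3DImage = cl::Image3D(clContext, CL_MEM_READ_ONLY, imFormat,
//                         NPOLYORDERS, NPOLYORDERS, NPOLYBINS);
```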
