OpenCL/OpenGL Interop with Multiple GPUs

Question

I'm having trouble using multiple GPUs with OpenCL/OpenGL interop. I'm trying to write an application which renders the result of an intensive computation. In the end it will run an optimization problem, and then, based on the result, render something to the screen. As a test case, I'm starting with the particle simulation example code from this course: http://web.engr.oregonstate.edu/~mjb/sig13/

The example code creates an OpenGL context, then creates an OpenCL context that shares its state, using the cl_khr_gl_sharing extension. Everything works fine when I use a single GPU. Creating a context looks like this:

  // 3. create an opencl context based on the opengl context:
  cl_context_properties props[ ] =
  {
      CL_GL_CONTEXT_KHR, (cl_context_properties) glXGetCurrentContext( ),
      CL_GLX_DISPLAY_KHR, (cl_context_properties) glXGetCurrentDisplay( ),
      CL_CONTEXT_PLATFORM, (cl_context_properties) Platform,
      0
  };

  cl_context Context = clCreateContext( props, 1, Device, NULL, NULL, &status );
  if( status != CL_SUCCESS) 
  {
      PrintCLError( status, "clCreateContext: " );
      exit(1);
  }
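PrintCLError comes from the course example and its source is not shown here. As a rough stand-in, the sketch below maps the status codes most relevant to clCreateContext with GL sharing to their names. The function name cl_status_name is hypothetical; the numeric values are the ones defined in the Khronos cl.h and cl_gl.h headers.

```c
#include <stddef.h>

/* Hypothetical helper: name a few OpenCL status codes.
   Numeric values match the definitions in the Khronos cl.h / cl_gl.h headers. */
const char *cl_status_name(int status)
{
    switch (status) {
    case 0:     return "CL_SUCCESS";
    case -2:    return "CL_DEVICE_NOT_AVAILABLE";
    case -6:    return "CL_OUT_OF_HOST_MEMORY";
    case -30:   return "CL_INVALID_VALUE";
    case -32:   return "CL_INVALID_PLATFORM";
    case -33:   return "CL_INVALID_DEVICE";
    case -59:   return "CL_INVALID_OPERATION";
    case -60:   return "CL_INVALID_GL_OBJECT";
    case -64:   return "CL_INVALID_PROPERTY";
    case -1000: return "CL_INVALID_GL_SHAREGROUP_REFERENCE_KHR";
    default:    return "unknown status";
    }
}
```

Printing the name next to the raw number makes it easier to tell an interop failure (-59 or -1000) apart from a plain device or platform mistake.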

Later on, the example creates shared CL/GL buffers with clCreateFromGLBuffer.

Now, I would like to create a context from two GPU devices:

cl_context Context = clCreateContext( props, 2, Device, NULL, NULL, &status );

I've successfully opened the devices, and can query that they both support cl_khr_gl_sharing, and both work individually. However, when attempting to create the context as above, I get

CL_INVALID_OPERATION

This is one of the errors that the cl_khr_gl_sharing extension specifies for clCreateContext (the code itself is part of core OpenCL). The extension description says:

CL_INVALID_OPERATION if a context or share group object was specified for one of CGL, EGL, GLX, or WGL and any of the following conditions hold:

The OpenGL implementation does not support the window-system binding API for which a context or share group objects was specified.

More than one of the attributes CL_CGL_SHAREGROUP_KHR, CL_EGL_DISPLAY_KHR, CL_GLX_DISPLAY_KHR, and CL_WGL_HDC_KHR is set to a non-default value.

Both of the attributes CL_CGL_SHAREGROUP_KHR and CL_GL_CONTEXT_KHR are set to non-default values.

Any of the devices specified in the devices argument cannot support OpenCL objects which share the data store of an OpenGL object, as described in section 9.12.

That description doesn't seem to fit any of my cases exactly. Is it not possible to do OpenCL/OpenGL interop with multiple GPUs? Or is it that I have heterogeneous hardware? I printed out a few parameters from my enumerated devices. I've just taken two random GPUs that I could get my hands on.

PlatformID: 18483216
Num Devices: 2

-------- Device 00 ---------
CL_DEVICE_NAME: GeForce GTX 285
CL_DEVICE_VENDOR: NVIDIA Corporation
CL_DEVICE_VERSION: OpenCL 1.0 CUDA
CL_DRIVER_VERSION: 304.88
CL_DEVICE_MAX_COMPUTE_UNITS: 30
CL_DEVICE_MAX_CLOCK_FREQUENCY: 1476
CL_DEVICE_TYPE: CL_DEVICE_TYPE_GPU

-------- Device 01 ---------
CL_DEVICE_NAME: Quadro FX 580
CL_DEVICE_VENDOR: NVIDIA Corporation
CL_DEVICE_VERSION: OpenCL 1.0 CUDA
CL_DRIVER_VERSION: 304.88
CL_DEVICE_MAX_COMPUTE_UNITS: 4
CL_DEVICE_MAX_CLOCK_FREQUENCY: 1125
CL_DEVICE_TYPE: CL_DEVICE_TYPE_GPU

cl_khr_gl_sharing is supported on dev 0.
cl_khr_gl_sharing is supported on dev 1.
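The question does not show how the "supported" lines above were produced. Assuming they come from searching the space-separated string returned for CL_DEVICE_EXTENSIONS, a whole-token comparison is safer than a bare strstr, which would also match a longer extension name that merely starts with the same text. The helper name has_extension is hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Returns true if `name` occurs as a complete space-separated token
   in `ext_list` (for example the CL_DEVICE_EXTENSIONS string). */
bool has_extension(const char *ext_list, const char *name)
{
    size_t len = strlen(name);
    if (len == 0)
        return false;

    const char *p = ext_list;
    while ((p = strstr(p, name)) != NULL) {
        bool starts_token = (p == ext_list) || (p[-1] == ' ');
        bool ends_token   = (p[len] == ' ') || (p[len] == '\0');
        if (starts_token && ends_token)
            return true;
        p += len;  /* partial match only: keep scanning after it */
    }
    return false;
}
```

Note that a device advertising the extension only says it can share with some OpenGL context; it does not by itself say the device is associated with the GL context that is current in this process.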

Note that if I create the context without the interop portion (such that the props array looks like below) then it successfully creates the context, but obviously cannot share buffers with the OpenGL side of the application.

cl_context_properties props[ ] =
{
   CL_CONTEXT_PLATFORM, (cl_context_properties) Platform,
   0
};
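Two of the conditions quoted from the extension text concern only the combination of attributes in the property list, so they can be checked mechanically. The sketch below does that over a zero-terminated list of (name, value) pairs. It uses intptr_t, which is what cl_context_properties is a typedef for, and local enum names (hypothetical) whose values are copied from the Khronos cl.h and cl_gl.h headers. It assumes a value of 0 is the default for each attribute.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local names (hypothetical); values as defined in cl.h / cl_gl.h. */
enum {
    PROP_CONTEXT_PLATFORM = 0x1084, /* CL_CONTEXT_PLATFORM   */
    PROP_GL_CONTEXT       = 0x2008, /* CL_GL_CONTEXT_KHR     */
    PROP_EGL_DISPLAY      = 0x2009, /* CL_EGL_DISPLAY_KHR    */
    PROP_GLX_DISPLAY      = 0x200A, /* CL_GLX_DISPLAY_KHR    */
    PROP_WGL_HDC          = 0x200B, /* CL_WGL_HDC_KHR        */
    PROP_CGL_SHAREGROUP   = 0x200C  /* CL_CGL_SHAREGROUP_KHR */
};

/* Returns true if the list trips either attribute-combination condition:
   more than one window-system attribute set, or both a GL context and a
   CGL share group set. */
bool props_conflict(const intptr_t *props)
{
    int window_system_attrs = 0;
    bool has_gl_context = false;
    bool has_cgl_sharegroup = false;

    for (const intptr_t *p = props; p[0] != 0; p += 2) {
        if (p[1] == 0)
            continue;  /* assumed default value: treated as not set */
        switch (p[0]) {
        case PROP_GL_CONTEXT:
            has_gl_context = true;
            break;
        case PROP_CGL_SHAREGROUP:
            has_cgl_sharegroup = true;
            window_system_attrs++;
            break;
        case PROP_EGL_DISPLAY:
        case PROP_GLX_DISPLAY:
        case PROP_WGL_HDC:
            window_system_attrs++;
            break;
        default:
            break;  /* e.g. the platform entry: irrelevant here */
        }
    }
    return window_system_attrs > 1 || (has_gl_context && has_cgl_sharegroup);
}
```

A list shaped like the interop one in the question (GL context, GLX display, platform) passes this check, which leaves the first and last quoted conditions as the candidates for the CL_INVALID_OPERATION.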

Not necessarily the cause of your problem, but if I were you I'd first update my drivers. They're quite old; for instance, the current driver version for 64-bit Linux for your GTX 285 is 319.32 (325.08 in beta). — CaptainObvious, Jul 30 '13 at 22:15
Does your OpenCL context include both GPUs? You need to do that for the implementation to move data between the two GPUs so that results from the one GPU can be rendered by the other. — chippies, Jul 31 '13 at 16:12
@chippies I'm getting the error when trying to create the OpenCL context. In the code up there, Device is an array that contains both GPUs. — matth, Jul 31 '13 at 20:00
@CaptainObvious I'm using the nvidia driver that is packaged for Fedora in rpmfusion, which appears to be from April. I'll try the newer version from nvidia's website, but I'm skeptical that much will have changed in three months. Thanks. — matth, Jul 31 '13 at 20:06
Something similar happened to me when I tried two kernels on a single GPU with a single interop context: I could not create the GL buffer. — huseyin tugrul buyukisik, Aug 14 '13 at 19:31
I can't tell you if it's possible to share a context across two different GPUs, but you can damn sure share one across two equal GPUs. You can even share the context across two GPUs and a CPU. I successfully shared it across my two Nvidia GTX 460s. But I never had the chance to test with heterogeneous GPUs. — Tara, Mar 09 '14 at 14:35

score 2 · Answer 1 · edited May 23 '17 at 12:26

Several related Questions and Examples

Here's a related example of a pure OpenGL approach to shared processing between multiple GPUs.
Another pure OpenGL multiple-GPU question.
A producer/consumer example using multiple GPUs. See the producer source file for the calls that make the context current (it looks Windows-specific, but the flow will be similar elsewhere). See glContext for details.


    bool stageProducer::preExecution() 
    {
        if(!glContext::getInstance().makeCurrent(_rc))
        {
            window::getInstance().messageBoxWithLastError("wglMakeCurrent");
            return false;
        }
        glBindFramebuffer(GL_DRAW_FRAMEBUFFER, _fboID);
        return true;
    }

OpenCL specific, but relevant to this question:

"If you enqueue a write to the buffer on queueA(deviceA) then OpenCL will use that device to do the write. However, if you then use the buffer on queueB(deviceB) in the same context, OpenCL will recognize that deviceA has the most recent data and will move it over to deviceB before using it. In short, as long as you use events to ensure that no two devices are trying to access the same memory object at the same time, OpenCL will make sure that each use of the memory object has the most recent data, regardless of which device last used it."

I assume when you take OpenGL out of the equation sharing memory between gpus works as expected?

score 1 · Accepted Answer · answered Mar 16 '15 at 14:30

When you call these two lines:

      CL_GL_CONTEXT_KHR, (cl_context_properties) glXGetCurrentContext( ),
      CL_GLX_DISPLAY_KHR, (cl_context_properties) glXGetCurrentDisplay( ),

the calls need to come from inside a new thread with a new OpenGL context. You can usually only associate one OpenCL context with one OpenGL context for one device at a time per thread.
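This answer was accepted without being tested, so the following is only a sketch of one way to act on it. The cl_khr_gl_sharing extension lets you ask which devices are associated with a GL context (clGetGLContextInfoKHR with CL_DEVICES_FOR_GL_CONTEXT_KHR). Given that list, you could put only those devices into the interop context and treat the others as compute-only. The partitioning step itself needs no OpenCL, so the code below uses void * as a stand-in for cl_device_id; the names device_handle and split_devices are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

typedef void *device_handle;  /* stand-in for cl_device_id */

/* Splits `all` into devices that also appear in `gl_devs` (copied to
   `interop`) and the remaining ones (copied to `compute_only`).
   Both output arrays must have room for n_all entries.
   Returns the number of interop devices; *n_compute receives the rest. */
size_t split_devices(const device_handle *all, size_t n_all,
                     const device_handle *gl_devs, size_t n_gl,
                     device_handle *interop, device_handle *compute_only,
                     size_t *n_compute)
{
    size_t n_interop = 0;
    *n_compute = 0;

    for (size_t i = 0; i < n_all; i++) {
        bool shared = false;
        for (size_t j = 0; j < n_gl; j++) {
            if (all[i] == gl_devs[j]) {
                shared = true;
                break;
            }
        }
        if (shared)
            interop[n_interop++] = all[i];
        else
            compute_only[(*n_compute)++] = all[i];
    }
    return n_interop;
}
```

If the interop list comes back with a single device, that would be consistent with the CL_INVALID_OPERATION seen when both GPUs are passed together with the GL sharing properties.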

That Realty Programmer Guy


Thanks for taking the time to answer this. I haven't used OpenCL in a while, and I don't have an easy way to test this anymore, but your answer sounds plausible so I'll just accept it. — matth, Mar 16 '15 at 21:21
