OpenCL version of cudaMemcpyToSymbol & optimization

Question

Can someone tell me the OpenCL equivalent of cudaMemcpyToSymbol for copying __constant data to the device and reading it back on the host?
Or will the usual clEnqueueWriteBuffer(...) do the job?
I could not find much help in the forum. A few lines of demo code would suffice.

Also, should I expect the same kind of optimization from the constant cache in OpenCL as in CUDA?

Thanks

score 3 · Answer 1 · edited Sep 07 '12 at 02:18

I have seen people use cudaMemcpyToSymbol() to set up constants for the kernel, where the compiler could take advantage of those constants when optimizing the code. If you instead set up a memory buffer in OpenCL to pass such constants to the kernel, the compiler cannot use their values to optimize the code.

Instead, the solution I found is to replace the cudaMemcpyToSymbol() call with a print to a string that defines the symbol for the compiler. The OpenCL compiler accepts definitions of the form -D FOO=bar in its build options, which sets the symbol FOO to the value bar.
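A minimal sketch of the "print to a string" step, assuming an integer constant. The helper name format_define_option is hypothetical and only for illustration; it is not part of the OpenCL API.

```c
#include <stdio.h>

/* Hypothetical helper: writes "-D NAME=value" into buf.
 * Returns 0 on success, -1 if the text did not fit or formatting failed. */
int format_define_option(char *buf, size_t buf_size, const char *name, int value)
{
    int n = snprintf(buf, buf_size, "-D %s=%d", name, value);
    if (n < 0 || (size_t)n >= buf_size) {
        return -1; /* formatting error or truncated output */
    }
    return 0;
}
```

The resulting string would then be passed as the options argument of clBuildProgram(), so the kernel source sees FOO as a compile-time constant. Changing the value means rebuilding the program, which is the trade-off compared with a buffer.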

score 0 · Accepted Answer · answered May 02 '12 at 12:04

Not sure about OpenCL.Net, but in plain OpenCL: yes, clEnqueueWriteBuffer is enough (just remember to create the buffer with the CL_MEM_READ_ONLY flag set).

Here is a demo from the NVIDIA GPU Computing SDK (OpenCL/src/oclQuasirandomGenerator/oclQuasirandomGenerator.cpp):

c_Table[i] = clCreateBuffer(cxGPUContext, CL_MEM_READ_ONLY,
                            QRNG_DIMENSIONS * QRNG_RESOLUTION * sizeof(unsigned int),
                            NULL, &ciErr);
ciErr |= clEnqueueWriteBuffer(cqCommandQueue[i], c_Table[i], CL_TRUE, 0,
                              QRNG_DIMENSIONS * QRNG_RESOLUTION * sizeof(unsigned int),
                              tableCPU, 0, NULL, NULL);
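On the kernel side, a read-only buffer like this can be received through a __constant pointer argument. The sketch below is not taken from the SDK sample: the kernel name scale_by_table, its parameters, and the host-side shim (the macros and g_fake_global_id) are assumptions added so the same source also compiles as plain C for a quick check.

```c
#ifndef __OPENCL_VERSION__
/* Host-side shim (illustration only): makes the kernel compile as plain C. */
#include <stddef.h>
#define __kernel
#define __constant const
#define __global
static size_t g_fake_global_id = 0; /* stand-in for the work-item index */
static size_t get_global_id(unsigned int dim) { (void)dim; return g_fake_global_id; }
#endif

/* Hypothetical kernel: each work-item reads one entry from the constant
 * table (wrapping around its size) and writes twice that value to out. */
__kernel void scale_by_table(__constant unsigned int *table,
                             __global unsigned int *out,
                             unsigned int table_size)
{
    size_t gid = get_global_id(0);
    out[gid] = table[gid % table_size] * 2u;
}
```

The host would bind the buffer with clSetKernelArg() just like any other cl_mem argument; only the address-space qualifier in the kernel signature marks it as constant memory.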

Constant memory in CUDA and in OpenCL is exactly the same and provides the same type of optimization, at least if you use an NVIDIA GPU. On ATI GPUs, it should behave similarly. I doubt that constant memory would give you any benefit over global memory when running on a CPU.

How does the CPU treat local and constant caches (which are absent on a CPU)? — gpuguy, May 04 '12 at 05:51
@gpuguy The actual relation of OpenCL memory concepts to underlying hardware architecture is not explicitly specified, AFAIK. I believe that on CPU they are just parts of usual RAM, and are cached the same way as any other access to RAM (global, texture etc). — aland, May 04 '12 at 06:06
