I am trying to run a simple Hello World program that should print "Hello World" using the GPU (an Nvidia GTX 960 in my case). I am compiling it with Code::Blocks on 64-bit Windows 10. Apparently there are no newer driver versions out, so I am stuck with OpenCL 1.2.
What I have done so far:
- Installed the NVIDIA CUDA SDK.
- Set up the Code::Blocks compiler to search the include directory and added OpenCL.lib to the linker settings.
- Copy-pasted the Hello World program from here: http://dhruba.name/2012/10/06/opencl-cookbook-hello-world-using-cpp-host-binding/
- Created the "opencl_hello_world.cl" file, as described in the link.
I am a noob in OpenCL and just wanted to test if it works. Every "Hello-World" code I've tried so far fails with different errors.
#define __CL_ENABLE_EXCEPTIONS
#include <fstream>
#include <iostream>
#include <iterator>
#include <CL/cl.hpp>
#include <CL/opencl.h>
using namespace std;
int main () {
    vector<cl::Platform> platforms;
    vector<cl::Device> devices;
    vector<cl::Kernel> kernels;
    try {
        // create platform
        cl::Platform::get(&platforms);
        platforms[0].getDevices(CL_DEVICE_TYPE_GPU, &devices);
        // create context
        cl::Context context(devices);
        // create command queue
        cl::CommandQueue queue(context, devices[0]);
        // load opencl source
        ifstream cl_file("opencl_hello_world.cl");
        string cl_string(istreambuf_iterator<char>(cl_file), (istreambuf_iterator<char>()));
        cl::Program::Sources source(1, make_pair(cl_string.c_str(),
                                                 cl_string.length() + 1));
        // create program
        cl::Program program(context, source);
        // compile opencl source
        program.build(devices);
        // load named kernel from opencl source
        cl::Kernel kernel(program, "hello_world");
        // create a message to send to kernel
        char* message = "Hello World!";
        int messageSize = 12;
        // allocate device buffer to hold message
        cl::Buffer buffer(CL_MEM_READ_ONLY | CL_MEM_COPY_HOST_PTR,
                          sizeof(char) * messageSize, message);
        // set message as kernel argument
        kernel.setArg(0, buffer);
        kernel.setArg(1, sizeof(int), &messageSize);
        // execute kernel
        queue.enqueueTask(kernel);
        // wait for completion
        queue.finish();
        cout << endl;
    } catch (cl::Error e) {
        cout << endl << e.what() << " : " << e.err() << endl;
    }
    return 0;
}
The code fails with the following output:
||=== Build: Debug in OpenCLTest (compiler: GNU GCC Compiler) ===|
obj\Debug\main.o||In function `getPlatformVersion':|
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\include\CL\cl.hpp|1735|undefined reference to `clGetPlatformInfo@20'|
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\include\CL\cl.hpp|1737|undefined reference to `clGetPlatformInfo@20'|
obj\Debug\main.o||In function `getDevicePlatformVersion':|
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\include\CL\cl.hpp|1744|undefined reference to `clGetDeviceInfo@20'|
obj\Debug\main.o||In function `ZN2cl6detail16ReferenceHandlerIP13_cl_device_idE7releaseES3_':|
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\include\CL\cl.hpp|1619|undefined reference to `clReleaseDevice@4'|
obj\Debug\main.o||In function `ZN2cl6detail16ReferenceHandlerIP11_cl_contextE6retainES3_':|
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\include\CL\cl.hpp|1652|undefined reference to `clRetainContext@4'|
obj\Debug\main.o||In function `ZN2cl6detail16ReferenceHandlerIP11_cl_contextE7releaseES3_':|
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\include\CL\cl.hpp|1654|undefined reference to `clReleaseContext@4'|
obj\Debug\main.o||In function `ZN2cl6detail16ReferenceHandlerIP17_cl_command_queueE7releaseES3_':|
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\include\CL\cl.hpp|1663|undefined reference to `clReleaseCommandQueue@4'|
obj\Debug\main.o||In function `ZN2cl6detail16ReferenceHandlerIP7_cl_memE7releaseES3_':|
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\include\CL\cl.hpp|1672|undefined reference to `clReleaseMemObject@4'|
obj\Debug\main.o||In function `ZN2cl6detail16ReferenceHandlerIP11_cl_programE7releaseES3_':|
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\include\CL\cl.hpp|1690|undefined reference to `clReleaseProgram@4'|
obj\Debug\main.o||In function `ZN2cl6detail16ReferenceHandlerIP10_cl_kernelE7releaseES3_':|
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\include\CL\cl.hpp|1699|undefined reference to `clReleaseKernel@4'|
obj\Debug\main.o||In function `ZN2cl6detail16ReferenceHandlerIP9_cl_eventE7releaseES3_':|
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\include\CL\cl.hpp|1708|undefined reference to `clReleaseEvent@4'|
obj\Debug\main.o||In function `ZNK2cl8Platform10getDevicesEyPSt6vectorINS_6DeviceESaIS2_EE':|
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\include\CL\cl.hpp|2210|undefined reference to `clGetDeviceIDs@24'|
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\include\CL\cl.hpp|2216|undefined reference to `clGetDeviceIDs@24'|
obj\Debug\main.o||In function `ZN2cl8Platform3getEPSt6vectorIS0_SaIS0_EE':|
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\include\CL\cl.hpp|2315|undefined reference to `clGetPlatformIDs@12'|
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\include\CL\cl.hpp|2322|undefined reference to `clGetPlatformIDs@12'|
obj\Debug\main.o||In function `ZN2cl7ContextC1ERKSt6vectorINS_6DeviceESaIS2_EEPiPFvPKcPKvjPvESC_S7_':|
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\include\CL\cl.hpp|2475|undefined reference to `clCreateContext@24'|
obj\Debug\main.o||In function `ZN2cl7ContextC1EyPiPFvPKcPKvjPvES6_S1_':|
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\include\CL\cl.hpp|2586|undefined reference to `clCreateContextFromType@24'|
obj\Debug\main.o||In function `ZN2cl6BufferC1EyjPvPi':|
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\include\CL\cl.hpp|3158|undefined reference to `clCreateBuffer@24'|
obj\Debug\main.o||In function `ZN2cl6Kernel6setArgEjjPKv':|
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\include\CL\cl.hpp|4955|undefined reference to `clSetKernelArg@16'|
obj\Debug\main.o||In function `ZN2cl7ProgramC1ERKNS_7ContextERKSt6vectorISt4pairIPKcjESaIS8_EEPi':|
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\include\CL\cl.hpp|5054|undefined reference to `clCreateProgramWithSource@20'|
obj\Debug\main.o||In function `ZNK2cl7Program5buildERKSt6vectorINS_6DeviceESaIS2_EEPKcPFvP11_cl_programPvESB_':|
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\include\CL\cl.hpp|5228|undefined reference to `clBuildProgram@24'|
obj\Debug\main.o||In function `ZN2cl6KernelC1ERKNS_7ProgramEPKcPi':|
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\include\CL\cl.hpp|5437|undefined reference to `clCreateKernel@12'|
obj\Debug\main.o||In function `ZN2cl12CommandQueueC1ERKNS_7ContextERKNS_6DeviceEyPi':|
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\include\CL\cl.hpp|5526|undefined reference to `clCreateCommandQueue@20'|
obj\Debug\main.o||In function `ZNK2cl12CommandQueue11enqueueTaskERKNS_6KernelEPKSt6vectorINS_5EventESaIS5_EEPS5_':|
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\include\CL\cl.hpp|6334|undefined reference to `clEnqueueTask@20'|
obj\Debug\main.o||In function `ZNK2cl12CommandQueue6finishEv':|
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\include\CL\cl.hpp|6551|undefined reference to `clFinish@4'|
obj\Debug\main.o||In function `ZN2cl6Kernel6setArgINS_6BufferEEEijRKT_':|
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\include\CL\cl.hpp|4948|undefined reference to `clSetKernelArg@16'|
||error: ld returned 1 exit status|
||=== Build failed: 27 error(s), 0 warning(s) (0 minute(s), 0 second(s)) ===|
TL;DR, as far as I can see: every error points to a different line in the "cl.hpp" file (undefined reference to X, where X is an OpenCL function).
The problem is that these errors all point into the "cl.hpp" file, which I am not responsible for fixing. If the library could not be found, I could understand it; however, there seems to be something wrong with the "cl.hpp" file itself.
I tried including "CL/cl.h" instead, which results in even more errors like "vector was not declared in this scope", "cl has not been declared", "platforms was not declared in this scope", etc.
My questions are:
What's wrong with my code?
Are there known bugs in the cl.hpp file? Is this one of them?
What is the difference between and ?
How can I upgrade to the latest version of OpenCL? Every driver at the moment only has OpenCL 1.2 for some reason.
Is there any information on version changes in OpenCL? It seems like every tutorial I try is either outdated, only works on version 1.0, or only works in C. Are there any tutorials for C++ and OpenCL 1.2 as of 2016?