I'm new to OpenCL and have some problems with array addition. I'm using the code provided in the link below,
and I added some parts to measure the performance of the GPU:
clFinish(commandQueue);
// Queue the kernel up for execution across the array
// (profiling requires a queue created with CL_QUEUE_PROFILING_ENABLE)
cl_ulong start, end;
cl_event k_events;
errNum = clEnqueueNDRangeKernel(commandQueue, kernel, 1, NULL,
                                globalWorkSize, localWorkSize,
                                0, NULL, &k_events);
// Profiling timestamps are only valid once the kernel has completed
clWaitForEvents(1, &k_events);
clGetEventProfilingInfo(k_events, CL_PROFILING_COMMAND_START,
                        sizeof(cl_ulong), &start, NULL);
clGetEventProfilingInfo(k_events, CL_PROFILING_COMMAND_END,
                        sizeof(cl_ulong), &end, NULL);
double GPUTime = (double)(end - start); // nanoseconds
And this to measure the CPU time:
LARGE_INTEGER CPUstart, finish, freq;
QueryPerformanceFrequency(&freq);
QueryPerformanceCounter(&CPUstart);
for (int i = 0; i < ARRAY_SIZE; i++) {
    result[i] = a[i] + b[i];
}
QueryPerformanceCounter(&finish);
// Elapsed ticks divided by ticks per nanosecond gives nanoseconds
double timeCPU = (finish.QuadPart - CPUstart.QuadPart) / ((double)freq.QuadPart / 1000000000.0);
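For reference, the tick-to-nanosecond conversion used for the CPU timing can be sketched in portable C, independent of the Windows API. The helper name `ticks_to_ns` and its parameters are hypothetical and only illustrate the arithmetic (elapsed ticks times 1e9, divided by ticks per second); it is not part of the original code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: convert an elapsed tick count to nanoseconds.
 * ticks_per_sec plays the role of the value returned by
 * QueryPerformanceFrequency in the snippet above. */
static double ticks_to_ns(int64_t start_ticks, int64_t end_ticks,
                          int64_t ticks_per_sec)
{
    assert(ticks_per_sec > 0); /* a zero frequency would divide by zero */
    return (double)(end_ticks - start_ticks) * 1e9 / (double)ticks_per_sec;
}
```

With made-up values, a counter running at 2,500,000 ticks per second that advances by 2,500 ticks corresponds to one millisecond, i.e. 1,000,000 ns.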
The first problem I encountered is the array size: it can't go beyond 10000. If I make it larger, the program just crashes. How can I fix this?
The second problem is the performance: the GPU/CPU ratio varies too widely, from about 13% to about 210%. Why does this happen, and can you suggest a fix?
Edit: I figured out the second problem. The slowdown was caused by the power-saving mode, which set the core/memory clocks much lower than the defaults. After using a program to lock the clocks, the performance is stable at roughly 150-300% (GPU/CPU).
Good case:
GPU time :632667 nanosecs.
CPU time : 990023 nanosecs.
GPU/CPU ratio : 156.484 percent.
And a bad one:
GPU time :6.83267e+006 nanosecs.
CPU time : 1.00756e+006 nanosecs.
GPU/CPU ratio : 14.7462 percent.
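The printed figures are consistent with the ratio being computed as CPU time divided by GPU time, times 100, so values above 100% mean the GPU was faster. A minimal sketch of that arithmetic follows; the function name `speedup_percent` is hypothetical, and the formula is inferred from the numbers above rather than taken from the original code.

```c
#include <assert.h>

/* Hypothetical helper: express CPU time relative to GPU time as a percentage.
 * Both arguments are in the same unit (nanoseconds in the output above). */
static double speedup_percent(double cpu_ns, double gpu_ns)
{
    assert(gpu_ns > 0.0); /* avoid dividing by zero */
    return cpu_ns / gpu_ns * 100.0;
}
```

Plugging in the good case (990023 ns CPU, 632667 ns GPU) gives roughly 156.48%, and the bad case (1.00756e+006 ns CPU, 6.83267e+006 ns GPU) gives roughly 14.75%, matching the reported ratios.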
Any ideas would be appreciated. Thank you :D
PS: The CPU is a Core i3-370M and the GPU is an HD 5470. I use VS2008 on Windows 7.