I'm new to OpenCL and would like to learn more about it.
I have tried to find information about how to build 'complex functions' using OpenCL. By 'complex functions', I mean functions that can be parallelized and that call another function which can be parallelized too. I have seen links like:
Here is my question, illustrated with an example:
// A and B are int vectors
// M and N have different values: M != N
for (int i = 0; i <= M - 2; i++) {
    for (int j = i + 1; j <= M - 1; j++) {
        distance = calculate_distance(A[i], B[j]);
        // more sequential instructions
    }
}
calculate_distance concatenates both vectors and contains a loop:
for (int i = 0; i <= N - 1; i++)
    // Some sequential instructions
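To make the example concrete, here is a minimal sequential sketch in plain C of the two fragments above. Several things in it are assumptions that are not part of my real code: A and B are treated as M rows of N ints stored row-major, the body of calculate_distance is a hypothetical placeholder (a sum of absolute differences, with no concatenation step), and the names all_pairs_distance and out are made up for illustration.

```c
#include <stdlib.h>

/* Hypothetical placeholder for the real calculate_distance: the actual
   "sequential instructions" are not shown, so this just sums the absolute
   element-wise differences over n elements. */
static int calculate_distance(const int *a, const int *b, int n)
{
    int d = 0;
    for (int k = 0; k <= n - 1; k++) {
        d += abs(a[k] - b[k]);
    }
    return d;
}

/* Sequential reference for the outer loops: visits every pair (i, j)
   with 0 <= i < j <= m - 1.
   Assumption: A and B each hold m rows of n ints, stored row-major.
   out must have room for m * (m - 1) / 2 results.
   Returns the number of pairs processed. */
static int all_pairs_distance(const int *A, const int *B,
                              int m, int n, int *out)
{
    int p = 0;
    for (int i = 0; i <= m - 2; i++) {
        for (int j = i + 1; j <= m - 1; j++) {
            out[p++] = calculate_distance(&A[i * n], &B[j * n], n);
        }
    }
    return p;
}
```

In this sketch each (i, j) pair writes to its own slot in out and reads nothing written by another pair, so the outer iterations are independent of each other. Whether that also holds for my real code depends on the "more sequential instructions" that are omitted here.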
Could this whole fragment of code be parallelized? If so, how? (This is the reason for the title, "kernel inside kernel".)
Note: I'm using Intel(R) SDK for OpenCL - Offline Compiler 2012 (Windows) to check my kernels.
Thanks in advance.