I am trying to understand thread divergence. I have a few different questions.
- About thread divergence: is there any performance benefit to disabling a thread when it doesn't need to do the computation? For example:
__global__ void kernel_1()
{
    int i = f();
    // We know that if this condition is false, i is already less than g()
    if (threadIdx.x < 5)
    {
        i = min(g(), i);
    }
}
__global__ void kernel_2()
{
    int i = f();
    i = min(g(), i);
}
Which kernel is better?
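For context, here is a minimal timing sketch one could use to compare the two kernels. It assumes kernel_1 and kernel_2 are defined as above in the same file, that f() and g() exist as __device__ functions, and that the kernels actually store their result somewhere so the compiler does not optimize the work away:

#include <cstdio>
#include <cuda_runtime.h>

int main()
{
    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    // Time kernel_1: 32 threads per block = one warp, so the branch on
    // threadIdx.x < 5 splits lanes within a single warp.
    cudaEventRecord(start);
    kernel_1<<<1024, 32>>>();
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);
    float ms1 = 0.0f;
    cudaEventElapsedTime(&ms1, start, stop);

    // Time kernel_2 with the same launch configuration.
    cudaEventRecord(start);
    kernel_2<<<1024, 32>>>();
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);
    float ms2 = 0.0f;
    cudaEventElapsedTime(&ms2, start, stop);

    printf("kernel_1: %.3f ms, kernel_2: %.3f ms\n", ms1, ms2);

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    return 0;
}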
- Does CUDA define "thread divergence" only in terms of the source code path? For example:
__global__ void kernel_3()
{
    if (threadIdx.x < 5)
    {
        int i = g();
        printf("hello\n");
    }
    else
    {
        int i = g();
        printf("hello\n");
    }
}
In this code, both branches contain exactly the same code. So does the warp diverge or not?
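In case it helps, here is a sketch of how one could check this empirically by printing the active lane mask inside each branch. It assumes CUDA 9 or newer for __activemask(), and the g() call is dropped so the snippet compiles on its own:

#include <cstdio>

__global__ void kernel_3_masked()
{
    if (threadIdx.x < 5)
    {
        // Lanes of this warp currently executing the if-branch.
        unsigned mask = __activemask();
        if (threadIdx.x == 0)
            printf("if-branch   mask: 0x%08x\n", mask);
    }
    else
    {
        // Lanes of this warp currently executing the else-branch.
        unsigned mask = __activemask();
        if (threadIdx.x == 5)
            printf("else-branch mask: 0x%08x\n", mask);
    }
}

int main()
{
    kernel_3_masked<<<1, 32>>>();   // launch a single warp
    cudaDeviceSynchronize();
    return 0;
}

If the warp really splits, the two prints should show complementary masks (0x0000001f and 0xffffffe0); seeing 0xffffffff in both would suggest the branches were executed by the full warp.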