I'm developing with CUDA and have an arithmetic problem that I could implement with or without warp divergence. With warp divergence it would look like:
float v1;
float v2;
//calculate values of v1 and v2
if (v2 != 0)
    v1 += v2 * complicated_math();
//store v1
Without warp divergence the version looks like:
float v1;
float v2;
//calculate values of v1 and v2
v1 += v2 * complicated_math();
//store v1
The question is: which version is faster?
In other words, how expensive is a divergent branch (with part of the warp disabled) compared to some extra calculation and the addition of 0?
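For reference, here is a minimal sketch of how the two variants could be timed against each other. Everything beyond the two snippets above is an assumption made for illustration: the body of `complicated_math()` is a hypothetical stand-in, the kernel names `with_branch` and `without_branch`, the helper `time_kernel`, the array size, and the block size are all made up, and every other element of `v2` is set to zero so that each warp actually diverges.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical stand-in for complicated_math(); the real function is not shown.
__device__ float complicated_math(float x) {
    return sinf(x) * expf(-x * x) + sqrtf(fabsf(x) + 1.0f);
}

// Branching version: threads whose v2 is zero skip the expensive call.
__global__ void with_branch(float* v1, const float* v2, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    float a = v1[i];
    float b = v2[i];
    if (b != 0.0f)
        a += b * complicated_math(a);
    v1[i] = a;
}

// Branch-free version: every thread does the math, zeros just add 0.
__global__ void without_branch(float* v1, const float* v2, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    float a = v1[i];
    float b = v2[i];
    a += b * complicated_math(a);
    v1[i] = a;
}

typedef void (*Kernel)(float*, const float*, int);

// Runs one warm-up launch, then times a second launch with CUDA events (ms).
float time_kernel(Kernel kernel, float* v1, const float* v2, int n) {
    const int threads = 256;                       // assumed block size
    const int blocks = (n + threads - 1) / threads;
    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);
    kernel<<<blocks, threads>>>(v1, v2, n);        // warm-up
    cudaEventRecord(start);
    kernel<<<blocks, threads>>>(v1, v2, n);        // timed launch
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);
    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);
    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    return ms;
}

int main() {
    const int n = 1 << 22;                         // assumed problem size
    std::vector<float> h_v1(n, 1.0f), h_v2(n);
    // Alternate zero / non-zero so neighbouring threads in a warp disagree.
    for (int i = 0; i < n; ++i) h_v2[i] = (i % 2 == 0) ? 0.0f : 1.0f;

    float *d_v1, *d_v2;
    cudaMalloc(&d_v1, n * sizeof(float));
    cudaMalloc(&d_v2, n * sizeof(float));
    cudaMemcpy(d_v2, h_v2.data(), n * sizeof(float), cudaMemcpyHostToDevice);

    cudaMemcpy(d_v1, h_v1.data(), n * sizeof(float), cudaMemcpyHostToDevice);
    float t_branch = time_kernel(with_branch, d_v1, d_v2, n);

    cudaMemcpy(d_v1, h_v1.data(), n * sizeof(float), cudaMemcpyHostToDevice);
    float t_flat = time_kernel(without_branch, d_v1, d_v2, n);

    printf("with branch:    %.3f ms\n", t_branch);
    printf("without branch: %.3f ms\n", t_flat);

    cudaFree(d_v1);
    cudaFree(d_v2);
    return 0;
}
```

This should build with something like `nvcc -O3 bench.cu -o bench`. The zero pattern in `v2` matters: if all threads of a warp agree on `v2 != 0`, the branch does not diverge at all, so it is worth trying a few patterns. Note also that the two versions are only equivalent if `complicated_math()` never returns infinity or NaN, since `0 * inf` is NaN in floating point.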