I have started optimising my code using SSE. Essentially, it is a ray tracer that processes 4 rays at a time by storing the coordinates in __m128 variables x, y and z (the coordinates of the four rays are grouped by axis). However, I have a branch that protects against division by zero, and I can't work out how to convert it to SSE. In the serial version it is:
const float d = (wZ == -1.0f) ? 1.0f / (1.0f - wZ) : 1.0f / (1.0f + wZ);
Where wZ is the z-coordinate and this calculation needs to be done for all four rays.
How could I translate this into SSE?
I have been experimenting with the SSE equality comparison as follows (wZ is now a __m128 containing the z values of the four rays):
_mm_cmpeq_ps(_mm_set1_ps(-1.0f) , wZ )
The idea is to use the resulting mask to identify the lanes where wZ[x] == -1.0, take the absolute value of wZ in those lanes, and then continue the calculation as normal.
However, I have not had much success with this.
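To make the intent concrete, here is a minimal sketch of how the comparison mask might be used to choose the denominator per lane, so that no lane ever divides by zero. The helper name recip_one_plus_wz is hypothetical (it is not part of my ray tracer), and the sketch assumes only SSE1 intrinsics from xmmintrin.h. Instead of taking an absolute value, it computes both 1 - wZ and 1 + wZ and combines them with and/andnot/or, which mirrors the serial ternary directly.

```c
#include <xmmintrin.h>  /* SSE1 intrinsics: __m128, _mm_cmpeq_ps, ... */

/* Hypothetical helper: computes d for four rays at once.
 * Lanes where wZ == -1.0f use 1/(1 - wZ); all other lanes use 1/(1 + wZ). */
static inline __m128 recip_one_plus_wz(__m128 wZ)
{
    const __m128 one = _mm_set1_ps(1.0f);

    /* Each lane is all-ones bits where wZ == -1.0f, all-zero bits elsewhere. */
    const __m128 mask = _mm_cmpeq_ps(wZ, _mm_set1_ps(-1.0f));

    const __m128 denom_eq  = _mm_sub_ps(one, wZ);  /* 1 - wZ, used when wZ == -1 */
    const __m128 denom_neq = _mm_add_ps(one, wZ);  /* 1 + wZ, used otherwise     */

    /* Branchless select: (mask & denom_eq) | (~mask & denom_neq).
     * _mm_andnot_ps(a, b) computes (~a) & b. */
    const __m128 denom = _mm_or_ps(_mm_and_ps(mask, denom_eq),
                                   _mm_andnot_ps(mask, denom_neq));

    /* The selected denominator is nonzero in every lane, so a single
     * division is enough. */
    return _mm_div_ps(one, denom);
}
```

Selecting the denominator before dividing means only one _mm_div_ps is issued. If SSE4.1 is available, the and/andnot/or triple could presumably be replaced by _mm_blendv_ps(denom_neq, denom_eq, mask), but I have not assumed that here.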