I will ask my question by giving an example. I have a function called do_something(). It has three versions: do_something(), do_something_sse3(), and do_something_sse4(). When my program runs, it detects the CPU features (whether SSE3 or SSE4 is supported) and calls one of the three versions accordingly.
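To make the setup concrete, here is a minimal sketch of the kind of runtime dispatch described above. It assumes GCC (or Clang) on x86 and uses the compiler built-ins __builtin_cpu_init() and __builtin_cpu_supports() for detection. The int return type, the placeholder bodies, and the helper names select_do_something() and do_something_dispatch() are hypothetical and only there for illustration; the real functions would contain the intrinsics.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the three versions. Each returns a tag
   (0, 3 or 4) so the selected path is visible; the real bodies would
   use plain C, SSE3 intrinsics and SSE4 intrinsics respectively. */
static int do_something(void)      { return 0; }
static int do_something_sse3(void) { return 3; }
static int do_something_sse4(void) { return 4; }

typedef int (*do_something_fn)(void);

/* Pick the best version the running CPU supports.
   Assumption: detection via GCC/Clang x86 built-ins; on any other
   compiler or architecture this falls back to the generic version. */
static do_something_fn select_do_something(void)
{
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    if (__builtin_cpu_supports("sse4.1"))
        return do_something_sse4;
    if (__builtin_cpu_supports("sse3"))
        return do_something_sse3;
#endif
    return do_something;
}

/* Convenience wrapper: detect, then call the selected version. */
static int do_something_dispatch(void)
{
    return select_do_something()();
}
```

Calling do_something_dispatch() runs whichever version matches the CPU. In a real program the function pointer would probably be selected once at startup and cached rather than re-detected on every call.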
The problem is: when I build my program with GCC, I have to set -msse4 for do_something_sse4() to compile (e.g. for the header file <smmintrin.h> to be included).
However, if I set -msse4, then GCC is allowed to use SSE4 instructions, and some intrinsics in do_something_sse3() are also translated to SSE4 instructions. So if my program runs on a CPU that supports only SSE3 (but not SSE4), it fails with "illegal instruction" when it calls do_something_sse3().
Maybe I am following some bad practice. Could you give me some suggestions? Thanks.