
I'm a beginner working with AVX2, and I would like to use an intrinsic that provides the same functionality as _mm_min_round_ss in AVX-512. Is there a similar intrinsic?

Peter Cordes
velac

1 Answer


Per-instruction rounding-mode override and FP-exception suppression (SAE) are unique to AVX-512. They are exposed as the ..._round_... versions of the scalar and 512-bit intrinsics. Packed 128-bit and 256-bit vector instructions don't have room to encode the SAE stuff in the EVEX prefix, because they need some of those bits to signal the narrower vector length.

Does the rounding mode ever make a difference for vminps? I think not, since it's a compare that selects one of its inputs, not an operation that rounds a new result. Suppressing exceptions could matter if you're going to check fenv later to see whether anything set the denormal or invalid flags. The Intrinsics Guide only mentions _MM_FROUND_NO_EXC as relevant, not overrides to floor/ceil/trunc rounding.


If you don't need exception suppression, just use the normal scalar or packed ..._min_ps / ss intrinsic, e.g. _mm256_min_ps (8 floats in a __m256 vector) or _mm_min_ss (scalar, just the low element of a __m128 vector, leaving others unmodified).
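As a minimal sketch of the scalar case, assuming an x86 target and GCC or Clang: the helper names min_low_lane, min_float and min_ss_demo are made up for illustration and are not part of any library. _mm_min_ss itself only needs SSE, so it is available on any AVX2 machine.

```c
#include <assert.h>
#include <immintrin.h>

/* Hypothetical helper: min of the low float of a and b.
   Lanes 1..3 of the result are copied from a. */
static inline __m128 min_low_lane(__m128 a, __m128 b)
{
    return _mm_min_ss(a, b);
}

/* Hypothetical helper: min of two plain floats using the same intrinsic. */
static inline float min_float(float x, float y)
{
    return _mm_cvtss_f32(_mm_min_ss(_mm_set_ss(x), _mm_set_ss(y)));
}

/* Small self-check showing which lanes are affected. */
static void min_ss_demo(void)
{
    __m128 a = _mm_setr_ps(3.0f, 10.0f, 20.0f, 30.0f);
    __m128 b = _mm_setr_ps(1.0f, -1.0f, -2.0f, -3.0f);
    float out[4];

    _mm_storeu_ps(out, min_low_lane(a, b));
    assert(out[0] == 1.0f);   /* low lane: min(3, 1) */
    assert(out[1] == 10.0f);  /* upper lanes pass through from a */
    assert(out[2] == 20.0f);
    assert(out[3] == 30.0f);
}
```

Calling min_ss_demo() from main exercises both paths. The packed _mm256_min_ps works the same way per element, but needs AVX enabled at compile time (for example -mavx2 with GCC or Clang).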

See "What is the instruction that gives branchless FP min and max on x86?" for details on the exact FP semantics (not symmetric with respect to NaN), and the fact that until quite recently, GCC treated the intrinsic as commutative even though the instruction isn't. (Other compilers, and current GCC, only do that with -ffast-math.)
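The NaN asymmetry can be sketched as follows, assuming a compiler that does not treat the intrinsic as commutative (so not an older GCC, and no -ffast-math). The instruction behaves like a < b ? a : b, so when either input is NaN the second operand is returned. The names minss_float and minss_nan_demo are hypothetical.

```c
#include <assert.h>
#include <immintrin.h>
#include <math.h>

/* Hypothetical helper: scalar min with minss semantics, (x < y) ? x : y. */
static inline float minss_float(float x, float y)
{
    return _mm_cvtss_f32(_mm_min_ss(_mm_set_ss(x), _mm_set_ss(y)));
}

static void minss_nan_demo(void)
{
    /* volatile keeps the compiler from folding these at compile time */
    volatile float one = 1.0f;
    volatile float nan_val = NAN;

    /* NaN as first operand: the compare is false, so the second operand wins. */
    assert(minss_float(nan_val, one) == 1.0f);

    /* NaN as second operand: the compare is still false, so NaN is returned. */
    assert(isnan(minss_float(one, nan_val)));
}
```

If NaN inputs are possible, operand order therefore decides whether NaN propagates, which is why a compiler swapping the operands changes results.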

Peter Cordes