Conversion of four packed single-precision floating-point values to unsigned doublewords in x86 SSE

Question

Is there a way to convert four packed single-precision floating-point values to four doublewords in x86 with the SSE extensions? The closest instruction I found is CVTPS2PI, but it cannot operate on two XMM registers; its form is CVTPS2PI MM, XMM/M64. What if I want something like <conversion_mnemonic> XMM, XMM/M128?

Thanks. Iman.

Thanks @fuz. cvtps2dq will apparently do the job, but it converts to four packed signed doublewords. Are you aware of a similar instruction for unsigned doublewords? — Iman Abdollahzadeh, Oct 29 '20 at 14:39
If you want unsigned integers, you should specify that in your question. Also, what rounding and overflow behavior do you want? — chtz, Oct 29 '20 at 16:10
Thanks @chtz. Rounding: round to the nearest integer. Overflow: actually I'm not working with large floating points and they are all positive, so doesn't matter. — Iman Abdollahzadeh, Oct 29 '20 at 16:38

score 1 · Answer 1 · answered Oct 29 '20 at 19:07

x86 doesn't have native support for FP<->unsigned until AVX512, with vcvtps2udq (https://www.felixcloutier.com/x86/vcvtps2udq). For scalar you normally just convert to 64-bit signed (cvtss2si rax, xmm0) and take the low 32 bits of that (in EAX), but that's not an option with SIMD.
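As a minimal sketch of that scalar trick in C with intrinsics (assuming an x86-64 target, since the 64-bit form of cvtss2si isn't available in 32-bit mode; the helper name `float_to_u32_scalar` is made up for illustration):

```c
#include <stdint.h>
#include <xmmintrin.h>  /* SSE: _mm_set_ss, _mm_cvtss_si64 (x86-64 only) */

/* Hypothetical helper: convert one float to uint32_t through a 64-bit
   signed conversion. Intended for inputs in the range 0 <= x < 2^32. */
static uint32_t float_to_u32_scalar(float x)
{
    /* cvtss2si r64, xmm: rounds using the current MXCSR rounding mode
       (round-to-nearest-even unless it has been changed). */
    int64_t wide = (int64_t)_mm_cvtss_si64(_mm_set_ss(x));

    /* Keep only the low 32 bits, i.e. what would be in EAX. */
    return (uint32_t)wide;
}
```

Every value a uint32_t can hold fits comfortably in the signed 64-bit result, which is why the wider signed conversion is enough here.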

Without AVX-512, ideally you can use a signed conversion (cvtps2dq) and get the same result, i.e. if your floats are non-negative and below 2³¹. (INT_MAX itself, 2147483647, isn't exactly representable as a float; the largest float below 2³¹ is 2147483520.0.)
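For the asker's case (small, non-negative inputs, round to nearest), a sketch with SSE2 intrinsics might look like this. The function name `floats_to_u32_small` and the array-based interface are assumptions for illustration, not anything from the question:

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Hypothetical helper: convert four floats to four uint32_t using the
   signed conversion. Only meaningful when every input is non-negative
   and below 2^31. */
static void floats_to_u32_small(const float in[4], uint32_t out[4])
{
    __m128  v = _mm_loadu_ps(in);         /* unaligned load of 4 floats */
    __m128i i = _mm_cvtps_epi32(v);       /* cvtps2dq: current rounding mode */
    _mm_storeu_si128((__m128i *)out, i);  /* same bits, viewed as unsigned */
}
```

With the default rounding mode, ties go to the nearest even integer, so 1.5 and 2.5 both convert to 2.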

See How to efficiently perform double/int64 conversions with SSE/AVX? for a related double->uint64_t conversion. The full-range one should be adaptable from double->uint64_t to float->uint32_t if you need it.

Another possibility (for 32-bit float->uint32_t) is just range-shifting to signed in FP, then flipping back in integer: INT32_MIN ^ convert(x + INT32_MIN). But that introduces FP rounding for small integers because INT32_MIN is outside the -2²⁴ .. 2²⁴ range where a float can represent every integer. e.g. 5 would be rounded to the nearest multiple of 2⁷ (floats just below 2³¹ in magnitude are spaced 128 apart), so it would come out as 0. So that's not usable on its own; you'd need to try straight conversion and range-shifted conversion, and only use the range-shifted conversion if straight conversion gave you 0x80000000. (Perhaps using the straight conversion result as a blend control for SSE4 blendvps?)
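A rough sketch of that "try both, then select" idea follows. It is an untuned illustration, not a benchmarked implementation: the name `floats_to_u32_two_pass` is hypothetical, and it uses an SSE2 and/andnot/or sequence in place of blendvps so it doesn't require SSE4.1.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Hypothetical helper: full-range float -> uint32_t for 0 <= x < 2^32. */
static void floats_to_u32_two_pass(const float in[4], uint32_t out[4])
{
    const __m128  two31   = _mm_set1_ps(2147483648.0f);  /* 2^31 */
    const __m128i signbit = _mm_set1_epi32(INT32_MIN);   /* 0x80000000 */

    __m128 v = _mm_loadu_ps(in);

    /* Straight signed conversion: correct for 0 <= x < 2^31. */
    __m128i straight = _mm_cvtps_epi32(v);

    /* Range-shifted conversion: x - 2^31 is exact for 2^31 <= x < 2^32,
       and the XOR adds 2^31 back in the integer domain. */
    __m128i shifted = _mm_xor_si128(
        _mm_cvtps_epi32(_mm_sub_ps(v, two31)), signbit);

    /* Lanes where the straight conversion produced 0x80000000. */
    __m128i overflowed = _mm_cmpeq_epi32(straight, signbit);

    /* Select shifted where overflowed, straight elsewhere. */
    __m128i result = _mm_or_si128(_mm_and_si128(overflowed, shifted),
                                  _mm_andnot_si128(overflowed, straight));

    _mm_storeu_si128((__m128i *)out, result);
}
```

Inputs that are negative, NaN, or 2³² and above are outside what this sketch tries to handle; those lanes don't produce a meaningful result.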

For packed conversion of float->int32_t, there is SSE2 cvtps2dq xmm, xmm/m128. (cvttps2dq converts with truncation toward 0 instead of using the current rounding mode, which is round-to-nearest unless you've changed it.)

Any negative float less than -0.5 will convert to integer -1 or lower; as a uint32_t, that bit-pattern represents a huge number. Floats outside the -2³¹..2³¹-1 range get converted to 0x80000000, Intel's "integer indefinite" value.
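To see these behaviors side by side, here is a small sketch with the two intrinsics that map to cvtps2dq and cvttps2dq. The wrapper names `cvt_round` and `cvt_trunc` are invented for this example, and the comments assume the default MXCSR rounding mode:

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Hypothetical wrapper around cvtps2dq (uses the current rounding mode). */
static void cvt_round(const float in[4], int32_t out[4])
{
    _mm_storeu_si128((__m128i *)out, _mm_cvtps_epi32(_mm_loadu_ps(in)));
}

/* Hypothetical wrapper around cvttps2dq (always truncates toward zero). */
static void cvt_trunc(const float in[4], int32_t out[4])
{
    _mm_storeu_si128((__m128i *)out, _mm_cvttps_epi32(_mm_loadu_ps(in)));
}
```

For the inputs 2.7, -0.7, 3.0e9 and -3.0e9, `cvt_round` gives 3, -1, INT32_MIN, INT32_MIN, while `cvt_trunc` gives 2, 0, INT32_MIN, INT32_MIN. Reinterpreting the -1 as uint32_t yields 0xFFFFFFFF.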

If you didn't find that, and only found cvtps2pi (signed conversion into an MMX register), you need better places to search:

https://stackoverflow.com/tags/sse/info - links
https://www.felixcloutier.com/x86/ x86 instruction-set list.
https://www.officedaytime.com/simd512e/simd.html - lists of instructions by category / function
https://software.intel.com/sites/landingpage/IntrinsicsGuide/ - asm instruction mnemonics are listed for intrinsics that only expose the functionality of a single instruction. And normally you're better off writing C with intrinsics than asm by hand, especially if you don't already know about relatively common / simple instructions like cvtps2dq and cvttps2dq.
https://agner.org/optimize/ - his asm optimization guide has a chapter on SIMD with a handy table of different kinds of data-movement instructions.
How can I convert an XMM register of single-precision floats to integers? - a pointer in the right direction, but covering only signed conversion. I didn't find an exact duplicate.

You could subtract `2**32` from the float, if the input is larger than `2**31` (one comp+and+sub in addition to the `cvtps2dq`) — chtz, Oct 29 '20 at 19:37
@chtz: What's the advantage to subtracting `2**32` for large floats, instead of `2**31` like I suggested with `INT32_MIN ^ convert(x + INT32_MIN)`? The top half of the uint32 range is `2**31 .. 2**32 - 1`, so subtracting 2^31 never increases the magnitude of any inputs > 2^31, thus doesn't do more rounding. — Peter Cordes, Oct 29 '20 at 19:39
You won't have any rounding problems (since values `>=2**31` will be mapped to values in `[-2**31, 0]`), so you need only one conversion. And subtracting `2**32` automatically gives the correct wrap-around behavior, as if you had done an unsigned conversion (unless the input was larger than `UINT32_MAX`). — chtz, Oct 29 '20 at 19:44
@chtz: Oh I see, integer add of `2**32` is a no-op. Yeah that's a neat idea. [edit] my answer if you like, or post your own. My answer is mainly focused on the point that signed conversion can be used when your numbers are non-negative and not huge; a separate answer might be best for actual working full-range conversion methods. — Peter Cordes, Oct 29 '20 at 19:50
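
The idea from these comments (one compare, one AND, one subtract, then a single cvtps2dq) could be sketched with SSE2 intrinsics roughly as follows. This is one reading of the comment thread, not code posted by either commenter, and the name `floats_to_u32_wrap` is hypothetical:

```c
#include <emmintrin.h>  /* SSE2 intrinsics (also pulls in the SSE ones) */
#include <stdint.h>

/* Hypothetical helper: float -> uint32_t for 0 <= x < 2^32 with a single
   conversion, by subtracting 2^32 from lanes that are >= 2^31. */
static void floats_to_u32_wrap(const float in[4], uint32_t out[4])
{
    const __m128 two31 = _mm_set1_ps(2147483648.0f);  /* 2^31 */
    const __m128 two32 = _mm_set1_ps(4294967296.0f);  /* 2^32 */

    __m128 v = _mm_loadu_ps(in);

    /* All-ones in lanes where v >= 2^31, all-zeros elsewhere. */
    __m128 is_big = _mm_cmpge_ps(v, two31);

    /* 2^32 in the big lanes, +0.0 in the others. */
    __m128 adjust = _mm_and_ps(is_big, two32);

    /* Big lanes land in [-2^31, -256] exactly; small lanes are unchanged.
       The signed result has the same bit pattern as the unsigned value,
       because subtracting 2^32 is a no-op modulo 2^32. */
    __m128i r = _mm_cvtps_epi32(_mm_sub_ps(v, adjust));

    _mm_storeu_si128((__m128i *)out, r);
}
```

Small inputs never pass through a large-magnitude intermediate, so they keep full precision.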
