
I've built a custom version of frexp:

```cpp
#include <cstdint>
#include <cstring>
#include <limits>
#include <utility>

auto frexp(float f) noexcept
{
    static_assert(std::numeric_limits<float>::is_iec559);

    constexpr uint32_t ExpMask = 0xff;
    constexpr int32_t ExpOffset = 126;
    constexpr int MantBits = 23;

    uint32_t u;
    std::memcpy(&u, &f, sizeof(float)); // well-defined bit copy from float to int

    // extract the 8 exponent bits and remove the bias (126 rather than 127, matching std::frexp's [0.5, 1) convention)
    int exp = static_cast<int>((u >> MantBits) & ExpMask) - ExpOffset;

    // divide by 2^exp (leaving the mantissa intact while placing "0" into the exponent)
    u &= ~(ExpMask << MantBits); // zero out the exponent bits
    u |= ExpOffset << MantBits;  // place 126 into the exponent bits (representing 0)

    std::memcpy(&f, &u, sizeof(float)); // copy back to f
    return std::make_pair(exp, f);
}
```
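
For reference, with a C++20 compiler the same conversion could go through std::bit_cast instead of memcpy (just a sketch under that assumption; I'm not relying on C++20 here). It sidesteps the memcpy, but the question about the bit layout is the same:

```cpp
#include <bit>
#include <cstdint>
#include <utility>

// Sketch only: requires C++20 (std::bit_cast).
auto frexp_bitcast(float f) noexcept
{
    constexpr uint32_t ExpMask = 0xff;
    constexpr int32_t ExpOffset = 126;
    constexpr int MantBits = 23;

    auto u = std::bit_cast<uint32_t>(f);                                // copies the object representation, no memcpy
    int exp = static_cast<int>((u >> MantBits) & ExpMask) - ExpOffset;  // unbias the exponent

    u &= ~(ExpMask << MantBits); // zero out the exponent bits
    u |= ExpOffset << MantBits;  // place 126 into the exponent bits

    return std::make_pair(exp, std::bit_cast<float>(u));
}
```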

By checking is_iec559 I'm making sure that float fulfills the requirements of the IEC 559 (IEEE 754) standard.

My question is: Does this mean that the bit operations I'm doing are well defined and do what I want? If not, is there a way to fix it?

I tested it for some random values and it seems to be correct, at least on Windows 10 compiled with MSVC and on Wandbox. Note, however, that I'm deliberately not handling the edge cases of subnormals, NaN, and infinity.
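
The spot check I ran was roughly along these lines (a sketch with arbitrary value ranges and counts, not my exact test code):

```cpp
#include <cassert>
#include <cmath>
#include <random>

void spot_check()
{
    std::mt19937 gen{42};
    std::uniform_real_distribution<float> dist(-1e6f, 1e6f);

    for (int i = 0; i < 1000000; ++i)
    {
        float x = dist(gen);
        if (x == 0.0f)
            continue; // zero is one of the edge cases I'm not handling

        auto [e, m] = frexp(x); // the custom version above
        int eStd;
        float mStd = std::frexp(x, &eStd);

        assert(e == eStd && m == mStd);
    }
}
```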

If anyone wonders why I'm doing this: in benchmarks I found that this version of frexp is up to 15 times faster than std::frexp on Windows 10. I haven't tested other platforms yet. But I want to make sure that this doesn't just work by coincidence and won't break in the future.
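
For context, the measurement is essentially a tight loop over precomputed inputs, something like this (a sketch, not the exact benchmark harness):

```cpp
#include <chrono>
#include <cmath>
#include <cstdio>
#include <vector>

void benchmark(const std::vector<float>& inputs)
{
    using clock = std::chrono::steady_clock;
    float sink = 0.0f; // accumulate results so the optimizer can't drop the loops

    auto t0 = clock::now();
    for (float x : inputs)
        sink += frexp(x).second; // custom version

    auto t1 = clock::now();
    for (float x : inputs)
    {
        int e;
        sink += std::frexp(x, &e); // standard version
    }
    auto t2 = clock::now();

    using us = std::chrono::microseconds;
    std::printf("custom: %lld us, std: %lld us (sink=%f)\n",
                static_cast<long long>(std::chrono::duration_cast<us>(t1 - t0).count()),
                static_cast<long long>(std::chrono::duration_cast<us>(t2 - t1).count()),
                static_cast<double>(sink));
}
```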

Edit:

As mentioned in the comments, endianness could be an issue. Does anybody know?

sebrockm
  • [Tangent] When you measured performance, did you do so in a release (optimized) build? – NathanOliver Oct 09 '19 at 16:00
  • @NathanOliver sure :) – sebrockm Oct 09 '19 at 16:01
  • I don't think anything in IEEE-754 would forbid an implementation from storing `uint32_t` little-endian but `float` big-endian, nor do I think anything would forbid a contrived implementation from adding padding to `float` which could cause the code to overwrite storage following `u`. On the other hand, the fact that the Standard doesn't forbid implementations from breaking code doesn't mean that the code shouldn't be reliable on quality general-purpose implementations for all remotely-commonplace platforms. – supercat Oct 09 '19 at 17:08
  • What about denormal? – aka.nice Oct 09 '19 at 21:14
  • `0xff << 23`, `126 << 23` are UB with 16-bit `int/unsigned`. Code fails to perform as desired for sub-normals, NaN, infinity. – chux - Reinstate Monica Oct 09 '19 at 22:48
  • I'm voting to close this question as off-topic because it is [seeking peer review of your code](https://codereview.stackexchange.com/help/on-topic) instead of asking [a specific programming problem](https://stackoverflow.com/help/on-topic). However, this question should be a good fit for [codereview.se]. – L. F. Oct 11 '19 at 10:18
  • @chux you are absolutely right! I adjusted the code accordingly. Also I failed to mention that I'm not handling subnormals on purpose. I fixed that in the question. – sebrockm Oct 11 '19 at 10:55
  • @L.F. I agree, the wording of the title suggests that I'm looking for a code review. However, I'm actually only interested whether the bit pattern is well defined by the standard. I rephrased the title accordingly. – sebrockm Oct 11 '19 at 10:58

1 Answer


"Does this mean that the bit operations I'm doing are well defined..."

The TL;DR, by the strict definition of "well defined": no.

Your assumptions are likely correct, but they are not guaranteed, because the standard makes no guarantees about the bit width or the representation of float. From § 3.9.1:

there are three floating point types: float, double, and long double. The type double provides at least as much precision as float, and the type long double provides at least as much precision as double. The set of values of the type float is a subset of the set of values of the type double; the set of values of the type double is a subset of the set of values of the type long double. The value representation of floating-point types is implementation-defined.

The is_iec559 clause only qualifies with:

True if and only if the type adheres to IEC 559 standard

If a literal genie wrote you a terrible compiler, and made float = binary16, double = binary32, and long double = binary64, and made is_iec559 true for all the types, it would still adhere to the standard.
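
If you want to rule such a genie out, you can pin the representation down further with compile-time checks beyond is_iec559 (a sketch of the kind of asserts I mean):

```cpp
#include <cstdint>
#include <limits>

// Together these reject a conforming-but-surprising float (e.g. a 16- or 64-bit one)
// before any bit fiddling happens.
static_assert(std::numeric_limits<float>::is_iec559, "float is not IEC 559");
static_assert(sizeof(float) == sizeof(std::uint32_t), "float is not 32 bits wide");
static_assert(std::numeric_limits<float>::digits == 24, "float does not have a binary32 mantissa");
static_assert(std::numeric_limits<float>::max_exponent == 128, "float does not have a binary32 exponent range");
```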

does that mean that I can extract exponent and mantissa in a well defined way?

The TL;DR, by the limited guarantees of the C++ standard: no.

Assume you use float32_t, is_iec559 is true, and you logically deduce from all the rules that it could only be binary32 with no trap representations; assume further that you correctly argue that memcpy is well defined for conversion between arithmetic types of the same width and won't break strict aliasing. Even with all those assumptions, the copy itself is well defined, but it is only likely, not guaranteed, that you can extract the mantissa this way.

The IEEE 754 standard and two's complement are specified in terms of bit string encodings, while the behavior of memcpy is described in terms of bytes. While it's plausible to assume the bit strings of uint32_t and float32_t are laid out the same way (e.g. same endianness), there is no guarantee in the standard for that. If the bit strings are stored differently and you shift and mask the copied integer representation to get the mantissa, the answer will be incorrect, despite the memcpy itself being well defined.

As mentioned in the comments, endianness could be an issue. Does anybody know?

At least [a few architectures](https://en.wikipedia.org/wiki/Endianness#Floating-point_and_endianness) have used different endianness for floating-point registers and integer registers. The same article says that, except for small embedded processors, this isn't a concern. I trust Wikipedia entirely for all topics and refuse to do any further research.
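
If you wanted to defend against a mismatched layout in practice, a small sanity check that a known float produces the expected integer pattern would catch it (a sketch; it detects a mismatch, it doesn't prove conformance):

```cpp
#include <cstdint>
#include <cstring>

// True if the bit pattern of 1.0f, viewed through uint32_t, matches the
// binary32 encoding the shift/mask code assumes (sign 0, exponent 127, mantissa 0).
bool float_layout_matches() noexcept
{
    float one = 1.0f;
    std::uint32_t u = 0;
    std::memcpy(&u, &one, sizeof(float));
    return u == 0x3f800000u;
}
```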

parktomatomi
  • IMO this is excessively pessimistic. Also, `memcpy` never violates strict aliasing in C++ (that's a C thing). – M.M Dec 16 '19 at 22:00
  • @M.M Oh yeah, totally. But OP asked if it was _well defined_ and stuck a `language-lawyer` tag on it. – parktomatomi Dec 16 '19 at 22:02
  • @M.M regarding strict aliasing, it's not a guarantee, because at least a few architectures use different endianness for float and int registers: https://en.wikipedia.org/wiki/Endianness#Floating-point_and_endianness . As the link states, it's unreasonably pessimistic, but again, `language-lawyer`. – parktomatomi Dec 16 '19 at 22:30
  • That's still not a strict aliasing problem, it just means you get different bits than you expected. – M.M Dec 16 '19 at 22:31
  • @M.M Exactly! I know it's unreasonably pessimistic and pedantic (even the link says so), but to get the mantissa of a binary32 you shift/mask the bits of the integer representation. If the bits are different, you get the wrong answer. – parktomatomi Dec 16 '19 at 22:35
  • @M.M After a second read, I think you have a good point about behavior vs "the problem". I changed my answer to be a little more nuanced about it. – parktomatomi Dec 17 '19 at 00:08