
I write code on Linux (RHEL, 64-bit) and use C++98.

I have an array of floating-point values, and I want to 'mark' some values as 'invalid'. One possible solution is to use a separate bit array that tells whether the corresponding value is valid.

I was wondering whether we can use a special double value instead. The link Why does IEEE 754 reserve so many NaN values? says that there are a lot of NaN encodings. Can I use one of those reserved values for my problem?

  • I only need one bit in the payload to indicate whether a double value is 'valid' by my definition. The input double values can contain NaN, but I assume they do not use any payload bits.
  • The double values will be saved to a file in binary mode (writing the double values bit for bit).
  • The code also reads the data back from the file in binary mode.
  • For each double value read from the file, we first check whether it is valid by testing the payload bit before doing any other calculation (see the sketch after this list).
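For reference, this is the kind of binary round trip I have in mind (a minimal sketch; the file name `data.bin` and the lack of error handling are placeholders):

```cpp
#include <fstream>
#include <cstddef>

// Write the raw bytes of the double array to a file (C++98).
void write_doubles(const double* values, std::size_t count)
{
    std::ofstream out("data.bin", std::ios::out | std::ios::binary);
    out.write(reinterpret_cast<const char*>(values), count * sizeof(double));
}

// Read the raw bytes back; the bit patterns (including any NaN payloads)
// are preserved exactly.
void read_doubles(double* values, std::size_t count)
{
    std::ifstream in("data.bin", std::ios::in | std::ios::binary);
    in.read(reinterpret_cast<char*>(values), count * sizeof(double));
}
```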
Joe C
    Note that NaN "payloads" are not necessarily supported on every platform that supports NaNs. IEEE-754 allows the use of a single canonical NaN. In practice, on *most* platforms NaN payloads are supported and you can usually set them, in an implementation-defined manner, via the standard library function [`nan()`](http://en.cppreference.com/w/cpp/numeric/math/nan) of C and C++. – njuffa Jul 18 '17 at 21:07
  • The `nan()` function in C++ requires C++11. For old C++ we only have `std::numeric_limits<double>::quiet_NaN()`, and it does not support setting payloads. Is it safe to set the payload with a bitmask...? – Joe C Jul 18 '17 at 21:41
  • Yes, you could use bit operations to access the NaN encodings, provided normal caveats about type-reinterpretation are taken into account. I assumed C++11 is what people use at this point. I think it would benefit the question if you could clarify what programming language(s) and OS platforms you plan to use, how many different encodings you need, and how you plan to use the encodings (e.g. is there a need to not just set, but also *examine* the specific encodings used from *inside* the program? There's no C/C++ standard function for that, I think). – njuffa Jul 18 '17 at 22:04
  • @njuffa I updated my original question to address your questions. – Joe C Jul 18 '17 at 22:13
    Mentioning the target platform in the question would be helpful, since some software environments already pre-define some of the available NaN encodings for their own purposes. The specific example I gave in an [answer](https://stackoverflow.com/a/40791211/780717) to a related question was various Apple platforms. – njuffa Jul 18 '17 at 22:13

1 Answer


Generally speaking, IEEE-754 allows, but does not require, support for NaN payloads. However, here we have the specific case of x64 systems, and the relevant processors from AMD and Intel support NaN payloads.

IEEE Std 754-2008 further specifies that, within a NaN encoding, the most significant bit of the mantissa (fraction) distinguishes between quiet and signaling NaNs. This corresponds to the most significant stored mantissa bit of the single- and double-precision types. It follows that this bit cannot be used for custom encoding purposes. x64 processors generate the specific QNaN INDEFINITE in response to various exceptional situations, and the sign bit of that QNaN encoding is used for this purpose, so the sign bit is also off-limits for custom NaN-based flagging.
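For illustration, this is how the relevant bit positions of a 64-bit double map out (constant names are mine; `unsigned long` is 64 bits on LP64 Linux such as the asker's RHEL system):

```cpp
// Bit layout of an IEEE-754 binary64 value, expressed as masks.
// The quiet/signaling bit and the sign bit should be left untouched
// when embedding custom flags.
const unsigned long DBL_SIGN_BIT     = 0x8000000000000000UL; // bit 63
const unsigned long DBL_EXP_MASK     = 0x7FF0000000000000UL; // bits 62:52
const unsigned long DBL_QUIET_BIT    = 0x0008000000000000UL; // bit 51
const unsigned long DBL_PAYLOAD_MASK = 0x0007FFFFFFFFFFFFUL; // bits 50:0
```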

Various toolchains provide relaxed, non-IEEE-754-compliant "fast math" modes in which the propagation of NaNs is not guaranteed. You would need to compile with the strictest floating-point settings (e.g. the Intel compiler's -fp-model strict) to ensure the custom flagging does not get lost. Various software environments use NaN payloads to encode the particular event that gave rise to the creation of a NaN (Apple's SANE is a historical example of such a system). In my experience, such systems typically utilize the low-order bits of the mantissa portion of a NaN encoding.

This suggests that the high-order mantissa bits, say, bits 50:48 of an IEEE-754 double-precision number or bits 21:19 of an IEEE-754 single-precision number, are the best place for custom flags inside a NaN encoding (leaving the most significant mantissa bit untouched, as mentioned). Transporting data through both float and double types may be problematic, as the propagation of NaN payloads between different floating-point types is not specified by the x64 architecture, as best I can determine from reviewing AMD's original x64 architecture specification and Intel's latest documentation. Purely empirically, I find that NaN payloads are handled such that bit [n] of a single-precision encoding appears as bit [n+29] of the double-precision encoding, and vice versa.
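Expressed as masks (names are mine, derived from the bit positions above), the suggested flag locations and the empirical mapping look like this:

```cpp
// Suggested custom-flag positions inside a NaN payload (illustrative).
const unsigned long DBL_FLAG_MASK = 0x0007000000000000UL; // double bits 50:48
const unsigned int  FLT_FLAG_MASK = 0x00380000U;          // float  bits 21:19

// Empirical x64 mapping: float payload bit [n] <-> double payload bit [n+29],
// i.e. (unsigned long)FLT_FLAG_MASK << 29 == DBL_FLAG_MASK.
```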

Given the constraints on the programming language, it is best to use memcpy() to transfer between the floating-point and unsigned-integer representations, and to perform the required bit-level operations for setting, clearing, and testing custom NaN payloads in integer space. Many optimizing compilers will optimize the memcpy() away and replace it with hardware instructions that transfer data between x64 floating-point and integer registers, but you would want to double-check the generated machine code to make sure of that if the performance of these operations matters.
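A minimal C++98 sketch of what that could look like (function names and the choice of flag bit, bit 48, are mine; `unsigned long` is 64 bits on LP64 Linux):

```cpp
#include <cstring>   // std::memcpy
#include <limits>    // std::numeric_limits

// Custom "invalid" flag placed in a high-order payload bit (bit 48),
// away from the quiet/signaling bit (51) and the sign bit (63).
const unsigned long INVALID_FLAG = 0x0001000000000000UL;

// Return a quiet NaN whose payload carries the custom "invalid" flag.
double make_invalid()
{
    double d = std::numeric_limits<double>::quiet_NaN();
    unsigned long bits;
    std::memcpy(&bits, &d, sizeof bits);
    bits |= INVALID_FLAG;
    std::memcpy(&d, &bits, sizeof d);
    return d;
}

// Test whether a value is a NaN that carries the custom "invalid" flag.
bool is_invalid(double d)
{
    unsigned long bits;
    std::memcpy(&bits, &d, sizeof bits);
    bool is_nan = (bits & 0x7FF0000000000000UL) == 0x7FF0000000000000UL
               && (bits & 0x000FFFFFFFFFFFFFUL) != 0;
    return is_nan && (bits & INVALID_FLAG) != 0;
}
```

Since the file is written and read in binary mode, the payload bits survive the round trip unchanged, so a check along the lines of `is_invalid()` (hypothetical name) can be applied directly to each value after reading.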

njuffa
  • For single-precision floating point (the `float` type in C), is it still safe to use those high-order mantissa bits? – Joe C Jul 18 '17 at 23:01
    @JoeC Obviously, the high-order bits of the single precision mantissa are in a different place. I'll add that to the answer to avoid ambiguity. – njuffa Jul 18 '17 at 23:05
  • If bit [n+29] in double is mapped to bit [n] in float, does it mean the bit usage at https://stackoverflow.com/questions/40785756/is-ieee-754-floating-point-representation-wasting-memory/40791211#40791211 works only for double? It uses bits 8 through 15, and those bits would not exist in float given the above mapping. Or maybe such a conversion is not allowed in the Apple system? – Joe C Jul 19 '17 at 01:12
    @JoeC I don't know how the Apple SANE system handled floating-point conversions. It may have been tied to what the underlying hardware did, or it may have been handled in software. As I stated in the answer, my findings regarding the payload bit mapping are empirical and apply only to x64 processors. NaN-propagation behavior between data types may be completely different for other processor architectures. – njuffa Jul 19 '17 at 02:01
  • I found in my case that a float `f` = 8.81620763e-39 (0x00600000) is converted into a double `d` = 8.8162076311671563e-39 (0x3808000000000000) if I do `double d = f;`. Why was 0x380 added? – Joe C Jul 19 '17 at 02:13
    @JoeC I would suggest asking a new question if you need to figure out details of general floating-point format conversions. Note that the numbers in your example are *not* NaNs, and `8.81620763e-39f` in particular is a denormal (or subnormal) number (biased exponent of zero) which becomes a normal after conversion to double precision. My comments about bitmapping between `float` and `double` were explicitly restricted to NaN payloads. – njuffa Jul 19 '17 at 02:24
  • Sorry, I made a mistake when creating the NaNs. Those values are not NaNs. – Joe C Jul 19 '17 at 03:03