The smallest positive value of an IEEE-754 32-bit float is 2^-149, and the smallest positive normal value of a 16-bit float is 2^-14 (its smallest subnormal is 2^-24). If IEEE-754 standardized a 24-bit float, what would its smallest positive value be?

Excuse me if this is explicitly documented somewhere, but I wasn't able to find it in searches.

If you want to know about the practicality, this will be used to fit four floating-point values of varying precision into 64 bits for use in a memory-deficient system.

Ky -
  • The smallest positive 32-bit float is 2**-149, not 2**-126. 2**-126 is the smallest positive normal 32-bit float. – Eric Postpischil Sep 19 '12 at 22:18
  • If feasible you can use a common exponent for the 3 values like [RGBE format](https://en.wikipedia.org/wiki/RGBE_image_format) and unpack if needed when doing operations. For example if you don't need negative values and choose 7 bits for exponent then you have 19 bits for mantissas. Or you can use 3 sign bits and 7-bit exponent with 18-bit mantissa – phuclv Jun 01 '14 at 10:18
  • Oh I misread that you fit 3 floating-point values into 64 bits, but how can you fit 4 floats into 64 bits? Even if you use only one 24-bit number then you have only 40 bits remaining for the 3 other values, which results in precision even less than half-precision float. And if you use varying precision it may take even more bits to encode the length of the floats – phuclv Jun 01 '14 at 10:25
  • @LưuVĩnhPhúc Not all floats will be 24-bits. They will be two 24-bit floats and two 8-bit floats. Note that I said "of varying precision" in the same sentence. – Ky - Jun 10 '14 at 02:05
  • Yes, I saw "varying precision" before. But I don't think an 8-bit float is a useful type because of its limited precision. If you use a 1.4.3 format (1 sign, 4 exponent, 3 significand bits), it's precise to only a little more than one decimal digit. – phuclv Jun 10 '14 at 02:20
  • @LưuVĩnhPhúc I think you're veering too far off the topic. I didn't ask _if_ I should use a 24-bit float; I asked _how_ I would use it – Ky - Jun 10 '14 at 19:27
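
For the packing itself, going by the comment above that the four values are two 24-bit floats and two 8-bit floats, a minimal sketch might look like the following. It only moves raw bit patterns around; the function names `pack64` and `unpack64` and the field order are assumptions made for illustration, not something stated in the question.

```python
def pack64(a24, b24, c8, d8):
    """Pack two 24-bit and two 8-bit bit patterns into one 64-bit integer.

    Layout (an arbitrary choice for illustration), high to low bits:
    a24 | b24 | c8 | d8
    """
    for value, width in ((a24, 24), (b24, 24), (c8, 8), (d8, 8)):
        if not 0 <= value < (1 << width):
            raise ValueError(f"{value:#x} does not fit in {width} bits")
    return (a24 << 40) | (b24 << 16) | (c8 << 8) | d8


def unpack64(word):
    """Inverse of pack64: return (a24, b24, c8, d8)."""
    return (
        (word >> 40) & 0xFFFFFF,
        (word >> 16) & 0xFFFFFF,
        (word >> 8) & 0xFF,
        word & 0xFF,
    )


# Round-trip check with arbitrary example bit patterns
fields = (0xABCDEF, 0x123456, 0x7F, 0x01)
word = pack64(*fields)
assert word < (1 << 64)
assert unpack64(word) == fields
```

How each 24-bit or 8-bit pattern is interpreted as a float is a separate decision and depends on the exponent/significand split chosen for each field.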

1 Answer

IEEE-754 doesn't actually answer this question; it doesn't provide for standardizing a 24-bit format, and the usual formulas for determining the number of significand bits in a floating-point format break down for small widths.

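That said, the most natural choice would be one sign bit, seven exponent bits, and sixteen explicit significand bits. With the usual exponent bias (63 for seven bits), that makes the smallest positive normal number 2^-62 and the smallest positive subnormal number 2^-78.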
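
As a rough check of the 2^-62 figure, here is a small sketch that computes the smallest positive normal and subnormal values for any IEEE-754-style binary layout. It assumes the usual conventions (bias of 2^(e-1) - 1, an all-zeros exponent field reserved for zero and subnormals); the function name `smallest_positive` is hypothetical, and the 1-7-16 layout is the one suggested here, not anything the standard defines.

```python
from fractions import Fraction


def smallest_positive(exp_bits, frac_bits):
    """Return (smallest normal, smallest subnormal) as exact fractions.

    Assumes IEEE-754-style conventions: bias = 2**(exp_bits - 1) - 1,
    with exponent field 0 reserved for zero and subnormals.
    """
    bias = 2 ** (exp_bits - 1) - 1
    emin = 1 - bias  # exponent of the smallest normal number
    smallest_normal = Fraction(2) ** emin
    smallest_subnormal = Fraction(2) ** (emin - frac_bits)
    return smallest_normal, smallest_subnormal


# Sanity checks against the standard formats
assert smallest_positive(8, 23) == (Fraction(2) ** -126, Fraction(2) ** -149)  # binary32
assert smallest_positive(5, 10) == (Fraction(2) ** -14, Fraction(2) ** -24)    # binary16

# Suggested 24-bit layout: 1 sign, 7 exponent, 16 explicit significand bits
normal, subnormal = smallest_positive(7, 16)
assert normal == Fraction(2) ** -62
assert subnormal == Fraction(2) ** -78
```

Under those assumptions, the smallest positive value of the 24-bit format (counting subnormals, as the 2^-149 figure for 32-bit floats does) would be 2^-78.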

Stephen Canon
  • Can you elaborate on why seven exponent bits would be more natural than six (say)? I see that greater than 5 (float16) and fewer than 8 (float32) makes sense, but I'm curious what criterion you used to pick 7 rather than 6 exponent bits. – Mark Dickinson Sep 19 '12 at 21:28
  • One primary reason to use floating-point numbers is that they offer greater dynamic range than integers of the same width. With a six-bit exponent, the representable range is `2^-30` ... `2^31`; in our modern era of fairly cheap 64-bit hardware, this doesn't offer much to the user. The extra dynamic range from having another exponent bit is a good tradeoff, which I would expect most users to prefer to a seventeenth significand bit. – Stephen Canon Sep 19 '12 at 22:21
  • It's almost tempting to suggest an 8-bit exponent and 16-bit precision (including hidden bit), for ease of conversion to / from binary32 format. – Mark Dickinson Sep 20 '12 at 08:02
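
To illustrate why that last layout converts so easily: with 1 sign bit, 8 exponent bits, and 15 explicit significand bits, the 24-bit pattern is just the top 24 bits of the binary32 encoding. The sketch below assumes simple truncation (the low 8 significand bits are dropped, with no rounding), and the names `f32_to_f24_bits` and `f24_bits_to_f32` are made up for this example.

```python
import struct


def f32_to_f24_bits(x):
    """Keep the top 24 bits of the binary32 encoding (truncates, no rounding)."""
    (bits32,) = struct.unpack("<I", struct.pack("<f", x))
    return bits32 >> 8


def f24_bits_to_f32(bits24):
    """Pad the 24-bit pattern with eight zero significand bits and decode."""
    (x,) = struct.unpack("<f", struct.pack("<I", (bits24 & 0xFFFFFF) << 8))
    return x


# Values that need at most 15 fraction bits survive the round trip exactly
assert f24_bits_to_f32(f32_to_f24_bits(1.5)) == 1.5
assert f24_bits_to_f32(f32_to_f24_bits(-2.0)) == -2.0

# The smallest positive value of this layout is the subnormal 2**-141
assert f24_bits_to_f32(1) == 2.0 ** -141
```

Truncation always rounds toward zero; a round-to-nearest conversion would need a little more bit manipulation before the shift.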