Are the 8-bit exponent and 23-bit mantissa sizes in the IEEE 754 representation arbitrary, or is there a specific reason for these sizes?
-
duplicates: [What is the rationale for exponent and mantissa sizes in IEEE floating point standards?](https://stackoverflow.com/q/4397081/995714), [How are IEEE-754 single and double precision formats determined?](https://stackoverflow.com/q/23064893/995714), [Why do higher-precision floating point formats have so many exponent bits?](https://stackoverflow.com/q/40775949/995714), [Where did the free parameters of IEEE 754 come from?](https://retrocomputing.stackexchange.com/q/13493/1981) – phuclv Oct 07 '22 at 13:14
-
Does this answer your question? [What is the rationale for exponent and mantissa sizes in IEEE floating point standards?](https://stackoverflow.com/questions/4397081/what-is-the-rationale-for-exponent-and-mantissa-sizes-in-ieee-floating-point-sta) – phuclv Oct 07 '22 at 13:14
2 Answers
I'm not an expert, but this article makes it sound like the IEEE 754 choice of an 8-bit floating-point exponent is based on an earlier DEC standard used on VAX machines. An interesting piece of trivia: it also sounds like the decision to give the double representation an 11-bit exponent might have contributed to the success of the IEEE 754 standard. As for why the 8-bit exponent became standard on the VAX machines, that will require some more digging.
With 1 sign bit and an 8-bit exponent, a 23-bit mantissa is what remains to fill out a 32-bit representation in the 754 format. If you want to look into this further, there are other important size decisions to be made besides these two. For example, the number of guard bits used during arithmetic, usually 2, is another size choice that is not standard across processors. Finally, if you haven't read it, this article is great standard reading on the details and history of the floating-point representation.
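To make the 1 + 8 + 23 layout concrete, here is a minimal Python sketch that splits a value into its three fields. The helper name `float32_fields` is my own and purely illustrative; it relies only on the standard `struct` module to round the input to single precision.

```python
import struct

def float32_fields(x):
    """Round x to IEEE 754 single precision and return (sign, exponent, mantissa).

    The exponent is returned in its stored (biased) form and the mantissa
    as the raw 23-bit fraction field, without the implicit leading 1.
    """
    # Pack as a big-endian float32, then reinterpret the same 4 bytes as an unsigned int.
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31                # top 1 bit
    exponent = (bits >> 23) & 0xFF   # next 8 bits
    mantissa = bits & 0x7FFFFF       # low 23 bits
    return sign, exponent, mantissa

print(float32_fields(1.0))    # (0, 127, 0)
print(float32_fields(-0.75))  # (1, 126, 4194304)
```

For `-0.75` the value is -1.5 × 2^-1, so the stored exponent is 127 - 1 = 126 and the fraction field holds 0.5 × 2^23 = 4194304.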

These numbers are for single-precision floating-point numbers, which are represented with 32 bits. One bit is for the sign, so 31 bits are left to share between the mantissa and the exponent. How these 31 bits are split is a compromise between accuracy (mantissa) and total range (exponent): more bits for the mantissa give higher accuracy but a smaller range, and vice versa.
I think 23 bits for the mantissa and 8 bits for the exponent is a reasonable compromise between accuracy and range for many practical applications.
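The trade-off can be put into rough numbers with a small Python sketch. It assumes every split follows the IEEE 754 conventions (a bias of 2^(e-1) - 1, the all-ones exponent reserved for infinity and NaN, and an implicit leading 1); the function name `split_properties` and the alternative splits 5/26 and 11/20 are hypothetical, chosen only for comparison.

```python
def split_properties(exp_bits, man_bits):
    """Return (largest finite value, relative spacing) for a hypothetical format.

    Assumes IEEE 754-style conventions; only the 8/23 split is an actual
    IEEE 754 format here, the others are for comparison.
    """
    bias = 2 ** (exp_bits - 1) - 1
    # The all-ones exponent code is reserved, so the largest usable code is 2^e - 2.
    max_exp = (2 ** exp_bits - 2) - bias
    largest = (2 - 2.0 ** -man_bits) * 2.0 ** max_exp
    # Gap between 1.0 and the next representable value.
    epsilon = 2.0 ** -man_bits
    return largest, epsilon

for exp_bits in (5, 8, 11):
    man_bits = 31 - exp_bits  # 1 of the 32 bits is taken by the sign
    largest, epsilon = split_properties(exp_bits, man_bits)
    print(f"exponent={exp_bits:2d} mantissa={man_bits:2d} "
          f"largest={largest:.3e} epsilon={epsilon:.3e}")
```

Under these assumptions, a 5-bit exponent would top out just below 65536, while an 11-bit exponent would reach beyond 10^307 at the cost of three fewer mantissa bits.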
