0

I am trying to convert int32 values to float, and when I convert values above 0x0FFFFF the last decimal digit is always rounded to the nearest representable value. I know that a value that does not fit in the destination float will be rounded, but I need to know the limit at which this starts. E.g., 111111111 (0x69F6BC7) is printed as 111111112.0.

Vikas Rokade
  • Can you please update the question with the code snippet you tried? – Bharat S Apr 05 '21 at 11:24
  • @BharatS: This is not a debugging question and does not require showing code. The question is clear. – Eric Postpischil Apr 05 '21 at 11:26
  • You do know that the `float` type can (mostly) only store about 7 decimal figures accurately? – Weather Vane Apr 05 '21 at 11:27
  • The problem is that I am using a tool called Embedded Wizard. It has its own language based on C/C++, but it supports only float and not double. I have developed a numeric keypad, and when the user enters 1111111.11 I store it as 111111111 and divide by the unit of digit precision (in this case 100). To do that I need to convert 111111111 to float and then divide by 100.0 to store this value for the user. – Vikas Rokade Apr 05 '21 at 11:37
  • Does this answer your question? [Largest odd integer that can be represented as a float](https://stackoverflow.com/questions/52267201/largest-odd-integer-that-can-be-represented-as-a-float) – phuclv Apr 05 '21 at 12:03
  • duplicates: [Which is the first integer that an IEEE 754 float is incapable of representing exactly?](https://stackoverflow.com/q/3793838/995714), [Finding the smallest integer that can not be represented as an IEEE-754 32 bit float](https://stackoverflow.com/q/3890123/995714), [Range of integers that can be expressed precisely as floats / doubles](https://stackoverflow.com/q/15643067/995714), [Smallest integer not representable in single precision floating point](https://stackoverflow.com/q/27207149/995714) – phuclv Apr 05 '21 at 12:05
  • @phuclv: None of those four are for C except the second, and that one has no answer that would be a good solution here. – Eric Postpischil Apr 05 '21 at 13:14

1 Answer

4

The maximum integer value of a `float` significand is `FLT_RADIX/FLT_EPSILON - 1`. By “integer value” of a significand, I mean the value when it is scaled so that its lowest bit represents a value of 1.

The value `FLT_RADIX/FLT_EPSILON` is also representable in `float`, since it is a power of the radix. `FLT_RADIX/FLT_EPSILON + 1` is not representable in `float`, so converting an integer to `float` might result in rounding if the integer exceeds `FLT_RADIX/FLT_EPSILON` in magnitude.

If it is known that `INT_MAX` exceeds `FLT_RADIX/FLT_EPSILON`, you can test this for a non-negative `int x` with `(int) (FLT_RADIX/FLT_EPSILON) < x`. If it is not known that `FLT_RADIX/FLT_EPSILON` can be converted to `int` successfully, more complicated tests may be needed.
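As a minimal sketch of that test, the comparison can be wrapped in a small helper. The function name `may_round_as_float` is made up for illustration, and the code assumes, as stated above, that `INT_MAX` exceeds `FLT_RADIX/FLT_EPSILON` and that `x` is non-negative.

```c
#include <float.h>
#include <stdbool.h>

/* Hypothetical helper (not part of any library).
   Returns true if converting the non-negative int x to float might round,
   i.e. if x exceeds FLT_RADIX/FLT_EPSILON.
   Assumption: INT_MAX exceeds FLT_RADIX/FLT_EPSILON, so the cast is safe. */
static bool may_round_as_float(int x)
{
    return (int) (FLT_RADIX / FLT_EPSILON) < x;
}
```

A `true` result only means rounding is possible: some larger integers, such as even values just above the limit, still convert exactly.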

Very commonly, C implementations use the IEEE-754 binary32 format, also known as “single precision,” for `float`. In this format, `FLT_RADIX/FLT_EPSILON` is 2^24 = 16,777,216.
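To see the effect on a particular implementation, one option is to convert an integer to `float` and back and compare. The sketch below assumes `float` is binary32 with the default round-to-nearest mode; the helper names are invented for this example.

```c
#include <stdio.h>

/* Hypothetical helper: converts n to float and back to long.
   If the result differs from n, the conversion to float rounded.
   Assumption: n is small enough that the rounded value fits in long. */
static long round_trip_through_float(long n)
{
    volatile float f = (float) n;  /* volatile forces storage as a real float */
    return (long) f;
}

/* Prints the original value next to what survives the conversion. */
static void report_round_trip(long n)
{
    printf("%ld -> %ld\n", n, round_trip_through_float(n));
}
```

Under those assumptions, 16777216 comes back unchanged, 16777217 comes back as 16777216, and the value from the question, 111111111, comes back as 111111112.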

These symbols are defined in `<float.h>`. For `double` or `long double`, replace `FLT_EPSILON` with `DBL_EPSILON` or `LDBL_EPSILON`. `FLT_RADIX` remains unchanged since it is the same for all formats.

Theoretically, a perverse floating-point format might have an abnormally small exponent range that makes `FLT_RADIX/FLT_EPSILON - 1` not representable because the significand cannot be scaled high enough. This can be disregarded in practice.
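The same expression can be written once per type. This is only a sketch with made-up function names; the values it yields depend on the implementation's floating-point formats.

```c
#include <float.h>

/* Hypothetical helpers: each returns the power of the radix up to which
   every integer is representable in the corresponding type. */
static float float_exact_limit(void)
{
    return FLT_RADIX / FLT_EPSILON;
}

static double double_exact_limit(void)
{
    return FLT_RADIX / DBL_EPSILON;
}

static long double long_double_exact_limit(void)
{
    return FLT_RADIX / LDBL_EPSILON;
}
```

Where `double` is IEEE-754 binary64, `double_exact_limit()` is 2^53 = 9,007,199,254,740,992. The `long double` result varies more between platforms.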

Eric Postpischil