Why does float print a number ending in 632 (Y = 1399109632.000000), while double leaves the number unchanged (Y = 1399109568.000000)?

Question

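#include <stdio.h>
#include <string.h>

int main(void)
{
    float Y = 1399109568;        /* the constant is converted to float here */
    int D;
    printf("Y=%f\n", Y);         /* prints Y=1399109632.000000 */
    memcpy(&D, &Y, 4);           /* copy the 4 bytes of Y into D (assumes 4-byte int and float) */
    //printf("D=%d", D);
    return 0;
}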

It looks like `float` in your environment doesn't have enough precision to hold `1399109568`. — MikeCAT, May 14 '21 at 10:31
Does this answer your question? ['float' vs. 'double' precision](https://stackoverflow.com/questions/5098558/float-vs-double-precision) — alagner, May 14 '21 at 15:30

score 1 · Answer 1 · answered May 14 '21 at 10:53

In the format commonly used for float, a number is represented as:

a sign (+ or −),
an integer f from 0 to 16,777,215 (2²⁴−1), and
an exponent e, applied to base 2, from −149 to +104.

So, if we let s be +1 or −1 to represent the sign, the value represented is s•f•2^e.

(Often, the representation is described with f as a binary numeral starting with “0.” or “1.” and followed by 23 bits and e ranging from −126 to +127. These describe the same set of numbers, and which form is used is chosen based on convenience.)

For your number 1,399,109,568, we need e to be at least 7. Otherwise, using the largest f with an e of 6, the largest value we can represent is 16,777,215•2⁶ = 1,073,741,760.

With an exponent of 7, we can represent 1,399,109,504 as 10,930,543•2⁷, and we can represent 1,399,109,632 as 10,930,544•2⁷. So those are our two candidates for f, 10,930,543 and 10,930,544. They are adjacent integers, so we cannot choose an f in between them and get closer to 1,399,109,568. The closest we can get is 1,399,109,504 and 1,399,109,632.

Each of those differs from 1,399,109,568 by 64 in one direction or the other. Since they are equidistant, the usual default rounding rule chooses the one with the even f, 10,930,544. So, converting 1,399,109,568 to float yields 1,399,109,632.

The commonly used format for double has more precision. It represents a number as:

a sign (+ or −),
an integer f from 0 to 9,007,199,254,740,991 (2⁵³−1), and
an exponent e, applied to base 2, from −1,074 to +971.

So it can represent 1,399,109,568 exactly, for example with f = 1,399,109,568 and e = 0, because the number is far below 2⁵³.
