Why does a C floating-point type modify the actual input of 125.1 to 125.099998 on output?

Question

I wrote the following program:

    #include <stdio.h>
    int main(void)
    {
        float f;
        printf("\nInput a floating-point no.: ");
        scanf("%f", &f);
        printf("\nOutput: %f\n", f);
        return 0;
    }

I am on Ubuntu and used GCC to compile the above program. Here is my sample run and output I want to inquire about:

Input a floating-point no.: 125.1
Output: 125.099998

Why does the precision change?

More information: http://stackoverflow.com/questions/2100490/floating-point-inaccuracy-examples — Steve Jessop, Jun 30 '11 at 09:45
possible duplicate of [Why is this C program giving a 'wrong' output?](http://stackoverflow.com/questions/2808478/why-is-this-c-program-giving-a-wrong-output) — outis, Jun 30 '11 at 11:55

score 12 · Answer 1 · answered Jun 30 '11 at 09:10

Because the number 125.1 cannot be represented exactly as a binary floating-point number. This happens in most programming languages. Use e.g. printf("%.1f", f); if you want to print the number with one decimal place, but be warned: the stored number itself is not exactly equal to 125.1.

Harsh Vardhan · Accepted Answer · 2011-07-24T17:33:57.903

Thank you all for your answers. Although almost all of you helped me look in the right direction I could not understand the exact reason for this behavior. So I did a bit of research in addition to reading the pages you guys pointed me to. Here is my understanding for this behavior:

Single Precision Floating Point numbers typically use 4 bytes for storage on x86/x86-64 architectures. However not all 32 bits (4 bytes = 32 bits) are used to store the magnitude of the number.

For storing as a single precision floating type, the input stream is formatted in the following notation (somewhat similar to scientific notation):

(-1)^s x 1.m x 2^(e-127), where
  s = sign of the number, range:{0,1} - takes up 1 bit
  m = mantissa (fractional portion) of the number - takes up 23 bits
  e = exponent of the number offset by 127, range:{0,..,255} - takes up 8 bits

and then stored in memory as

0th byte 1st byte 2nd byte 3rd byte
mmmmmmmm mmmmmmmm emmmmmmm seeeeeee

Therefore the decimal number 125.1 is first converted to binary form but limited to 24 significant bits, so that the mantissa (the part after the leading 1, which is not stored) takes no more than 23 bits. After conversion to binary form:

125.1 = 1111101.00011001100110011

NOTE: 0.1 in decimal needs infinitely many bits in binary, but the computer limits the fractional part to 17 bits here so that the complete representation does not exceed 24 bits.

Now converting it into the specified notation we get:

125.1 = 1.111101 00011001100110011 x 2^6
      = (-1)^0 x 1.111101 00011001100110011 x 2^(133-127)

which implies

s = 0
m = 11110100011001100110011
e = 133 = 10000101

Therefore, 125.1 will be stored in memory as:

0th byte 1st byte 2nd byte 3rd byte
mmmmmmmm mmmmmmmm emmmmmmm seeeeeee
00110011 00110011 11111010 01000010

When the value is passed to the printf() function, the output is generated by converting the binary form back to decimal form. On a little-endian machine such as x86 the bytes are stored least significant first, so reading from the most significant byte down gives this order:

3rd byte 2nd byte 1st byte 0th byte
seeeeeee emmmmmmm mmmmmmmm mmmmmmmm
01000010 11111010 00110011 00110011

Next, it is converted into the specific notation for conversion

(-1)^0 x 1.111101 00011001100110011 x 2^(133-127)

On simplifying the above representation further:

= 1.111101 00011001100110011 x 2^6
= 1111101.00011001100110011

And finally converting it to decimal:

= 125.09999847412109375

~~but single precision floating point can represent only up to 6 decimal places, therefore~~ the answer is rounded off to 125.099998 (`%f` prints six digits after the decimal point by default).

This is exactly right up until the last few paragraphs. The rounding in `printf` is not because "single precision floating point can represent only up to 6 decimal places" -- in fact, your value is converted to `double` before being passed to `printf`, which doesn't change its value, but does mean that `printf` has no way to know that it was originally a single-precision value or use that information to inform the rounding. – Stephen Canon, Jun 30 '11 at 23:47
You explained it best yourself. Put differently, the fraction 0.1 can't be written as a finite sum of negative powers of 2. This can happen in general when converting from one base to another using a finite number of digits (i.e. truncating or rounding at some point). As you may have noticed, the mathematically exact representation of the base-10 number 125.1 in base 2 is 1111101.0(0011), where (0011) indicates the repeating part of the binary number. So there's no way to store exactly 125.1 this way; you can only get close and leave the rest to rounding. – ShinTakezou, Jul 01 '11 at 16:48
Thank you @Stephen for pointing out my incorrect assumption. And thank you Shin for summarizing the limitations of base conversion even better. Could any of you also explain to me how floating-point values are rounded by printf()? – Harsh Vardhan, Jul 24 '11 at 17:39

score 2 · Answer 3 · edited Jun 30 '11 at 18:44

Think about a fixed point representation first.

2^3=8 2^2=4 2^1=2 2^0=1 2^-1=1/2 2^-2=1/4 2^-3=1/8 2^-4=1/16

If we want to represent a fraction then we set the bits to the right of the point, so 5.5 is represented as 01011000.

But if we want to represent 5.6, there is not an exact fractional representation. The closest we can get is 01011010 == 5.625 (the next value down, 01011001 == 5.5625, is further away)

1/2 + 1/8 = 0.625

2^-1 + 2^-3

score 0 · Answer 4 · answered Jun 30 '11 at 09:11

Because it's the closest representable value to 125.1; remember that single-precision floating-point numbers are just 32 bits.

user822715


score 0 · Answer 5 · answered Jun 30 '11 at 09:14

If I ask you to write 1/3 down as a decimal number, you realize there are numbers which have no finite representation. .1 is the exact representation of 1/10, so this problem does not appear there, BUT that holds only in decimal representation. In binary representation .1 is one of those numbers that require infinitely many digits. As your number must be cut off somewhere, something is lost.

Alexander · Answer 6 · 2011-06-30T09:25:28.610

No floating-point number has an exact representation; they all have limited accuracy. When converting from a number in text to a float (with scanf or otherwise), you're in another world with different kinds of numbers, and precision may be lost. The same goes for converting from a float to a string: you decide how many digits you want. You can't know "how many digits there are" in a float before converting to text or another format that can keep that information. This all has to do with how floats are stored:

significant_digits * base^exponent

edited Jun 30 '11 at 09:25

answered Jun 30 '11 at 09:17


indeed there are fp numbers that have an exact representation as decimal numbers; e.g. 0.5 should cause no such problems as x.1 – ShinTakezou Jun 30 '11 at 11:35
ShinTakezou, you are right, some numbers are within the precision of floating point numbers, and will stay the same after the roundtrip from string to float to string. However, `printf("%f", 0.5)` prints out `0.500000`, which is the same value, but a different string, due to the float representation being different. This is relevant, as the OP is dealing with numbers as strings and then as float. – Alexander Jun 30 '11 at 11:51
hm, I was not talking about string representation but about numbers that can be written as a finite sum of term like A*base^N. The point was that IEEE fp encodes the "fractional" part as binary number. A non-irrational and non-periodic number in base 10 (the kind of number we always deal with when using a finite-long string) can't always have a finite expansion in term of base 2, i.e. not always it can be written as a finite sum of (negative and positive) powers of 2. Truncating the sum, you obtain a number which is the nearest possible according to the given "precision". – ShinTakezou Jun 30 '11 at 17:00
of course, as result of computation like 1.0/3.0, we "have" periodic number that has no finite expansion in base 10, so you truncate it (your string has finite length) as "0.333333", and then you try to convert it into binary based fp number to store it into a float/double whatever... the number "0.333333" is rational and not-periodic, though its base 2 expansion could need an infinite number of bits – ShinTakezou Jun 30 '11 at 17:06
Even if you were not talking about string representation, the question is talking about string representation. Other than that, I agree with what you say. – Alexander Jun 30 '11 at 17:32
no it is not. the number 125.1 can't be exactly represented in IEEE fp. Other floating point "choices" could store this value exactly, without losing information. So the problem is not the string representing the number. The problem is the number itself, not the way you input it, and the way it is represented *internally* (i.e. using a base 2 significand... which is the natural choice for a digital computer, but it is not necessarily the best for all cases, e.g. financial mathematics can't reliably use this "kind" of fp numbers). – ShinTakezou Jun 30 '11 at 21:30
The question is specifically about converting from a string, to a float and back to a string, and why the precision changes. It changes because of the lacking accuracy of the float. Agree? – Alexander Jul 01 '11 at 07:56
Yes and no. I was just stressing the fact that the conversion (from string) is not the real problem. The real problem is that 125.1 can't be exactly represented with a finite number of digits in base 2, and IEEE fp is base 2. This won't change; reading with scanf and printing with printf do not create the problem, since the stored number can't be the decimal number 125.1, no matter how you show it (just to make it clear, it's better to go and show the bits and do a bit of math, as the OP himself did) – ShinTakezou Jul 01 '11 at 16:52
@Alexander let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/1061/discussion-between-shintakezou-and-alexander) – ShinTakezou Jul 01 '11 at 16:52

score -1 · Answer 7 · answered Jun 30 '11 at 09:25

The normal type used for floating point in C is double, not float. Your float is implicitly converted to a double when it is passed to printf, and because the float is less precise, the difference to the closest representable number to 125.1 is more apparent (and printf's default precision is tailored for use with doubles). Try this instead:

#include<stdio.h>
int main(void)
{
    double f;
    printf("\nInput a floating-point no.: ");
    scanf("%lf",&f);
    printf("\nOutput: %f\n",f);
    return 0;
}

ysth


This is correct, but has little to do with the problem the questioner is asking about. – Stephen Canon Jun 30 '11 at 18:36
I disagree; all the other answers are true but not helpful. The questioner's problem is that the printf's default precision is intended to deal with the difference between an exact value and its nearest representable double value, and it is being given a float (implicitly typecast to a double) instead, which has a much greater difference. – ysth Jun 30 '11 at 20:00
I would argue that the questioner's problem is that he expects a value that is not representable in floating-point (neither as a `float` nor as a `double`) to be preserved by decimal to binary conversion. – Stephen Canon Jun 30 '11 at 20:26
Thank you for the answer @ysth. I wanted to know why the value was not preserved on decimal to binary conversion. However, your answer did prompt me to consider if it is typecasting that I have not understood. – Harsh Vardhan Jun 30 '11 at 21:03
