2

Consider the following program:

#include <stdio.h>

int main()
{
    unsigned int a = 10;
    unsigned int b = 20;
    unsigned int c = 30;
    float d = -((a*b)*(c/3));
    printf("d = %f\n", d);
    return 0;
}

It is very strange that the output is

d = 4294965248.000000

When I change the magic number 3 in the expression for d to 3.0, I get the correct result:

d = 2000.000000

If I change the type of a, b and c to int, I also get the correct result.

I guess this error is caused by the conversion from unsigned int to float, but I do not know the details of how the strange result is produced.

gzh

6 Answers

2

Note that the minus is applied to an unsigned int before the assignment to float, so the negation itself is done in unsigned arithmetic. If you run the code below, you will most likely get 4294965296:

#include <stdio.h>

int main()
{
    unsigned int a = 10;
    unsigned int b = 20;
    unsigned int c = 30;
    printf("%u", -((a*b)*(c/3)));

    return 0;
}

The -2000 that conceptually appears on the right of your equals sign is a 32-bit two's-complement pattern with the hexadecimal value 0xFFFFF830. Because the expression has unsigned type, all 32 bits are kept as they are, and interpreted as a positive number that pattern is 4294965296. If you printed it with the %d format, the same 32 bits would be interpreted as a signed integer and you would see -2000; with %u it prints as 4294965296.

#include <stdio.h>
#include <limits.h>

int main()
{
    float d = 4294965296;
    printf("d = %f\n\n", d);
    return 0;
}

When you convert 4294965296 to float, the number is too long to fit into the significand, so some precision is lost. Because of that loss, you get 4294965248.000000, just as I did.

The IEEE-754 floating-point standard is a standard for representing and manipulating floating-point quantities that is followed by all modern computer systems.

bit  31 30    23 22                    0
     S  EEEEEEEE MMMMMMMMMMMMMMMMMMMMMMM

The bit numbers count from the least-significant bit. The first bit is the sign (0 for positive, 1 for negative). The following 8 bits are the exponent in excess-127 binary notation; this means that the binary pattern 01111111 = 127 represents an exponent of 0, 10000000 = 128 represents 1, 01111110 = 126 represents -1, and so forth. The significand occupies the remaining 23 bits, with its leading 1 bit stripped off, giving 24 bits of precision. Source

As you can see, when converting 4294965296 to float, the low-order bits that do not fit into the 24-bit significand are lost:

1111 1111 1111 1111 1111 1000 0011 0000  <-- 4294965296 (0xFFFFF830)
1111 1111 1111 1111 1111 1000 0000 0000  <-- 4294965248 (0xFFFFF800)
                              ^^^^ ^^^^
                       low 8 bits rounded away
  • +1 this is a helpful analysis, although I would mention that IEEE754 is not compulsory although highly probable. – Bathsheba Jun 14 '18 at 15:14
  • @snr, thanks for your detailed explanation. Are there any gcc options that can give a warning for this dangerous conversion? – gzh Jun 15 '18 at 04:42
  • @gzh I don't know about any warnings. But this is not a dangerous conversion; it is just the way the arithmetic works. It is always important to consider in which data type a calculation is done. A famous example is `1/2*2`: the result is `0` because `1/2` evaluates to `0`. – Kami Kaze Jun 15 '18 at 06:11
  • @gzh I'm using CLion on macOS however same thing is true of ubuntu highly likely. Glance at the image --> https://i.imgur.com/GS8aroJ.gif – Soner from The Ottoman Empire Jun 15 '18 at 15:47
1

Your whole calculation is done in unsigned arithmetic, so it is the same as

 float d = -(2000u);

-2000 in unsigned int (assuming a 32-bit int) is 4294965296.

This gets written into your float d. But since a float cannot store this exact number, it is stored as 4294965248.

As a rule of thumb, you can say that a float has a precision of about 7 significant decimal digits.

What is calculated is 2^32 - 2000 and then floating point precision does the rest.


If you instead use 3.0, this changes the types in your calculation as follows:

float d = -((a*b)*(c/3.0));
float d = -((unsigned*unsigned)*(unsigned/double));
float d = -((unsigned)*(double));
float d = -(double);

leaving you with the correct negative value.

Kami Kaze
  • Actually you don't get the correct value; there is probably some rounding to `float`. – Bathsheba Jun 14 '18 at 15:08
  • @Bathsheba I think `-2000` should not be a problem in single precision floating point. – Kami Kaze Jun 14 '18 at 18:34
  • Indeed, but don't you see that one of the issues is a negation of an unsigned value? – Bathsheba Jun 14 '18 at 18:42
  • @Bathsheba I do not understand what you mean. The answer has two parts. The first handles the unsigned part with the problem of the negation. The second part argues what happens with a double present in the formula. The double makes the result in the parentheses a double, making negation possible without any side effects. After that the result (-2000.0) gets assigned to `float d`, which should have no problem handling that without error. – Kami Kaze Jun 14 '18 at 18:48
1

This is because you use - on an unsigned int. Unary minus on an unsigned value does not produce a negative number; it computes the two's complement (invert the bits and add one), i.e. the value modulo 2^32. Let's print some unsigned integers:

printf("Positive: %u\n", 2000);
printf("Negative: %u\n", -2000);

// Output:
// Positive: 2000
// Negative: 4294965296

Let's print the hex values:

printf("Positive: %x\n", 2000);
printf("Negative: %x\n", -2000);

// Output
// Positive: 7d0
// Negative: fffff830

As you can see, the negation wraps around to a large positive value. So the problem comes from using - on an unsigned int, not from converting unsigned int to float.

Conradin
1

As others have said, the issue is that you are trying to negate an unsigned number. Most of the solutions already given have you do some form of casting to float, so that the arithmetic is done on floating-point types. An alternative is to cast the result of your arithmetic to int and then negate, so that the arithmetic operations are done on integral types. Depending on your actual use case, this may or may not be preferable:

#include <stdio.h>

int main(void)
{
    unsigned int a = 10;
    unsigned int b = 20;
    unsigned int c = 30;
    float d = -(int)((a*b)*(c/3));
    printf("d = %f\n", d);
    return 0;
}
Christian Gibbons
0

-((a*b)*(c/3)); is performed entirely in unsigned integer arithmetic, including the unary negation. Unary negation is well-defined for an unsigned type: mathematically the result is reduced modulo 2^N, where N is the number of bits in unsigned int. When you assign that large number to the float, you encounter some loss of precision; because of the number's magnitude, the result is the nearest representable float, which here happens to be a multiple of 2048 (4294965248 = 2^32 - 2^11).

If you change 3 to 3.0, then c / 3.0 has type double, and the result of a * b is therefore converted to double before the multiplication. This double is then assigned to a float; -2000 is exactly representable in float, so this time no precision is lost.

Bathsheba
  • "precision loss" applicability is unclear in OP's case. – chux - Reinstate Monica Jun 14 '18 at 15:28
  • @chux: It's hard to say what kind of loss it would be beyond our knowing what the intermediate result is (we know it's a 32-bit unsigned), and the scheme for the `float` on the platform (although we can conjecture it's IEEE754 due to the observed value). Which is why I was woolly in the answer. But I have now pointed out a property of the observed result. – Bathsheba Jun 14 '18 at 15:29
0

You need to cast the ints to floats, changing

 float d = -((a*b)*(c/3));

to

float d = -(((float)a*(float)b)*((float)c/3.0));
Bing Bang