Encoding int value as an IEEE-754 float (binary32)

Question

Given the 32 bits that represent an IEEE 754 floating-point number, how can the number be converted to an integer, using integer or bit operations on the representation (rather than using a machine instruction or compiler operation to convert)?

I have the following function but it fails in some cases:

Input: int x (contains 32 bit single precision number in IEEE 754 format)

  if(x == 0) return x;

  unsigned int signBit = 0;
  unsigned int absX = (unsigned int)x;
  if (x < 0)
  {
      signBit = 0x80000000u;
      absX = (unsigned int)-x;
  }

  unsigned int exponent = 158;
  while ((absX & 0x80000000) == 0)
  {
      exponent--;
      absX <<= 1;
  }

  unsigned int mantissa = absX >> 8;

  unsigned int result = signBit | (exponent << 23) | (mantissa & 0x7fffff);
  printf("\nfor x: %x, result: %x",x,result);
  return result;
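The loop above simply drops the low 8 bits after normalizing, which amounts to rounding toward zero. A minimal sketch of the same conversion with round-to-nearest, ties-to-even (the mode `(float)x` typically uses) is shown here. The name `int_to_float_bits` and the use of `<stdint.h>` types are assumptions for illustration, not part of the original function.

```c
#include <stdint.h>

/* Returns the binary32 bit pattern of (float)x,
   rounding to nearest with ties going to the even mantissa. */
static uint32_t int_to_float_bits(int32_t x)
{
    if (x == 0)
        return 0;

    uint32_t sign = 0;
    uint32_t abs_x = (uint32_t)x;
    if (x < 0) {
        sign = 0x80000000u;
        abs_x = 0u - (uint32_t)x;   /* well defined even for INT32_MIN */
    }

    /* Normalize so the leading 1 sits in bit 31; 158 = 127 (bias) + 31. */
    uint32_t exponent = 158;
    while ((abs_x & 0x80000000u) == 0) {
        exponent--;
        abs_x <<= 1;
    }

    uint32_t mantissa  = (abs_x >> 8) & 0x7fffffu; /* the 23 stored bits */
    uint32_t discarded = abs_x & 0xffu;            /* the 8 bits that do not fit */
    uint32_t result    = (exponent << 23) | mantissa;

    /* Round up when past halfway, or exactly halfway with an odd mantissa.
       A carry out of the mantissa field increments the exponent, as required. */
    if (discarded > 0x80u || (discarded == 0x80u && (mantissa & 1u)))
        result++;

    return sign | result;
}
```

With this rounding step, the input `0x7eff8965` yields `0x4efdff13` (decimal 1325268755), whereas plain truncation yields `0x4efdff12`.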

This doesn't cast a float into an int. It just copies their machine representation bitwise, without e.g. converting `2.03e1` to `20` [by rounding] as the `(int)2.03e1` cast would. — Basile Starynkevitch, Sep 09 '12 at 20:59
You *want* to do it bitwise? Well, that's how you do it bitwise - it just reinterprets the bytes. No steps, really. — Ry-, Sep 09 '12 at 21:00
But 0x7eff8965 = 1325268755 (after casting). If you use the HEX in IEEE 754 Calc, you get 1.6983327e+38 and HEX to decimal gives: 2130676069 - none of them give the correct result of 1325268755. — Anonymous, Sep 09 '12 at 21:04
This code has undefined behavior in C. See section 6.5 in the standard. — Paul Hankin, Sep 09 '12 at 21:30
Is your question this: Given the 32 bits that represent a float x, how can the conversion `(int) x` be implemented, using integer/bit operations on the representation (rather than using a machine instruction to convert floating-point to integer)? — Eric Postpischil, Sep 09 '12 at 21:40
There's another related question by Anon: [Negate Floating Number in C](http://stackoverflow.com/questions/12336675/negate-floating-number-in-c-fails-in-some-cases), also about bitwise manipulation of IEEE 754 values. There was a second related question in the last 24 hours or so: [How to manually (bitwise) perform `(float)x`](http://stackoverflow.com/questions/12336314/how-to-manually-bitwise-perform-floatx). — Jonathan Leffler, Sep 09 '12 at 22:19
Indeed, http://stackoverflow.com/questions/12336314/how-to-manually-bitwise-perform-floatx Having the same problem, it doesn't want to round correctly... very frustrating — , Sep 10 '12 at 01:08
@Silver - yes! I still have to work on float_times_four. That is time consuming too! — Anonymous, Sep 10 '12 at 01:10
For float_times_four, you want to separate it into a bunch of cases (is NaN, is zero, is infinity, is normal, is denormalized (that last one was the part that took me a while)) — , Sep 10 '12 at 01:24
I have already done that. If less than 0x0071FFFF then just return uf*4, else just add to mantissa. But I am not sure what to do when both exponent and mantissa have to be changed. Also, are you converting the number and doing multiplication, or just manipulating bits in the IEEE form? — Anonymous, Sep 10 '12 at 01:27
I asked whether you were trying to convert a float (given its representation) to an int, and you answered yes. But your code looks like you are trying to convert an int to a float. Which is it? (The latter is addressed [here](http://stackoverflow.com/questions/12336314/how-to-manually-bitwise-perform-floatx).) — Eric Postpischil, Sep 10 '12 at 02:06
BTW, the code you posted converts a 32-bit signed int to its 32-bit IEEE 754 single-precision with rounding toward zero. I know because I wrote it yesterday. — Gabe, Sep 10 '12 at 04:45
duplicates: [How to manually (bitwise) perform (float)x?](https://stackoverflow.com/q/12336314/995714), [Converting Int to Float or Float to Int using Bitwise operations (software floating point)](https://stackoverflow.com/q/20302904/995714), [How to convert an unsigned int to a float?](https://stackoverflow.com/q/19529356/995714) — phuclv, May 15 '19 at 04:14
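For the opposite direction, which is what the question text itself describes (given the bits of a float, compute what `(int)x` would produce), a rough sketch could look like the following. The function name `float_bits_to_int` is hypothetical, and returning `INT32_MIN` for out-of-range values, infinities and NaNs is an assumed convention: the C standard leaves that conversion undefined.

```c
#include <stdint.h>

/* Sketch of (int)f computed from the binary32 bit pattern of f,
   truncating toward zero as the C cast does. */
static int32_t float_bits_to_int(uint32_t bits)
{
    uint32_t sign        = bits >> 31;
    int32_t  exponent    = (int32_t)((bits >> 23) & 0xffu) - 127; /* remove bias */
    uint32_t significand = (bits & 0x7fffffu) | 0x800000u;        /* implicit leading 1 */

    if (exponent < 0)
        return 0;           /* |f| < 1: covers zeros and subnormals too */
    if (exponent >= 31)
        return INT32_MIN;   /* assumed sentinel: out of range, infinity or NaN */

    /* The significand holds 24 bits with the binary point after bit 23,
       so shift left or right depending on the exponent. */
    uint32_t magnitude = (exponent >= 23)
        ? significand << (exponent - 23)
        : significand >> (23 - exponent);   /* drops the fraction bits */

    return sign ? -(int32_t)magnitude : (int32_t)magnitude;
}
```

The one in-range value with exponent 31, namely -2^31 (`0xcf000000`), happens to map to `INT32_MIN` as well, so the sentinel is also the correct result there.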

score 24 · Answer 1 · answered Nov 22 '13 at 18:23

C has the `union` to handle this kind of view of the data:

typedef union {
  int i;
  float f;
} u;

u u1;
u1.f = 45.6789f;
/* now u1.i holds the bits of the float, reinterpreted as an int */
printf("%d", u1.i);
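A variant of the same idea, sketched under the assumption that `float` is a 32-bit IEEE 754 binary32 value, uses `uint32_t` instead of `int`. Exact-width integer types have no padding bits, so the trap-representation concern does not arise for them, and unsigned shifts are well defined. The type and helper names here are illustrative only.

```c
#include <stdint.h>

typedef union {
    float    f;
    uint32_t u;
} float_bits;

_Static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits wide");

/* Sign bit: 1 for negative values (including -0.0f). */
static uint32_t float_sign(float f)
{
    float_bits v;
    v.f = f;
    return v.u >> 31;
}

/* Biased 8-bit exponent field. */
static uint32_t float_exponent(float f)
{
    float_bits v;
    v.f = f;
    return (v.u >> 23) & 0xffu;
}

/* 23-bit fraction field (without the implicit leading 1). */
static uint32_t float_fraction(float f)
{
    float_bits v;
    v.f = f;
    return v.u & 0x7fffffu;
}
```

For example, `float_exponent(45.6789f)` is 132, because 45.6789 lies between 2^5 and 2^6 and the bias is 127.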

answered Nov 22 '13 at 18:23

Carl

3

This is undefined behavior in every C standard I know of. – TLW Jul 26 '16 at 22:14
19

@TLW Type punning through union is not UB since C99. This is explicitly mentioned in, for example, N1256 6.5.2.3 footnote 82. – user694733 Feb 13 '17 at 10:15
Even in C99 it explicitly allows a trap representation. And using a trap representation is UB. I believe it would be legal for a compiler to unconditionally compile this code into `format_hard_drive();` as a result. – TLW Oct 02 '22 at 19:33
@TLW: No, that would not conform to the C standard. Using the value of a union member reinterprets the bytes as the type of the member. It cannot just produce any value; it must be the value determined by the representation scheme for the type. If that does result in a trap value, the behavior is not defined by the C standard. But the `int` type does not have any trap representations in most current C implementations. – Eric Postpischil Dec 20 '22 at 13:09
@TLW: You might be mixing this with an *indeterminate value* which occurs when an object is not initialized. If a value is indeterminate, the C implementation may behave as if it has any value of the type (each time it is used), rather than being required to interpret whatever value is in the bytes of its memory. But still, it must be a value of the type. If there are no trap values in the type, use of an indeterminate object may not trap, absent other issues. – Eric Postpischil Dec 20 '22 at 13:12
The only type that the C(99) standard guarantees not to have a trap representation is `unsigned char` (6.2.6.2.1) - which this code does not use. I believe my point stands. – TLW Dec 21 '22 at 02:56
And yes, integer types can have trap representations. (Aside from the aforementioned `unsigned char`.) – TLW Dec 21 '22 at 02:56

score 20 · Answer 2 · edited May 26 '16 at 23:50

`&x` gives the address of `x`, so it has type `float*`.

`(int*)&x` casts that pointer to a pointer to `int`, i.e. to an `int*`.

`*(int*)&x` dereferences that pointer to yield an `int` value. It won't do what you expect on machines where `int` and `float` have different sizes.

There could also be endianness issues.

This solution was used in the fast inverse square root algorithm.
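For context, a sketch of that algorithm is given here, with the bit reinterpretation done through `memcpy` rather than the pointer cast so that it does not depend on aliasing behavior. It assumes a 32-bit binary32 `float` and a positive, finite, normal input; the function name is illustrative.

```c
#include <stdint.h>
#include <string.h>

_Static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits wide");

/* Approximates 1/sqrt(x) for positive, finite, normal x. */
static float fast_inv_sqrt(float x)
{
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);        /* view the float's bits as an integer */
    bits = 0x5f3759dfu - (bits >> 1);      /* magic-constant initial guess */

    float y;
    memcpy(&y, &bits, sizeof y);           /* view the integer's bits as a float */
    return y * (1.5f - 0.5f * x * y * y);  /* one Newton-Raphson refinement step */
}
```

The integer subtraction works on the raw exponent and fraction fields at once, which is why the trick needs the bit pattern rather than the numeric value.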

edited May 26 '16 at 23:50

mihaib

answered Sep 09 '12 at 21:01

Basile Starynkevitch

So you are saying that the code just gets the location of x and prints it out? In that case, the value would change on each run. – Anonymous Sep 09 '12 at 21:09
1

No, it gives the integer contained at the location of the float, so, when `sizeof(int) == sizeof(float)`, it gives the `int` with the same machine bit representation as your `x`; nothing is printed unless you call a printing routine like `printf` (which is not in your question) – Basile Starynkevitch Sep 09 '12 at 21:18
Ok, so it gives the value stored at the location in memory and casts it to an int type. How can I do this without casting? – Anonymous Sep 09 '12 at 21:27
@BasileStarynkevitch: what would the problem with endianness be? If you are just looking to pick out the bits of a float, I don't think it would matter if ints are stored big- or little-endian. – Björn Lindqvist Jan 21 '18 at 16:00
Endianness would be a problem if you were converting a float to an unsigned int, where you are using the bits as flags and the sending function/program/device can only send floats. – Mark Walsh Jan 15 '19 at 20:37
It also violates the strict aliasing rule – palapapa Nov 19 '22 at 09:27
1

This is undefined behaviour. Use the union version to be safe. – Kai Dec 16 '22 at 13:50

score 8 · Answer 3 · answered Nov 23 '18 at 14:00

// With the proviso that your compiler implementation uses
// the same number of bytes for an int as for a float:
// example float
float f = 1.234f;
// get address of float, cast to pointer to int, dereference
int i = *((int *)&f);
// get address of int, cast to pointer to float, dereference
float g = *((float *)&i);
printf("%f %f %08x\n", f, g, (unsigned)i);

answered Nov 23 '18 at 14:00

Dino Dini

This doesn't add anything to the previous answers. – Matthieu Brucher Nov 23 '18 at 14:36
4

I do not agree. It's a nice self contained example, Mr. Brucher with a reputation of 6,923 – Dino Dini Nov 23 '18 at 15:40
2

Doesn't this violate the strict aliasing rule? – palapapa Nov 19 '22 at 09:25

score 6 · Answer 4 · edited Sep 09 '12 at 21:53

float x = 43.133f;
int y;

assert (sizeof x == sizeof y);
memcpy (&y, &x, sizeof x);
...
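A self-contained version of this approach might look like the sketch below. The helper names `float_to_bits` and `bits_to_float` are made up for illustration, and the code assumes `float` is a 32-bit binary32 value. Copying bytes with `memcpy` avoids both the aliasing and the union questions raised under the other answers.

```c
#include <stdint.h>
#include <string.h>

_Static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits wide");

/* Returns the bit pattern that represents f. */
static uint32_t float_to_bits(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return bits;
}

/* Returns the float whose representation is the given bit pattern. */
static float bits_to_float(uint32_t bits)
{
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}
```

Note that this only changes how the same 32 bits are viewed. If the input is already an integer holding the bit pattern, copying it into another integer changes nothing; the numeric conversion has to be computed from the sign, exponent and fraction fields.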

edited Sep 09 '12 at 21:53

answered Sep 09 '12 at 21:28

wildplasser

memcpy did not work. int x (contains 32 bit float) is the input, then int result; memcpy(&result, &x, 4) does not work. (4 is ok as it will only run on 32bit machines) – Anonymous Sep 09 '12 at 21:37
Maybe your assert (or your sizeof) is broken? BTW: Oops, I should have used x instead of f. BRB. – wildplasser Sep 09 '12 at 21:53

score 0 · Answer 5 · edited Apr 19 '16 at 18:28

You can cast the float using a reference. A cast like this should never generate any code.

C++

float f = 1.0f;
int i = (int &)f;
printf("Float %f is 0x%08x\n", f, i);

Output:

Float 1.000000 is 0x3f800000

If you want a C++-style cast, use `reinterpret_cast`, like this:

int i = reinterpret_cast<int &>(f);

It does not work with expressions; you have to store the result in a variable first.

    int i_times_two;
    float f_times_two = f * 2.0f;
    i_times_two = (int &)f_times_two;

    i_times_two = (int &)(f * 2.0f);
main.cpp:25:13: error: C-style cast from rvalue to reference type 'int &'

edited Apr 19 '16 at 18:28

answered Aug 11 '15 at 20:41

Johan Köhler

Please add some information about how your code works – Koopakiller Aug 11 '15 at 21:01
Your second example will work if you use an rvalue reference: replace ``(int&)`` with ``(int&&)``. This is required because the expression yields an rvalue, which non-const lvalue references cannot bind to. I assume you could also use ``(const int &)`` to bind to both. – nitronoid Aug 19 '17 at 10:34

score -1 · Answer 6 · answered Sep 09 '12 at 21:00

You cannot (meaningfully) convert a floating point number into an 'integer' (signed int or int) in this way.

It may end up having the integer type, but it's actually just an index into the encoding space of IEEE754, not a meaningful value in itself.

You might argue that an unsigned int serves dual purpose as a bit pattern and an integer value, but int does not.

Also there are platform issues with bit manipulation of signed ints.

answered Sep 09 '12 at 21:00

Alex Brown

7

Meaningful use: receiving two `int16_t`s on a bus, that actually represent a `float32`. Reinterpret the two `int16_t` as a float. – Gauthier Mar 30 '15 at 10:46
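The use case in the comment above could be sketched as follows. The word order is an assumption: `high` is taken to carry bits 31..16 of the binary32 pattern and `low` bits 15..0, which depends on the actual bus protocol. The function name is hypothetical.

```c
#include <stdint.h>
#include <string.h>

_Static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits wide");

/* Reassembles a float from two 16-bit words.
   Assumption: `high` holds the upper half of the bit pattern. */
static float words_to_float(int16_t high, int16_t low)
{
    /* Go through uint16_t first so negative words do not sign-extend. */
    uint32_t bits = ((uint32_t)(uint16_t)high << 16) | (uint32_t)(uint16_t)low;

    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}
```

For instance, the words `0x3f80` and `0x0000` combine into the pattern `0x3f800000`, which is `1.0f`.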

user19750105 · Answer 7 · 2022-08-12T10:39:03.360

Multiply the float by a scale factor of your choice. In this case I multiplied by 100,000, because five digits after the decimal point are significant in my application.

Convert the scaled value to bytes, then join them back together and divide by 100,000 again.

double angleX, angleY;
angleX = 3.2342;
angleY = 1.34256;
printf("%f, %f", (double)angleX, (double)angleY);

int remain, j;
int TxData[8];
j=0;
remain=0;
unsigned long data = angleX*100000;
printf("\ndata : %lu\n", data);

while(data>=256)
{
    remain= data%256;
    data = data/256;
    TxData[j]= remain;
    printf("\ntxData %d : %d", j, TxData[j]);
    j++;
}
TxData[j] = data;
printf("\ntxData %d : %d", j, TxData[j]);

int i=0;
long int angleSon=0;
for(i=0;i<=j;i++)
{

    angleSon += pow(256,i)*TxData[i]; 
    printf("\nangleSon : %li", angleSon);
}
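The same scale, split and rejoin idea can be written more compactly with shifts instead of `%`, `/` and `pow`. This is only a sketch: the names, the fixed four-byte little-endian layout and the restriction to non-negative values that fit in 32 bits after scaling are all assumptions made for illustration.

```c
#include <stdint.h>

#define ANGLE_SCALE 100000.0   /* assumed: keep five decimal places */

/* Scales a non-negative angle to an integer, rounding to nearest. */
static uint32_t angle_to_fixed(double angle)
{
    return (uint32_t)(angle * ANGLE_SCALE + 0.5);
}

/* Undoes the scaling. */
static double fixed_to_angle(uint32_t fixed)
{
    return fixed / ANGLE_SCALE;
}

/* Splits a 32-bit value into four bytes, least significant first. */
static void pack_le32(uint32_t value, uint8_t out[4])
{
    for (int i = 0; i < 4; i++)
        out[i] = (uint8_t)(value >> (8 * i));
}

/* Joins four bytes (least significant first) back into a 32-bit value. */
static uint32_t unpack_le32(const uint8_t in[4])
{
    uint32_t value = 0;
    for (int i = 0; i < 4; i++)
        value |= (uint32_t)in[i] << (8 * i);
    return value;
}
```

With these helpers, 3.2342 scales to 323420, which packs to the bytes `0x5c 0xef 0x04 0x00`; unpacking and calling `fixed_to_angle` recovers the value to five decimal places. Unlike the bit-reinterpretation answers, this is a fixed-point encoding and does not preserve the IEEE 754 representation.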
