Floating Point to Binary Value(C++)

Question

I want to take a floating point number in C++, like 2.25125, and fill an int array with the binary value that is used to store the float in memory (IEEE 754).

So I could take a number, and end up with an int num[16] array with the binary value of the float: num[0] would be 1 num[1] would be 1 num[2] would be 0 num[3] would be 1 and so on...

Putting an int into an array isn't difficult; getting the binary value of a float is where I'm stuck. Can you just read the binary in the memory that the float variable occupies? If not, how could I go about doing this in C++?

EDIT: The reason for doing the comparison this way is that I want to learn to do bitwise operations in C++.

Out of curiosity - why do you need one integer per bit? – Jan 23 '09 at 19:30

score 32 · Accepted Answer · edited Oct 04 '19 at 06:18

Use union and bitset:

#include <iostream>
#include <bitset>
#include <climits>

int main()
{
    union
    {
        float input; // assumes sizeof(float) == sizeof(int)
        int   output;
    } data;

    data.input = 2.25125;

    std::bitset<sizeof(float) * CHAR_BIT> bits(data.output);
    std::cout << bits << std::endl;

    // or
    std::cout << "BIT 4: " << bits[4] << std::endl;
    std::cout << "BIT 7: " << bits[7] << std::endl;
}

It may not be an array, but you can access the bits with the [] operator as if you were using an array.

Output

$ ./bits
01000000000100000001010001111011
BIT 4: 1
BIT 7: 0
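
The comments below dispute whether reading the union member that was not last written is allowed in C++. A minimal sketch that avoids the question copies the bytes with std::memcpy instead. The helper name float_bits is made up for this illustration, and the code assumes a 32-bit float:

```cpp
#include <bitset>
#include <climits>
#include <cstdint>
#include <cstring>

// Hypothetical helper: returns the bit pattern of a float as a bitset.
// Assumes float is 32 bits wide, which the static_assert checks.
std::bitset<32> float_bits(float value)
{
    static_assert(sizeof(float) * CHAR_BIT == 32, "this sketch assumes a 32-bit float");

    std::uint32_t raw = 0;
    std::memcpy(&raw, &value, sizeof raw);  // copy the object representation
    return std::bitset<32>(raw);
}
```

Calling float_bits(2.25125f) should give the same bitset as the union version, so individual bits can be read with [] in the same way.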

edited Oct 04 '19 at 06:18

tambre

answered Jan 23 '09 at 19:01

Martin York

ieee754 floats are always 32 bits, c++ is spec'ed to use ieee754 for its floating point types. Long is also spec'ed to be 32 bits. Change the union to use long instead of int, and you'll have truly portable code. – deft_code Feb 11 '09 at 17:50
@deft_code: C++ is **NOT** spec'ed to use ieee754 (it can be). Long is **NOT** spec'ed as 32 bits (it must be at least 32). This will never be portable as assigning to one field in a union and reading from another is unspecified behavior. If I am incorrect about either of the above please let me know the clause in the C++ standards where it is defined because a simple search showed both statements as wrong. – Martin York May 04 '13 at 00:22
@deft_code not only that, but it's also false that "ieee754 floats are always 32 bits". Re-read the standard and note the 3 types specified there, then consider deleting your comment already. – underscore_d Nov 14 '15 at 23:35
This is UB. Please don't ever do this. – DexterHaxxor May 19 '20 at 11:21
@MichalŠtein It's **implementation**-defined behavior. This technique is heavily used in C code, and for backwards compatibility (a very important consideration in C++ when new features are designed) it needs to work in C++. – Martin York May 20 '20 at 01:14
@MartinYork It's UB in C++. – DexterHaxxor Jun 06 '20 at 16:27
@MichalŠtein What clause in the standard are you using to make that claim? – Martin York Jun 06 '20 at 16:39

Mehrdad Afshari · Answer 2 · 2009-01-23T21:44:24.990

15

int fl = *(int*)&floatVar; //assuming sizeof(int) = sizeof(float)

int binaryRepresentation[sizeof(float) * 8];

for (int i = 0; i < sizeof(float) * 8; ++i)
    binaryRepresentation[i] = ((1 << i) & fl) != 0 ? 1 : 0;

Explanation

(1 << i) shifts the value 1, i bits to the left. The & operator computes the bitwise and of the operands.

The for loop runs once for each of the 32 bits in the float. Each time, i will be the number of the bit we want to extract the value from. We compute the bitwise and of the number and 1 << i:

Assume the number is 10001011 and i = 2.

1<<i will be equal to 00000100

  10001011
& 00000100
==========
  00000000

if i = 3 then:

  10001011
& 00001000
==========
  00001000

Basically, the result is a number whose ith bit equals the ith bit of the original number, with all other bits zero. The result is therefore either zero, which means the ith bit of the original number was 0, or nonzero, which means the ith bit of the original number was 1.
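
The loop described above can be sketched as a self-contained function. This is an illustration under stated assumptions, not the answer's original code: float_to_bit_array is a hypothetical name, float is assumed to be 32 bits wide, and the bytes are copied with std::memcpy into an unsigned integer so that the shift by 31 stays well defined:

```cpp
#include <array>
#include <climits>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: one int per bit, index 0 holding the least significant bit.
std::array<int, 32> float_to_bit_array(float value)
{
    static_assert(sizeof(float) * CHAR_BIT == 32, "this sketch assumes a 32-bit float");

    std::uint32_t raw = 0;
    std::memcpy(&raw, &value, sizeof raw);  // copy the bytes of the float

    std::array<int, 32> bits{};
    for (std::size_t i = 0; i < bits.size(); ++i) {
        // Mask with an unsigned 1 shifted to position i; nonzero means bit i is set.
        bits[i] = (raw & (std::uint32_t{1} << i)) != 0 ? 1 : 0;
    }
    return bits;
}
```

With this ordering, bits[31] is the sign bit and bits[0] is the lowest bit of the mantissa.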

edited Jan 23 '09 at 21:44

answered Jan 23 '09 at 18:47

Mehrdad Afshari

That's not what he wants: The binary representation must be an array of size `sizeof(float) * CHAR_BIT` (-1) – Christoph Jan 23 '09 at 19:03
@Christoph: I doubt so. Look at the question. He says he wants a binary representation of the float in an int array. – Mehrdad Afshari Jan 23 '09 at 19:16
continued: To quote from the question: "So I could take a number, and end up with a int num[16] array with the binary value of the float: num[0] would be 1 num[1] would be 1 num[2] would be 0 num[3] would be 1 and so on..." – Mehrdad Afshari Jan 23 '09 at 19:17
He wants the int array to contain the bit pattern, ie one int for each bit - therefore, its size must be the number of bits in a float variable, ie 32 (he incorrectly assumed that a float value takes 16 bits...) – Christoph Jan 23 '09 at 19:23
Yes, what I am doing is seeing if two floats are the same without using comparison operators. So I will XOR each pair of bits (a bit from each float) and put the result in a bool variable. After comparing all the bits, if my bool value is still false, then the two numbers are the same. – user58389 Jan 23 '09 at 19:52
Also, I now see I need 32 bits, not 16. – user58389 Jan 23 '09 at 19:54
Do not assume there are 8 bits in a byte. Use CHAR_BIT. – Martin York Jan 23 '09 at 20:02
@unknown (yahoo): That's silly. It does not buy you anything. Assuming this is homework: put each float in an int and do an xor on the ints. – Martin York Jan 23 '09 at 20:08
I think the number of programmers left in the world who deal with CHAR_BIT as a necessity could be counted on one hand... (as of 2007 I am no longer part of that crowd) – user7116 Jan 23 '09 at 20:16
sixlettervariables. that's just silly... it's part of the language spec and it's the amount of bits in char. how about omitting the use of sizeof next... – Johannes Schaub - litb Jan 23 '09 at 20:19
I was pointing out the unhelpfulness of the CHAR_BIT side conversation to his original homework^W question... – user7116 Jan 23 '09 at 20:23
This isn't homework, one of my professors asked me if I could do it for fun. It would be very simple to just subtract one float from another, and if you do not have 0, then they are not the same. But I think this is simply an exercise in working with bits and binary logic. – user58389 Jan 23 '09 at 20:28
What's the kind of fun that professors ask for? It gives out some increase in GPA? – Mehrdad Afshari Jan 23 '09 at 20:30
Can you explain what is happening here: ((1 << i) & fl) != 0 ? 1 : 0; – user58389 Jan 23 '09 at 21:02
Mehrdad, any reason for using the pretty much deprecated C-style cast instead of the recommended `reinterpret_cast` here? There's pretty much consensus that C-style cast should never be used – especially not in a “textbook” example. – Konrad Rudolph Jan 23 '09 at 21:10
@Konrad, It's shorter :) The sole purpose of my answer was the line in the for loop. I didn't want to clutter up the answer with unnecessary best practices. – Mehrdad Afshari Jan 23 '09 at 21:37
Thank you Mehrdad Afshari! You have been a great help. – user58389 Jan 23 '09 at 21:50

score 6 · Answer 3 · answered Mar 19 '14 at 00:49

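Another approach, using the STL:

#include <bitset>
#include <climits>
#include <cstdint>
#include <cstring>
#include <iostream>

using namespace std;
int main()
{
    float f=4.5f;
    // Copy the bytes into a 32-bit unsigned integer. Casting &f to
    // long unsigned int* would read 8 bytes where long is 64 bits wide.
    uint32_t raw=0;
    memcpy(&raw, &f, sizeof f);
    cout<<bitset<sizeof f*CHAR_BIT>(raw)<<endl;
    return 0;
}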

answered Mar 19 '14 at 00:49

test30

score 2 · Answer 4 · answered Jan 23 '09 at 18:49

If you need a particular floating point representation, you'll have to build that up semantically from the float itself, not by bit-copying.

The c0x standard (http://c0x.coding-guidelines.com/5.2.4.2.2.html) doesn't define the format of floating point numbers.
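
One way to read "build that up semantically" is to ask the math library for the sign, exponent and significand instead of copying bytes. The following is a rough sketch for finite values only; FloatParts and decompose are hypothetical names, and it assumes the significand fits in 32 bits:

```cpp
#include <cmath>
#include <cstdint>
#include <limits>

// Hypothetical type and function names, used only for this illustration.
struct FloatParts {
    bool negative;              // true when the sign bit is set
    int exponent;               // |value| == significand * 2^(exponent - digits)
    std::uint32_t significand;  // integer significand, at most `digits` bits wide
};

FloatParts decompose(float value)
{
    const int digits = std::numeric_limits<float>::digits;  // 24 for IEEE 754 single precision
    int exponent = 0;
    // frexp splits |value| into fraction * 2^exponent with fraction in [0.5, 1), or 0 for zero.
    const float fraction = std::frexp(std::fabs(value), &exponent);
    // Scaling the fraction by 2^digits turns it into an exact integer.
    const std::uint32_t significand =
        static_cast<std::uint32_t>(std::ldexp(fraction, digits));
    return FloatParts{std::signbit(value), exponent, significand};
}
```

On an implementation with a 24-bit significand, decompose(4.5f) yields a positive sign, exponent 3 and significand 9437184 (0x900000), since 4.5 = 0.5625 * 2^3. From these three parts any target representation can be assembled by hand.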

answered Jan 23 '09 at 18:49

Douglas Leeder

codekaizen · Answer 5 · 2017-06-15T18:50:04.697

2

Can you just read the binary in the memory that the float variable?

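Yes. Cast a pointer to it to an int pointer and read the bits from the result. Note that static_cast alone cannot convert a float* to an int*; it takes reinterpret_cast, or two static_casts going through void*. On implementations where float is IEEE 754 single precision, it is 32 bits.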

edited Jun 15 '17 at 18:50

answered Jan 23 '09 at 18:50

codekaizen

Johannes Schaub - litb · Answer 6 · 2009-01-23T19:57:13.443

You can use an unsigned char to read the float byte by byte into the integer array:

unsigned int bits[sizeof (float) * CHAR_BIT];
unsigned char const *c = static_cast<unsigned char const*>(
    static_cast<void const*>(&my_float)
);

for(size_t i = 0; i < sizeof(float) * CHAR_BIT; i++) {
    int bitnr = i % CHAR_BIT;
    bits[i] = (*c >> bitnr) & 1;
    if(bitnr == CHAR_BIT-1)
        c++;
}

// the bits are now stored in "bits". one bit in one integer.

By the way, if you just want to compare the bits (as you comment on another answer) use memcmp:

memcmp(&float1, &float2, sizeof (float));
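
memcmp returns 0 when the two byte sequences are identical, so the comparison can be wrapped as below. A small sketch; same_bits is a hypothetical name:

```cpp
#include <cstring>

// Hypothetical helper: true when both floats have identical object representations.
bool same_bits(float a, float b)
{
    return std::memcmp(&a, &b, sizeof(float)) == 0;
}
```

Note that this is not the same test as ==. On IEEE 754 implementations, 0.0f and -0.0f compare equal with == but have different bit patterns, and a NaN never compares equal to itself with == even though its bytes match.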

score 2 · Answer 7 · edited May 23 '17 at 11:46

Looking at the comments on this answer (Floating Point to Binary Value(C++)), the reason to do this is to perform a bitwise comparison of two values.

#include <iostream>

int main()
{
    union Flip
    {
         float input;   // assumes sizeof(float) == sizeof(int)
         int   output;
    };

    Flip    data1;
    Flip    data2;
    Flip    data3;

    data1.input = 2.25125;
    data2.input = 2.25126;
    data3.input = 2.25125;

    // A nonzero XOR means the two bit patterns differ.
    bool    test12  = data1.output ^ data2.output;
    bool    test13  = data1.output ^ data3.output;
    bool    test23  = data2.output ^ data3.output;

    std::cout << "T1(" << test12 << ") T2(" << test13 << ") T3(" << test23 << ")\n";
}
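
A variant of the same XOR idea that stays away from the union, as a sketch: copy each float into an unsigned integer with std::memcpy and XOR the integers. The names raw_bits and differing_bits are made up here, and a 32-bit float is assumed:

```cpp
#include <bitset>
#include <climits>
#include <cstdint>
#include <cstring>

// Hypothetical helper: the bytes of a float copied into a 32-bit unsigned integer.
static std::uint32_t raw_bits(float value)
{
    static_assert(sizeof(float) * CHAR_BIT == 32, "this sketch assumes a 32-bit float");
    std::uint32_t raw = 0;
    std::memcpy(&raw, &value, sizeof raw);
    return raw;
}

// Hypothetical helper: a set bit marks a position where the two patterns differ.
std::bitset<32> differing_bits(float a, float b)
{
    return std::bitset<32>(raw_bits(a) ^ raw_bits(b));
}
```

differing_bits(a, b).none() is true when the two bit patterns match, and count() reports how many bit positions differ.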

score 1 · Answer 8 · answered Jan 23 '09 at 18:48

Cast the int pointer to a float pointer, and you're done.

(Though I wouldn't declare it as an int array. I'd use void* to make it clear that the memory is being used as a dumping ground for other values.)

Incidentally, why don't you just use an array of floats?

score 1 · Answer 9 · answered Jan 23 '09 at 18:55

Create a union of float and unsigned long. Set the value of the float member and iterate over the bits of the unsigned long value as already described in other answers.

This will eliminate the cast operators.

answered Jan 23 '09 at 18:55

Noel Walters

score 1 · Answer 10 · answered Mar 04 '20 at 08:51

You can do it by casting pointers as well. Here is a small example:

#include <iostream>
#include <bitset>

using namespace std;

int main(){
    float f = 0.3f;
    int* p = (int*)&f;
    bitset<32> bits(*p);
    cout << bits << endl;
}

answered Mar 04 '20 at 08:51

nmd_07

Jeremy Trifilo · Answer 11 · 2012-12-31T17:18:14.590

I don't believe C++ has any truly safe way to store floats that is portable between machines, efficient, and compact all at once.

The approach below is very accurate, but it won't support extreme values. You can have up to 7 digits in any position, but you can't exceed 7 digits on either side of the decimal point. Exceeding the limit on the left gives inaccurate results; exceeding it on the right causes an error at read time. To avoid that error you can either throw during the write or apply "buffer[idx++] & 0x7" on the read to keep the index within the bounds 0 to 7. Keep in mind that "& 0x7" only works because 7 is a power of 2 minus one (2^3 - 1). You can only do that with such values, e.g. 0, 1, 3, 7, 15, 31, 63, 127, 255, 511, 1023, etc.

So it's up to you whether you want to use this. I felt it was a safe way to store most values you'll ever need. The example below shows how the value is converted into a 4-byte array; in C++ this would be a char*. If you don't want to perform division, you can convert the POWERS_OF_TEN array into a secondary array of decimal fractions and multiply instead.

const float CacheReader::POWERS_OF_TEN[] = 
{
    1.0F, 10.0F, 100.0F, 1000.0F, 10000.0F, 100000.0F, 1000000.0F, 10000000.0F
};

float CacheReader::readFloat(void)
{
    int flags = readUnsignedByte();
    int value = readUnsignedTriByte();
    if (flags & 0x1)
        value = -value;
    return value / POWERS_OF_TEN[(flags >> 1) & 0x7];
}

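unsigned __int32 CacheReader::readUnsignedTriByte(void)
{
    // Read the bytes in separate statements: the evaluation order of the
    // operands of | is unspecified, so one expression could read them out of order.
    unsigned __int32 high = readUnsignedByte();
    unsigned __int32 mid  = readUnsignedByte();
    unsigned __int32 low  = readUnsignedByte();
    return (high << 16) | (mid << 8) | low;
}

unsigned __int8 CacheReader::readUnsignedByte(void)
{
    // Advance the read position so that consecutive calls return consecutive bytes.
    return buffer[reader_position++] & 0xFF;
}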

void CacheReader::writeFloat(float data)
{
    int exponent = -1;
    float ceiling = 0.0F;

    for ( ; ++exponent < 8; )
    {
        ceiling = (POWERS_OF_TEN[exponent] * data);
        if (ceiling == (int)ceiling)
            break;
    }

    exponent = exponent << 0x1;
    int ceil = (int)ceiling;
    if (ceil < 0)
    {
        exponent |= 0x1;
        ceil = -ceil;
    }
    buffer[writer_position++] = (signed __int16)(exponent);
    buffer[writer_position++] = (signed __int16)(ceil >> 16);
    buffer[writer_position++] = (signed __int16)(ceil >> 8);
    buffer[writer_position++] = (signed __int16)(ceil);
}
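
To make the layout easier to follow, here is a simplified, self-contained sketch of the same idea: one flag byte (bit 0 for the sign, bits 1 to 3 for the decimal exponent) followed by a 24-bit magnitude. It is not the CacheReader class from above; encodeDecimal, decodeDecimal and kPowersOfTen are hypothetical names, and the buffer is a plain 4-byte array:

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical names; this is not the CacheReader class from the answer.
static const float kPowersOfTen[8] = {
    1.0f, 10.0f, 100.0f, 1000.0f, 10000.0f, 100000.0f, 1000000.0f, 10000000.0f
};

// Layout: out[0] = flags (bit 0: sign, bits 1-3: decimal exponent),
//         out[1..3] = 24-bit magnitude, most significant byte first.
// Returns false when no exponent in 0..7 gives a whole number below 2^24.
bool encodeDecimal(float value, std::uint8_t out[4])
{
    const float absolute = std::fabs(value);
    for (int exponent = 0; exponent < 8; ++exponent) {
        const float scaled = absolute * kPowersOfTen[exponent];
        if (!(scaled < 16777216.0f))  // too large for 24 bits, or NaN
            return false;
        const std::uint32_t magnitude = static_cast<std::uint32_t>(scaled);
        if (static_cast<float>(magnitude) != scaled)
            continue;  // a fractional part remains, try the next power of ten
        out[0] = static_cast<std::uint8_t>((exponent << 1) | (std::signbit(value) ? 1 : 0));
        out[1] = static_cast<std::uint8_t>((magnitude >> 16) & 0xFF);
        out[2] = static_cast<std::uint8_t>((magnitude >> 8) & 0xFF);
        out[3] = static_cast<std::uint8_t>(magnitude & 0xFF);
        return true;
    }
    return false;
}

float decodeDecimal(const std::uint8_t in[4])
{
    const std::uint32_t magnitude = (static_cast<std::uint32_t>(in[1]) << 16)
                                  | (static_cast<std::uint32_t>(in[2]) << 8)
                                  |  static_cast<std::uint32_t>(in[3]);
    // The mask keeps the table index within 0..7 whatever the flag byte holds.
    const float value = static_cast<float>(magnitude) / kPowersOfTen[(in[0] >> 1) & 0x7];
    return (in[0] & 0x1) ? -value : value;
}
```

Unlike the class above, this sketch reports failure instead of writing a value it cannot represent, which is one way to handle the out-of-range cases described earlier.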

score 0 · Answer 12 · answered Sep 28 '20 at 15:17

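Here's my solution that doesn't give any warnings:

#include <cstdint>

std::int32_t floatToIntBits(float f)
{
    // Read the bytes as unsigned char so that no shift operates on a negative value.
    const unsigned char * c = (const unsigned char*)&f;
    std::uint32_t i = 0;
    // Assumes a little-endian machine: c[0] is the least significant byte.
    i |= (std::uint32_t)c[3] << 24;
    i |= (std::uint32_t)c[2] << 16;
    i |= (std::uint32_t)c[1] << 8;
    i |= (std::uint32_t)c[0];
    return (std::int32_t)i;
}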

score -1 · Answer 13 · answered Jul 16 '10 at 18:52

Easiest way:

float myfloat;
file.read((char*)(&myfloat),sizeof(float));

answered Jul 16 '10 at 18:52

Alexander Rafferty
