Obtaining bit representation of a float in C

Question

I'm trying to use unions to obtain the bit representation of float values. My code is currently as follows:

union ufloat {
  float f;
  unsigned u;
};

int main( ) {       

   union ufloat u1;
   u1.f = 3.14159f;
   printf("u1.u : %f\n", u1.u);

However, anything I try to print gets printed as 0.0000000 instead of as bits (such as 0001 0110, or something similar). What is wrong in my code?

Note that preferably I would like to use unions to achieve this.

There is no `printf` format specifier that will print out the binary representation of any value. You'll need to extract each bit and print those out. — kaylum, Jun 17 '17 at 21:56
You're invoking **undefined behaviour** - passing an `unsigned` to `printf` to correspond to `%f`. — Oliver Charlesworth, Jun 17 '17 at 21:57
You'd need to use a hex format. Either `%X` for the integer, or perhaps `%A` for `double` (you can't pass a `float` to `printf()`; they're converted to `double` automatically). — Jonathan Leffler, Jun 17 '17 at 22:01
use `printf("%llx", (unsigned long long) u1.f)`, to see the hexadecimal representation if you're using C99 and above. — Bite Bytes, Jun 17 '17 at 22:07
@BiteBytes that casts the float instead of reinterpreting it — harold, Jun 17 '17 at 22:30
Also [How to display the encoding of a floating-point value](https://stackoverflow.com/q/17664523/2410359) is useful. — chux - Reinstate Monica, Jun 17 '17 at 22:42
@harold, indeed, what about this: `printf("%lx", *(unsigned long*) &u1.f)`, and hope that `long` and `float` have the same size. — Bite Bytes, Jun 17 '17 at 22:49
@BiteBytes: Even worse: you violate the effective type (aka strict aliasing) rule! I don't see why you insist on a cast at all! — too honest for this site, Jun 18 '17 at 00:25
Before anything else, `_Static_assert` `sizeof(float) == sizeof(unsigned)`! Better use `uint32_t`, as `float` is most likely 32 bits (still use the assertion to make sure!) — too honest for this site, Jun 18 '17 at 00:27
there may be no standard printf format specifiers, but some C compilers did have a %b that would print binary (turbo or borland, one or both if I remember right, perhaps others). — old_timer, Jun 18 '17 at 01:21
@old_timer: Not those again, please! Do you know any halfway modern standard library (which is not necessarily part of the compiler) which supports it? (not that I don't agree `printf` etc. need a major overhaul anyway) — too honest for this site, Jun 18 '17 at 10:09
@Olaf in no way am I implying that there was anything standard that did, but the words "there is no printf format specifier" were incorrect. Had they said there is no .... "in any C library standard" that ... that would have been correct. — old_timer, Jun 18 '17 at 12:00

David C. Rankin · Accepted Answer · 2017-06-19T21:00:51.610

There are many ways to accomplish this. What you are really trying to do is output the bits in memory that make up a float, which virtually all x86 implementations store in IEEE-754 single-precision floating-point format. On x86 that is 32 bits of data. That is what allows a 'peek' at the bits when the float is reinterpreted as unsigned (both are 32 bits, and bit operations are defined for the unsigned type). For implementations other than x86, or even on x86 itself, a better choice than unsigned is the exact-width type uint32_t provided by stdint.h. There can be no ambiguity in size that way.

Now, a pointer cast itself isn't technically the problem; it is the access of the value through dereferencing the different type (a.k.a. type punning) where you run afoul of the strict-aliasing rule (Section 6.5 (7) of the C11 Standard). The union of the float and uint32_t types gives you a valid way of looking at the float bits through an unsigned type window. (You are looking at the same bits either way; it's just how you access them and tell the compiler how they should be interpreted.)
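As a minimal sketch of the union approach described above, assuming a C11 compiler and a 32-bit `float` (the helper name `float_bits` is hypothetical, not part of the answer), the size assumption can be checked at compile time before the bits are read back through the other member:

```c
#include <stdint.h>

/* Assumption: float is exactly 32 bits wide; compilation fails otherwise. */
_Static_assert(sizeof(float) == sizeof(uint32_t),
               "float must be exactly 32 bits for this sketch");

/* Hypothetical helper: return the object representation of f as an integer. */
uint32_t float_bits(float f)
{
    union {
        float    f;
        uint32_t u;
    } fu = { .f = f };   /* store through the float member ... */

    return fu.u;         /* ... and read the same bytes back as uint32_t */
}
```

On an IEEE-754 implementation where `float` and `uint32_t` share the same byte order, `float_bits(3.14159f)` should yield `0x40490FD0`, which can be printed with `printf("0x%08" PRIX32 "\n", ...)` from `<inttypes.h>`.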

That said, you can glean good information from all of the answers here. You can write functions to access and store the bit representation of the float values in a string for later use, or output the bit values to the screen. As an exercise in playing with floating-point values a year or so back, I wrote a little function to output the bits in an annotated way that allows easy identification of the sign, normalized exponent, and mantissa. You can adapt it, or a routine from another answer, to handle your needs. The short example is:

#include <stdio.h>
#include <stdint.h>
#include <limits.h> /* for CHAR_BIT */

/** formatted output of ieee-754 representation of float */
void show_ieee754 (float f)
{
    union {
        float f;
        uint32_t u;
    } fu = { .f = f };
    int i = sizeof f * CHAR_BIT;

    printf ("  ");
    while (i--)
        printf ("%d ", (fu.u >> i) & 0x1);

    putchar ('\n');
    printf (" |- - - - - - - - - - - - - - - - - - - - - - "
            "- - - - - - - - - -|\n");
    printf (" |s|      exp      |                  mantissa"
            "                   |\n\n");
}

int main (void) {

    float f = 3.14159f;

    printf ("\nIEEE-754 Single-Precision representation of: %f\n\n", f);
    show_ieee754 (f);

    return 0;
}

Example Use/Output

$ ./bin/floatbits

IEEE-754 Single-Precision representation of: 3.141590

  0 1 0 0 0 0 0 0 0 1 0 0 1 0 0 1 0 0 0 0 1 1 1 1 1 1 0 1 0 0 0 0
 |- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -|
 |s|      exp      |                  mantissa                   |

Look things over and let me know if you have any questions.

Yes, agreed, the exact type of `uint32_t` would be a better choice for the union. (updated) — David C. Rankin, Jun 19 '17 at 20:55
A useful prevention from use on unicorn platforms that break code: `assert(sizeof fu.f == sizeof fu.u);` Only corner I see remaining is the endian of `float` and `uint32_t` may rarely differ rendering output in the opposite order than expected even with [binary32](https://en.wikipedia.org/wiki/Single-precision_floating-point_format) — chux - Reinstate Monica, Jun 21 '17 at 15:59
I guess we could add a check at the beginning of `uint32_t ui = 1U << 24; if (*(char *)&ui) {..handle big-endian..}` in addition to the `assert(sizeof fu.f == sizeof fu.u);` check. We'll leave those as a comment for anyone who may find themselves riding such a beast. — David C. Rankin, Jun 21 '17 at 18:42
`1U << 24` is UB on those pesky 16-unsigned machines - common in embedded land. Suggest other endian detection code - maybe [this](https://stackoverflow.com/q/25391730/2410359)? — chux - Reinstate Monica, Jun 21 '17 at 18:50
Darn unicorns - you are right once again! That's a pretty slick use of `sizeof int` as a defined constant, but the compiler won't argue with it. That will indeed handle the 16-bit boxes (and any box as long as an `int` is an even multiple of what a `char` is -- find a box where that isn't true and you indeed have a rare unicorn...) or maybe even `1U << (sizeof (int) - 1) * CHAR_BIT` — David C. Rankin, Jun 22 '17 at 13:10
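The byte-order concern raised in these comments can be sketched without shifting a plain `unsigned` at all. The following is only an illustration, assuming 8-bit bytes and an IEEE-754 binary32 `float`; both helper names are hypothetical. The first function probes the byte order of `uint32_t`, and the second checks whether `float` and `uint32_t` agree on byte order by comparing the bits of `1.0f` with the known binary32 pattern `0x3F800000`:

```c
#include <stdint.h>

/* Hypothetical helper: 1 if the least significant byte of a uint32_t is
   stored at the lowest address (little-endian), 0 otherwise.
   Assumption: bytes are 8 bits wide. */
int u32_is_little_endian(void)
{
    union {
        uint32_t      u;
        unsigned char bytes[sizeof(uint32_t)];
    } probe = { .u = UINT32_C(0x01020304) };

    return probe.bytes[0] == 0x04;
}

/* Hypothetical helper: 1 if reading a float through a uint32_t gives the
   expected binary32 pattern, i.e. both types use the same byte order.
   Assumption: float is IEEE-754 binary32. */
int float_order_matches_u32(void)
{
    union {
        float    f;
        uint32_t u;
    } fu = { .f = 1.0f };

    return fu.u == UINT32_C(0x3F800000);
}
```

If `float_order_matches_u32()` returns 0, the bits printed by a union-based routine would come out in an unexpected order and would need to be rearranged first.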

Clifford · Answer 2 · score 1 · 2017-06-18T11:13:17.583

There is no format specifier for binary output; generally hexadecimal (base 16) is used for convenience because a single hex digit represents exactly 4 binary digits. There is a format specifier for hexadecimal (%x or %X).

printf( "u1.u : %4X\n", u1.u ) ;

Alternatively you can generate a binary string representation with itoa() (non-standard, but commonly implemented function).

#include <limits.h>
#include <stdlib.h>
#include <stdio.h>

...

char b[sizeof(float) * CHAR_BIT + 1] = "" ;
printf( "u1.u : %s\n", itoa( u1.u, b, 2 ) ) ;

The problem with this is that it does not include the leading zeroes, and in the binary floating point representation all bits are significant. It is possible to deal with that, but somewhat cumbersome:

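#define BITS (sizeof(float) * CHAR_BIT)     /* no trailing ';' in a macro */

char b[BITS + 1] = "" ;                     /* +1 for the terminating NUL */
itoa( u1.u, b, 2 ) ;
printf( "u1.u : " ) ;
for( size_t i = strlen(b); i < BITS; i++ )  /* strlen() requires <string.h> */
{
    putchar( '0' ) ;                        /* pad with the missing leading zeroes */
}
printf( "%s\n", b ) ;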

Note that the above examples make the same implicit assumption as the original question: that unsigned is at least as large as a float and uses the same byte ordering (older ARM devices, for example, use a "cross-endian" floating-point format!). I have made no attempt at portability in that respect. Ultimately, if all you want to do is inspect the memory layout of a float, inspection in a debugger is perhaps the simplest approach and the one least dependent on the compiler implementation.
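For comparison, a plain loop avoids the non-standard `itoa()` and keeps the leading zeroes without any padding step. This is a minimal sketch rather than part of the answer above; the helper name `u32_to_binary` and the macro `U32_BITS` are hypothetical, and it assumes the float's bits have already been obtained as a `uint32_t` (for example from the union member):

```c
#include <limits.h>   /* CHAR_BIT */
#include <stddef.h>   /* size_t */
#include <stdint.h>

#define U32_BITS (sizeof(uint32_t) * CHAR_BIT)

/* Hypothetical helper: write every bit of value into out, most significant
   bit first, including leading zeroes. out must hold U32_BITS + 1 chars. */
void u32_to_binary(uint32_t value, char out[U32_BITS + 1])
{
    for (size_t i = 0; i < U32_BITS; i++) {
        uint32_t mask = UINT32_C(1) << (U32_BITS - 1 - i);
        out[i] = (value & mask) ? '1' : '0';
    }
    out[U32_BITS] = '\0';
}
```

Passing `0x40490FD0` (the bits of `3.14159f` on a typical IEEE-754 machine) fills the buffer with `01000000010010010000111111010000`.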

edited Jun 18 '17 at 11:13

answered Jun 17 '17 at 22:37


I think that this `unsigned u = *((unsigned*)&f) ;` is undefined behavior. On the contrary, I read that using a union is not undefined behavior if you use `char`. – Stargateur Jun 17 '17 at 22:39
@Stargateur : Strictly speaking it is undefined, because there is no requirement for pointers of different types to be convertible between each other. But in practice on most platforms for most types this is not a problem, and since this is perhaps experimental code, portability is perhaps not an issue. You can use the union method in any case, since it was only your output method that was flawed - just use your u1.u where I have used u. – Clifford Jun 17 '17 at 22:44
this assumes that `unsigned`, which is an `int`, has the same size as `float`. – Bite Bytes Jun 17 '17 at 22:50
@BiteBytes : As does the code in the original question - that can be dealt with, but is perhaps a different question. – Clifford Jun 17 '17 at 22:52
Type punning via pointer invokes undefined behaviour. Only a `union` is safe here. Why use a hack if there is a compliant way (if some assertions are added and we ignore the output is implementation defined). And `char b[BITS]` will not work if there are no VLAs (I hate it, but C11 made them optional). `BITS` is not a constant. – too honest for this site Jun 18 '17 at 00:29
@BiteBytes: `unsigned` is **not** an `int`! Both are _integer types_, though. – too honest for this site Jun 18 '17 at 00:32
A union is not safe here either, just happens to work, as do other methods, there is no "safe" solution in C within memory. You can for example write to a file then read it back, doing all the expected size checks in and out. Most folks just use a union and get away with it, but it is undefined behavior territory. – old_timer Jun 18 '17 at 01:20
@old_timer you're absolutely incorrect here, the union access is very well defined in C (you might be thinking some obsoleted version of the standard...) – Antti Haapala -- Слава Україні Jun 18 '17 at 05:06
Since it is contentious and not really part of the answer, I have removed the suggestion to use pointer casts and any comment on the method used . – Clifford Jun 18 '17 at 05:42
I still think `BITS` should be a macro. The variable does not really make sense. Anyway, as I advocate for VLAs and consider a compiler not supporting it incomplete, I can live with it. I still don't see the advantage using `itoa` here. A simple loop would be faster and not use more code. Also the answer should include a warning about sizes, encoding and endianness. – too honest for this site Jun 18 '17 at 10:06
@Olaf : I generally use C++, so have lost the macro habit. In C++ the const declaration may be used to size an array w/o VLA (which is not a thing in C++). Changing it now. – Clifford Jun 18 '17 at 10:35
I know `const` has a different meaning in C++. That's one reason they are different languages. Interestingly, it works in both, just because the code uses **two** such differences: `const` and VLAs. – too honest for this site Jun 18 '17 at 10:43
@Olaf : Sorry I am not trying to teach you something you already know; just explaining my "error" - for the benefit of other readers. I agree with you entirely. I would not rely on VLA either, for three reasons (related to the constraints of the systems I generally develop): compiler support, lack of C++ compatibility, unpredictable stack usage. – Clifford Jun 18 '17 at 10:55
@Olaf : The use of `itoa()` is from my assumption that this is throwaway experimental code rather than for any production purpose (which would be hard to imagine), and itoa() requires no thought or debugging. It is in no way a recommendation - the point is merely that if you specifically want binary output you have to code it one way or another, and this is the most succinct illustration I could imagine. Had I provided an implementation, that would no doubt have attracted other critiques of the form "I have a better way", but that is hardly relevant to the answer in this case. – Clifford Jun 18 '17 at 11:01
@Olaf : The final example it has to be said is particularly "lazy" - has a `strlen()` and an `itoa()`. I am still minded to leave any "improvement" to the reader as I think it may obfuscate the answer. – Clifford Jun 18 '17 at 11:09
@Clifford: I disagree about trying to write C code which is C++ compatible. I didn't understand it as tutoring me, all fine:-). Wrt writing compatible code: There are way more subtle differences and such code would not be good code in either language. A compiler not supporting VLAs is typically broken, as all modern compilers support C99, where VLAs are mandatory (C11 made them optional - which is a drastic deviation from their agenda: compatibility up to the ridiculous). If you have a very small/special arch like PIC, you have to write special code anyway, no harm done … – too honest for this site Jun 18 '17 at 11:25
… on embedded one should always keep the hardware in mind. One should have more than a hammer in the toolbox; not every problem is a nail. – too honest for this site Jun 18 '17 at 11:26
"Had I provided an implementation …" - That's why we are not a coding service. But yes, I see your point. And I don't recommend (actually I think it would be not a good idea) to add checking code; just some warnings in the text; From the question, OP is clearly quite clueless about such vital aspects of the C language (and C++, too). – too honest for this site Jun 18 '17 at 11:29
@Olaf : You need not agree or disagree; it was merely a reason why I personally avoid VLA - I almost exclusively code for embedded systems in C++, and when I do write C code generally keep within the subset that is valid C++. Indeed not everything is a nail - that's why I use C++; it was a bigger toolbox long before C99 and C11. The discussion is however largely off topic. – Clifford Jun 19 '17 at 22:42

chux - Reinstate Monica · Answer 3 · score 1 · 2017-06-18T00:55:22.953

To convert any variable/object to a string that encodes the binary, see how to print memory bits in c

print ... as bits (such as 0001 0110, or something similar),

Something similar: Use "%a" to print a float, converted to a double, showing its significand in hexadecimal and its exponent as a decimal power of 2 (as suggested by @Jonathan Leffler).

printf("%a\n", 3.14159f);
// sample output
0x1.921fap+1
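To see how the `%a` output relates to the raw bits, the three IEEE-754 binary32 fields can be pulled apart with shifts and masks. The sketch below is an illustration only, assuming a 32-bit binary32 `float` with the same byte order as `uint32_t`; the names `f32_fields` and `split_f32` are hypothetical:

```c
#include <stdint.h>

/* Hypothetical container for the three binary32 fields. */
struct f32_fields {
    unsigned sign;       /* 1 bit */
    unsigned exponent;   /* 8 bits, biased by 127 */
    uint32_t fraction;   /* 23 bits; the implicit leading 1 is not stored */
};

/* Hypothetical helper: split the bits of f into sign, exponent, fraction. */
struct f32_fields split_f32(float f)
{
    union {
        float    f;
        uint32_t u;
    } fu = { .f = f };
    struct f32_fields r;

    r.sign     = (unsigned)(fu.u >> 31);
    r.exponent = (unsigned)((fu.u >> 23) & 0xFFu);
    r.fraction = fu.u & UINT32_C(0x7FFFFF);
    return r;
}
```

For `3.14159f` (bits `0x40490FD0`) this gives sign 0, biased exponent 128 and fraction `0x490FD0`. Subtracting the bias of 127 gives the `p+1` part of the `%a` output, and shifting the 23-bit fraction left by one bit to fill six hex digits gives `0x921FA0`, which corresponds to the digits `921fa` after the `0x1.` prefix.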

edited Jun 18 '17 at 00:55

answered Jun 17 '17 at 22:47


OP wants the binary representation of a `float`, not the floating point in hex format. – too honest for this site Jun 18 '17 at 00:33
OP expressly said "or something similar", so the alternative, especially since `"%a"` is part of the C standard library, deserves consideration. Commonly binary representation is compressed into hexadecimal as it is trivial to discern one from the other. – chux - Reinstate Monica Jun 18 '17 at 00:54
Printing the mantissa and exponent is not similar to a binary representation. A hex output of the bit-pattern would be, though. – too honest for this site Jun 18 '17 at 12:38

score 1 · Answer 4 · answered Jun 17 '17 at 22:55

You could write a simple print_bits-function and use an array of unsigned characters to read out the "raw memory representation" of a float:

#include <stdio.h>

void print_bits(unsigned char x)
{
    int i;
    for (i = 8 * sizeof(x) - 1; i >= 0; i--) {
        (x & (1 << i)) ? putchar('1') : putchar('0');
    }
}

typedef float ftype;

union ufloat {
    ftype f;
    unsigned char bytes[sizeof(ftype)];
};

int main( ) {
    union ufloat u1;
    u1.f = .1234;

    for (int i=0; i<sizeof(ftype); i++) {
        unsigned char b = u1.bytes[i];
        print_bits(b);putchar('-');
    }
    return 0;
}

Not sure if the union is actually required (I suppose you introduced this because of alignment issues and UB); when using an array of unsigned char, alignment should not be an issue.
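On the question of whether the union is needed at all: C allows any object to be inspected through a pointer to `unsigned char`, so the bytes can also be read directly. A minimal sketch under that rule (the helper name `float_bytes` is hypothetical, and the bytes come out in memory order, which depends on the platform's endianness):

```c
#include <stddef.h>

/* Hypothetical helper: copy the object representation of f into out,
   one byte at a time, lowest address first. */
void float_bytes(float f, unsigned char out[sizeof(float)])
{
    /* Reading an object through unsigned char * is permitted by the
       aliasing rules, so no union is required here. */
    const unsigned char *p = (const unsigned char *)&f;

    for (size_t i = 0; i < sizeof f; i++)
        out[i] = p[i];
}
```

Each element of `out` can then be handed to `print_bits()` exactly as the `bytes` member is in the example above.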

`unsigned char x .... i = 8 * sizeof(x) - 1` is always `i = 7`. Could use `i = CHAR_BIT - 1`. [ref](https://stackoverflow.com/q/3200954/2410359). — chux - Reinstate Monica, Jun 17 '17 at 23:12

old_timer · Answer 5 · score 0 · 2017-06-18T14:09:24.393

#include <stdio.h>

union
{
    float f;
    unsigned int u;
} myun;

int main ( void )
{
    unsigned int ra;

    printf("%p %zu\n",(void *)&myun.f,sizeof(myun.f));
    printf("%p %zu\n",(void *)&myun.u,sizeof(myun.u));
    myun.f=3.14159F;
    printf("0x%08X\n",myun.u);
    for(ra=0x80000000;ra;ra>>=1)
    {
        if(ra&myun.u) printf("1"); else printf("0");
    }
    printf("\n");

    for(ra=0x80000000;ra;ra>>=1)
    {
        if(ra==0x40000000) printf(" ");
        if(ra==0x00400000) printf(" ");
        if(ra&myun.u) printf("1"); else printf("0");
    }
    printf("\n");

    return(0);
}

0x601044 4
0x601044 4
0x40490FD0
01000000010010010000111111010000
0 10000000 10010010000111111010000

edited Jun 18 '17 at 14:09

answered Jun 18 '17 at 02:15


`soft-float` is not the default for ARM-whatever-whatever. It depends on whether an FPU is available. A lot of ARM MCUs today have an FPU (just that it only supports single precision), the larger ARMv7A definitively have hard-float (maybe with varying types). Not that the type of float processing makes any difference on ARM - all are based on IEEE754. – too honest for this site Jun 18 '17 at 09:53
"This one wanders into the implementation defined or undefined behavior …" - please elaborate. Type-punning through a `union` is explicitly allowed. – too honest for this site Jun 18 '17 at 09:55
Rather than compile and disassemble, I would suggest memory inspection in a debugger. – Clifford Jun 18 '17 at 10:41
@Olaf : ARM Cortex-M7 (ARMv7-M) can have a double precision FPU - so even on ARM MCUs full double-precision hardware support is now widely available. – Clifford Jun 18 '17 at 10:48
@Clifford: The CM7 is quite a larger iron than the CM4F (the CM3 is mostly deprecated and CM4 w/o `F` quite rare) with its multiple and 64 bit AXI bus interfaces, caches (typically) and target clock speed (>200MHz). I intentionally did not mention it. My point was `soft-float` should not be seen as default; whether the CPU supports `double` or not is not relevant. (As a sidenote, if I may: I really wonder why they made VLAs optional, but float is not - although at the time of release of C11, most code did not need/use them) – too honest for this site Jun 18 '17 at 11:35
fair enough the default config when building gcc then if you dont specify a processor (arm-whatever-gcc -O2 -c so.c -o so.o; arm-whatever-objdump -D so.o) you default to the common denominator. – old_timer Jun 18 '17 at 12:04
It was painfully obvious I didnt build for a cortex-m. – old_timer Jun 18 '17 at 12:04
@Olaf, just read the spec, any version. After however many years, decades, waiting for someone to show text that actually shows it is supported. I still cant find a version. You are welcome to post an answer and also include the text. And I will remove the comment. I provided a union solution, it shows the binary, and even goes so far as to isolate the bitfields. What the OP asked for. At the time I posted this the only one that had done that. Plus a few other methods, yet the downvote, interesting, but also expected, this is religion and politics topic. – old_timer Jun 18 '17 at 12:09
@Clifford, sure but you would need to make sure it is in memory not in a register, although one would hope with the debugger you could also examine registers, but you would need to know which by looking at the disassembly (or hope that the debugger helps you out there by looking at the variable). – old_timer Jun 18 '17 at 12:17
@old_timer: Read 6.5.2.3p3. Footnote 95 explicitly mentions type-punning. Note that it still is implementation defined due to encoding, sizes, etc. But not UB. – too honest for this site Jun 18 '17 at 12:24
@old_timer: If compiled for debugging (`-Og` on gcc), it should not matter where a variable is located. – too honest for this site Jun 18 '17 at 12:32
@olaf, which version? and implementation defined is enough to warn away from using something, very much in the "works on my machine" trap. – old_timer Jun 18 '17 at 12:32
@olaf, sure, but you have to build to be debugged, rather than for runtime. And that is all I was trying to point out. gotta know how to use your debugger and tools for that one to work, where a simple inspection of the code is far more reliable and less risky. – old_timer Jun 18 '17 at 12:33
@old_timer: There is only **one** C standard! And by default a normal compiler builds without optimisations, which is also fine. If you don't add debugging symbols, you did something wrong; we should expect normal practice and we are not a tutoring site teaching people how to build programs and debug. – too honest for this site Jun 18 '17 at 12:34
@Olaf There are a number of different versions of the standard, if there was never a reason to re-write them, then there would be only one C standard. The language pre-dates the standard so they coulda/shoulda got it right on the first try. Yet they didnt so there are differences. And as we all know you can specify the version on the command line (of some compilers) so you can fall into the different interpretations of the language. – old_timer Jun 18 '17 at 12:40
@old_timer: "There are a number of different versions of the standard" - none of them is **standard C**! Read the foreword of the current version. The 2nd and 1st release have explicitly been canceled with the release of the respective successor. C standard is ISO9899, the only valid version is ISO9899:2011. – too honest for this site Jun 18 '17 at 12:42
Because of the sheer number of times I have had to dig people out of the works on my machine hole, because they dont bother to understand the language or implementations, it is well worth tutoring. That or just keep getting paid to dig people out of these holes, its the fireman vs fire marshall thing, do you teach them not to or just put their house out when they catch it on fire. The initial difference is when that person damages the company you work for or not. – old_timer Jun 18 '17 at 12:42
@old_timer: For my experience, tutoring people who are not able to learn/do research the basics on their own is pointless. It's more like teaching to fish vs. giving a fish. And people burning their second house after they burnt their first are hopeless anyway. – too honest for this site Jun 18 '17 at 12:44
@olaf implementation defined is the primary area of risk, granted if you spend your career on one compiler on one platform, sure you are pretty safe, and that is why folks with 10-20 years experience and hardened habits are the worst at these violations, they wander off a desktop into a phone app or something else on a different target, same compiler or different compiler and then get educated and then struggle... – old_timer Jun 18 '17 at 12:45
Just about the worst thing you can do is tell someone here is a solution that works and is legal, when it doesnt always work. So disclaimers are required, particularly when teaching to fish. – old_timer Jun 18 '17 at 12:46
@old_timer: Oh, I use(d) enough different compilers to know in the last 30 years (including other languages and some self-written compilers). That's why I first check what the compiler claims to be compatible to. To be clear: that's not the only measure I take. You really should read this site's mission. We are a Q&A site, not a tutoring/teaching site! You hardly can cover all broken compilers which call themselves "C compiler", e.g. MSVC. And that's my last post about this subject. You have the last word. – too honest for this site Jun 18 '17 at 12:48
@Olaf You have five years on me then. And have seen what I have seen and then some. And dug people out of holes as I have. And should not allow others to lead them into holes as other posters continue to do, only to have them come back at some point in the future with a more serious problem because of the mentor they went with. You have seen the 16 bit to 32 bit to 64 bit which way overshadow the language nuances in compiler variances on the language. If this site is about leading folks into danger, then what good is that? – old_timer Jun 18 '17 at 12:55
@Olaf you have also then seen the countless "standard compliant" compilers that...were not...over those decades...And have that wisdom to share as well. – old_timer Jun 18 '17 at 12:56
@old_timer: Ok, I'll bite: I have all this. But I also left the 1980ies, 90ies and 2000s. Some things **have changed**. (and I also covered the 8 bitters - 30 years is only about compiler languages; there are some more in total). All this is **not** relevant here! If you want to tutor, open another blog, youtube video channel, etc. And yet you can never cover all aspects! – too honest for this site Jun 18 '17 at 12:58
