
My impression is that C float has 8 bits of exponent and 23 bits of mantissa.

So 1 should be 0011 1111 1000 0000 0000 0000 0000 0000 = 0x3F800000.

However, the following code produced 1.06535e+09 instead of 1. Can anyone help me understand why?

#include <iostream>
#include <math.h>  

using namespace std;

int main()
{
    float i = 0x3F800000;
    cout<<i << endl;
    return 0;
}
user1559897
  • I'm pretty sure you're not assigning that bit pattern to `i`, but that particular integer. – Eric J. Jun 21 '19 at 21:44
  • I'm trying to figure out why you think `0011 1111 1000 0000 0000 0000 0000 000` or `0x3F800000` should be equal to 1. Can you explain the thought process behind this? – Mooing Duck Jun 21 '19 at 21:44
  • Isn't that the standard IEEE representation? See here https://en.wikipedia.org/wiki/Single-precision_floating-point_format – user1559897 Jun 21 '19 at 21:45
  • Although I haven't used C in years, I always believed that the specific representation of `float` and `double` depends on the target architecture, not on the C standard. Is my impression incorrect? – Codor Jun 21 '19 at 21:48
  • @EricJ. Looks like it. It is casting the hex to unsigned int and casting the unsigned int to float. How do I directly assign hex to float? – user1559897 Jun 21 '19 at 21:50
  • Please note that 0x3f800000 is the representation of 1.0 only in IEEE 754 single precision. C can be used on systems that do not support IEEE 754 (though such systems are probably rare), and furthermore, there's no guarantee that the type `float` is 32 bits wide. You can at least use `uint32_t` from `<stdint.h>`. – Ray Toal Jun 21 '19 at 22:06
  • The question mentions C but shows C++ code. – Eric Postpischil Jun 21 '19 at 22:58

2 Answers


How is 1 coded in C as a float?

Can anyone help me understand why (code fails)?

float i = 0x3F800000;

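is the same as i = 1065353216; because the hexadecimal constant is an integer, and its value, not its bit pattern, is converted to float.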


In C, to overlay the bit pattern use a union or use memcpy().

In C++, use memcpy(), since reading a union member other than the one last written is undefined behavior there.

Reinterpreting through a pointer cast, such as *(float *)&u, risks failure because it violates the strict aliasing rule. @Eric Postpischil

#include <stdio.h>
#include <stdint.h>

_Static_assert(sizeof(float) == sizeof(uint32_t), "Unexpected types");

int main(void) {
  union {
    uint32_t u; 
    float f;
  } x = {.u = 0x3f800000};
  float f = x.f;
  printf("%e\n", f);
  return 0;
}

On less common systems, this can fail because:

  • float is not IEEE-754 binary32.

  • The byte order (endianness) of float differs from that of uint32_t.

chux - Reinstate Monica

Using IEEE-754, the floating point number 1 is written as:

0 01111111 00000000000000000000000 (base-2) = 3f80 0000 (base-16)

So, your assumption is correct. Unfortunately, the bit-pattern represented by 0x3f800000 cannot be assigned to a float by just doing:

float a = 0x3f800000;

The hexadecimal constant is first interpreted as an integer, which has the value 1065353216 in base 10. That value is then implicitly converted to the nearest floating-point number, here exactly 1065353216.0, which cout prints as 1.06535e+09 at its default precision of six significant digits.

So in short, while your bit pattern for an IEEE-754 floating-point number is correct, your assumption about how to assign this pattern is incorrect. See "Convert a hexadecimal to a float and viceversa in C", or the other answers to this question, for how this can be achieved.

kvantour