
Sorry if this has already been asked. I've seen other ways of extracting the exponent of a floating-point number, but this is what was given to me:

unsigned f2i(float f)
{
  union {
    unsigned i;
    float f;
  } x;
  x.i = 0;
  x.f = f;
  return x.i;
}

I'm having trouble understanding this union datatype: shouldn't the return x.i at the end always make f2i return 0?

Also, what application could this data type even be useful for? For example, say I have a function:

int getexponent(float f){
}

This function is supposed to get the exponent of the floating-point number, which is stored with a bias of 127. I've found many ways to do this, but how could I use the f2i function for this purpose?

I appreciate any pointers!

Update: years later, this just seems trivial. For those who may be interested, here is the function:

int getexponent(float f) {
    unsigned f2i(float f); // the function given in the question
    // shift right by 23 and mask with 0xff to get the biased exponent
    unsigned int ui = (f2i(f) >> 23) & 0xff;
    int bias = 127;
    if (ui == 0) return 1 - bias;        // special case: zero and subnormals
    else if (ui == 255) return 11111111; // special case: infinity or NaN (decimal sentinel value)
    return (int)ui - bias;
}
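For anyone who wants to compile this as one piece, here is a minimal sketch that puts the two functions together. It assumes that float is a 32-bit IEEE 754 single and has the same size as uint32_t. The names float_bits and get_exponent_sketch, and the choice of INT_MAX as the sentinel for infinity and NaN, are mine and not part of the original code.

```c
#include <limits.h>
#include <stdint.h>

/* Assumption: float is a 32-bit IEEE 754 single, same size as uint32_t. */
static uint32_t float_bits(float f)
{
    union {
        uint32_t i;
        float f;
    } x;
    x.f = f;    /* store the float ...                            */
    return x.i; /* ... and read the same bytes back as an integer */
}

/* Unbiased exponent of f; INT_MAX is a hypothetical sentinel for infinity and NaN. */
static int get_exponent_sketch(float f)
{
    const int bias = 127;
    uint32_t e = (float_bits(f) >> 23) & 0xffu; /* the 8 exponent bits */

    if (e == 0)
        return 1 - bias; /* zero and subnormals share the minimum exponent */
    if (e == 255)
        return INT_MAX;  /* infinity or NaN */
    return (int)e - bias;
}
```

With these assumptions, get_exponent_sketch(8.0f) should give 3, because 8 = 1.0 × 2^3.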
Nam Vu
  • To the first question, no, because `x.i` and `x.f` share the same data space. To extract the exponent, you also need to shift and mask the `unsigned` value. – Weather Vane Sep 21 '17 at 08:49
  • FYI: [frexp](http://en.cppreference.com/w/c/numeric/math/frexp) – BLUEPIXY Sep 21 '17 at 09:04
  • @Lundin I think the hope is that `x.i` contains the same bits as `x.f` at the end. The initial `x.i = 0` is probably an attempt to ensure that if `unsigned int` is bigger than `float` the unused bits are zeroed out. – JeremyP Sep 21 '17 at 09:16
  • @JeremyP I believe that if unsigned int is bigger than float and big endian is used, the result would be garbage. This code needs to either use `uint32_t` or some static assert about the type sizes. – Lundin Sep 21 '17 at 09:23
  • @Lundin I agree. Also, is a `float` guaranteed to be four bytes? – JeremyP Sep 21 '17 at 09:25
  • @JeremyP Not in theory, but in practice. If paranoid, add static asserts. – Lundin Sep 21 '17 at 09:46
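
The static asserts suggested in these comments could look roughly like the sketch below. It uses C11 _Static_assert together with uint32_t; the function name f2i_checked is made up for illustration. The asserts only check sizes, so the IEEE 754 layout is still an assumption.

```c
#include <limits.h>
#include <stdint.h>

/* C11 compile-time checks: the build fails if these assumptions do not hold. */
_Static_assert(CHAR_BIT == 8, "bytes must be 8 bits wide");
_Static_assert(sizeof(float) == sizeof(uint32_t),
               "float and uint32_t must have the same size");

/* Same idea as f2i, but with a fixed-width integer (hypothetical name). */
uint32_t f2i_checked(float f)
{
    union {
        uint32_t i;
        float f;
    } x;
    x.f = f;
    return x.i;
}
```

Because both members now have exactly the same size, the initial x.i = 0 is no longer needed.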

4 Answers


Because it's a union, x.i and x.f have the same address, which lets you reinterpret the bits of one data type as another. Here the union is first zeroed by x.i = 0; and then filled with f. Then x.i is returned, which is the integer representation of the float f. If you then shift that value right by 23 bits and mask off the sign bit, you get the (biased) exponent of the original f, because of the way a float is laid out in memory.
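
A small sketch of the shared address, in case it helps. The union tag FloatBits and the two function names are invented for this example, and it assumes unsigned and float have the same size, as in the question.

```c
union FloatBits {
    unsigned i;
    float f;
};

/* Returns 1 if both members start at the same address, as C requires for unions. */
int members_share_address(void)
{
    union FloatBits x;
    return (void *)&x.i == (void *)&x.f;
}

/* Same steps as f2i: the second store overwrites the bytes written by the first. */
unsigned f2i_demo(float f)
{
    union FloatBits x;
    x.i = 0;    /* all bytes of x.i are now zero             */
    x.f = f;    /* the same bytes now hold the float         */
    return x.i; /* reads those bytes back as an unsigned int */
}
```

So f2i_demo only returns 0 when the float's own bit pattern is all zeros, which is the case for 0.0f.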

Hatted Rooster
  • Just to be absolutely clear, this technique is totally non-portable, depending as it does on the underlying representation of a float and an int in the implementation. – JeremyP Sep 21 '17 at 09:10

I'm having trouble understanding this union datatype, because shouldn't the return x.i at the end always make f2i return a 0?

The line x.i = 0; is a bit paranoid and shouldn't be necessary. Given that unsigned int and float are both 32 bits, the union creates a single chunk of 32 bits in memory, which you can access either as a float or as the pure binary representation of that float, which is what the unsigned is for. (It would have been better to use uint32_t.)

This means that the lines x.i = 0; and x.f = f; write to the very same memory area twice.

What you end up with after the function is the pure binary notation of the float. Parsing out the exponent or any other part from there is very much implementation-defined, since it depends on the floating-point format and endianness. "How to represent FLOAT number in memory in C" might be helpful.

Lundin

I'm having trouble understanding this union datatype

The union data type is a way for a programmer to indicate that some variable can be one of a number of different types. The C11 standard says that "the value of at most one of the members can be stored in a union object at any time". It is used for things like parameters that may be logically one thing or another. For example, an IP address might be an IPv4 address or an IPv6 address, so you might define an address type as follows:

#include <stdbool.h>
#include <stdint.h>

struct IpAddress
{
    bool isIPv6;          // tells which union member is valid
    union
    {
        uint8_t v4[4];
        uint8_t v6[16];
    } bytes;
};

And you would use it like this:

struct IpAddress address = // Something
if (address.isIPv6)
{
    doSomeV6ThingWith(address.bytes.v6);
}
else 
{
    doSomeV4ThingWith(address.bytes.v4);
}

Historically, unions have also been used to get the bits of one type into an object of another type. This is because, in a union, the members all start at the same memory address. If I just do this:

float f = 3.0;
int i = f;

The compiler will insert code to convert the float's value to an integer (i becomes 3), so the bit pattern, including the exponent, is lost. However, in

union 
{
    unsigned int i;
    float f;
} x;

x.f = 3.0;
int i = x.i;

i now contains the exact bits that represent 3.0 in a float. Or at least you hope it does. Nothing in the C standard says that float and unsigned int have to be the same size. Nothing in the core standard mandates a particular representation for float either (Annex F specifies IEC 60559 floats, but it only binds implementations that define __STDC_IEC_559__). So the above code is, at best, non-portable.
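
To make the difference between the two snippets concrete, here is a minimal sketch. It assumes a 32-bit IEEE 754 float, and the function names are made up for illustration.

```c
#include <stdint.h>

/* Value conversion: the compiler converts 3.0f to the integer 3. */
int convert_value(float f)
{
    return (int)f;
}

/* Bit reinterpretation through a union: the bytes are left unchanged. */
uint32_t reinterpret_bits(float f)
{
    union {
        uint32_t i;
        float f;
    } x;
    x.f = f;
    return x.i;
}
```

Under that assumption, convert_value(3.0f) is 3, while reinterpret_bits(3.0f) is 0x40400000: sign 0, exponent field 128, and a mantissa field with only the top bit set.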

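The portable way to get the exponent of a float is the frexpf() function declared in math.h. Note that it normalises the fraction to [0.5, 1), so its exponent is one greater than the IEEE 754 unbiased exponent.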
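
A minimal sketch of the frexpf() route follows. The wrapper name portable_exponent is hypothetical, and the result is only meaningful for finite, nonzero input.

```c
#include <math.h>

/* Binary exponent e such that f = m * 2^e with 1 <= |m| < 2 (finite, nonzero f). */
int portable_exponent(float f)
{
    int e;
    (void)frexpf(f, &e); /* frexpf returns a fraction in [0.5, 1) and stores its exponent in e */
    return e - 1;        /* shift to the normalisation [1, 2) */
}
```

For example, frexpf(8.0f, &e) yields 0.5 with e equal to 4, so portable_exponent(8.0f) is 3. No assumption about the in-memory layout of float is needed here.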

how could I manipulate the f2i function to serve this purpose?

Let's assume that a float is stored in IEC 60559 format in 32 bits, which Wikipedia says is the same as IEEE 754. Let's also assume that integers are stored in little-endian format.

union 
{
    uint32_t i;
    float f;
} x;

x.f = someFloat;
uint32_t bits = x.i;

bits now contains the bit pattern of the floating point number. A single precision floating point number looks like this

SEEEEEEEEMMMMMMMMMMMMMMMMMMMMMMM
^        ^                     ^
bit 31   bit 22                bit 0

Where S is the sign bit, E is an exponent bit, M is a mantissa bit.

So, having got your uint32_t, you just need to do some shifting and masking:

uint32_t exponentWithBias = (bits >> 23) & 0xff;
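
The same shifting and masking can pull out all three fields. This is only a sketch under the assumptions above (32-bit IEEE 754 float); the struct FloatFields and the function split_float are names invented for the example.

```c
#include <stdint.h>

struct FloatFields {
    uint32_t sign;     /* 1 bit:   bit 31                      */
    uint32_t exponent; /* 8 bits:  bits 30..23, biased by 127  */
    uint32_t mantissa; /* 23 bits: bits 22..0                  */
};

struct FloatFields split_float(float value)
{
    union {
        uint32_t i;
        float f;
    } x;
    struct FloatFields out;

    x.f = value;
    out.sign     = x.i >> 31;            /* only the top bit remains      */
    out.exponent = (x.i >> 23) & 0xffu;  /* drop mantissa, mask off sign  */
    out.mantissa = x.i & 0x7fffffu;      /* keep the low 23 bits          */
    return out;
}
```

For -6.5f, which is -1.625 × 2^2, this should give sign 1, exponent 129 (2 + 127) and mantissa 0x500000.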
JeremyP
  • I'm sorry, could you explain the (bits >> 23) & 0xff part? I'm guessing that it looks at the first 23 bits and omits the first bit? – Nam Vu Sep 21 '17 at 18:03
  • @NamBurger It just means shift the bits right 23 places so that the exponent starts at bit 0, then AND it with binary 11111111 to get rid of the sign bit, leaving just the exponent. – JeremyP Sep 25 '17 at 08:38

That union type is strongly discouraged, as it is strongly architecture-dependent and compiler-implementation-dependent. Both things make it almost impossible to determine a correct way to get the information you request.

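There are portable ways of doing that, and they rely on computing the logarithm to base ten. If you take the floor of log10(x) for a finite x > 0, you'll get the number you want:

#include <math.h>

/* floor, not a plain cast: (int) truncates toward zero and is off by one for most 0 < x < 1 */
int power10 = (int)floor(log10(x));

/* log10() is already provided by math.h; mathematically it is equivalent to this
   (log10_by_hand is a hypothetical name, chosen to avoid clashing with the library function): */
double log10_by_hand(double x)
{
     return log(x)/log(10.0);
}

This gives the power of ten by which the mantissa is multiplied to produce the number. If you divide the original number by ten raised to that power, you'll get the mantissa.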

Be careful: floating-point numbers are normally stored internally in base two, which means the exponent that is stored is a power of two, not a power of ten.
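
To illustrate how far apart the two exponents are, here is a small sketch. Both function names are made up. The decimal exponent is found with a plain scaling loop rather than log10(), which is a swapped-in technique chosen only to keep the example short, and it can be off near exact powers of ten because of rounding. The binary exponent comes from the standard frexp().

```c
#include <math.h>

/* Power of ten n with 10^n <= x < 10^(n+1), for finite x > 0 (up to rounding). */
int decimal_exponent(double x)
{
    int n = 0;
    while (x >= 10.0) { x /= 10.0; n++; }
    while (x < 1.0)   { x *= 10.0; n--; }
    return n;
}

/* Power of two n with 2^n <= x < 2^(n+1), for finite x > 0. */
int binary_exponent(double x)
{
    int e;
    (void)frexp(x, &e); /* x = m * 2^e with 0.5 <= m < 1 */
    return e - 1;
}
```

For 1000.0 the decimal exponent is 3 but the binary exponent is 9, since 2^9 = 512 <= 1000 < 1024. The stored exponent field of a float reflects the second number, plus the bias.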

Luis Colorado