union int bits to float bits sometimes interpreted wrong

Question

I just ran into an odd problem while interleaving some floats. I've simplified the issue down to a few tests:

#include <iostream>
#include <vector>

std::vector<float> v; // global instance

union{ // shared memory space
    float f; // to store data in interleaved float array
    unsigned int argb; // int color value
}color; // global instance

int main(){
    std::cout<<std::hex; // print hexadecimal

    color.argb=0xff810000; // NEED A==ff AND R>80 (idk why)
    std::cout<<color.argb<<std::endl; // NEED TO PRINT (i really dk why)
    v.insert(v.end(),{color.f,0.0f,0.0f}); // color, x, y... (need the x, y too. heh..)

    color.f=v[0]; // read float back (so we can see argb data)
    std::cout<<color.argb<<std::endl; // ffc10000 (WRONG!)
}

the program prints

ff810000
ffc10000

If someone can show me i'm just being dumb somewhere that'd be great.

update: turned off optimizations

#include <iostream>

union FLOATINT{float f; unsigned int i;};

int main(){
    std::cout<<std::hex; // print in hex

    FLOATINT a;
    a.i = 0xff810000; // store int
    std::cout<<a.i<<std::endl; // ff810000

    FLOATINT b;
    b.f = a.f; // store float
    std::cout<<b.i<<std::endl; // ffc10000
}

or

#include <iostream>

int main(){
    std::cout<<std::hex; // print in hex

    unsigned int i = 0xff810000; // store int
    std::cout<<i<<std::endl; // ff810000

    float f = *(float*)&i; // store float from int memory

    unsigned int i2 = *(unsigned int*)&f; // store int from float memory
    std::cout<<i2<<std::endl; // ffc10000
}

solution:

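#include <cstring>  // std::memcpy
#include <iostream>

int main(){
    std::cout<<std::hex;

    unsigned int i=0xff810000;
    std::cout<<i<<std::endl; // ff810000

    float f; std::memcpy(&f, &i, sizeof f);           // copy the int's bytes into the float
    unsigned int i2; std::memcpy(&i2, &f, sizeof i2); // copy the float's bytes back out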

    std::cout<<i2<<std::endl; // ff810000
}

Unrelated: There's a member function specifically for inserting stuff at the end called [`push_back`](https://en.cppreference.com/w/cpp/container/vector/push_back). — eesiraed, Jun 04 '19 at 00:17
@drescherjm ok that makes sense out of this oddness. what should i do instead? — Puddle, Jun 04 '19 at 00:17
@drescherjm well, i have a nice opengl program with interleaved vertex data. it's working nicely. (or so it seemed till i got lucky and found certain colors weren't rendering right) rather than inserting 3 floats (r,g,b) we can insert 1 float (with the argb data) which also increases fps. how should i go about putting the color integer into the interleaved array of floats? i'm using a vector. i guess that wasn't the issue. how should i convert to a float? — Puddle, Jun 04 '19 at 00:22
@ChrisRollins i'm not trying to put 3 floats in a union. i'm putting 3 floats in the vector. one of those floats is from my union. for some oddity i needed to insert 3 floats for this oddity to happen. — Puddle, Jun 04 '19 at 00:24
so it seems like what you're doing is reinterpreting the float as int which is UB and also guaranteed to give you the wrong value — Chris Rollins, Jun 04 '19 at 00:25
@ChrisRollins i wouldn't say "guaranteed" (but possible) there were a few conditions i had to set for it fail. one as weird as needing to print before inserting. (when i was simplifying this, at one point i had to have the insert inside another function) can anyone tell me how i can accomplish this properly in c++? — Puddle, Jun 04 '19 at 00:29
why do the argb values need to be float? usually we use unsigned int — Chris Rollins, Jun 04 '19 at 00:37
but if you do want to convert from one numeric type to another you should use static_cast. this will give you the same value. — Chris Rollins, Jun 04 '19 at 00:39
What is the `float` supposed to be representing? A `uint32_t` can represent 0x00-0xFF for the four channels, but I don't understand what you are trying to accomplish. Btw, that particular bit pattern is `-nan` (for the float) on my machine and may be something else on someone else's machine. – Ted Lyngmo, Jun 04 '19 at 00:40
@TedLyngmo an interleaved VBO with `glColorPointer(4,GL_UNSIGNED_BYTE,12,(void*)(0));` and `glVertexPointer(2,GL_FLOAT,12,(void*)(4));` a.k.a 4 unsigned bytes for color, and 2 floats for position. which is 4, 8, 12 (int,float,float) bytes. instead of 3 rgb floats. which would waste memory and performance. `glBufferData` takes a void pointer to the data. i'm using a vector to put the data together. everything is floats except the color. so to put that int data into that void ptr (a float array / vector), what should i do to do it properly? — Puddle, Jun 04 '19 at 00:47
make a struct that holds each set of data and make a vector of those structs — Chris Rollins, Jun 04 '19 at 00:50
@ChrisRollins that's a good suggestion actually. thanks for bringing it up. i saw someone doing it that way not long ago on this site. but i guess i just like to be different. lol but besides that, for sake of being more experienced with c++ and not repeat the error here with unions, it's always good to learn how to accomplish it properly. — Puddle, Jun 04 '19 at 00:58
reinterpret_cast is the closest thing to "proper" when it comes to reinterpreting data without altering the binary. but generally it's not a good practice to do that at all. the same issues would probably still occur. types are supposed to work for you, not against you. that's why I suggest the struct approach. — Chris Rollins, Jun 04 '19 at 01:00

score 4 · Answer 1 · answered Jun 04 '19 at 00:49

The behavior you're seeing is well defined IEEE floating point math.

The value you're storing in argb, when interpreted as a float will be a SNaN (Signaling NaN). When this SNaN value is loaded into a floating point register, it will be converted to a QNaN (Quiet NaN) by setting the most significant fraction bit to a 1 (and will raise an exception if floating point exceptions are unmasked).

This load will change your value from ff810000 to ffc10000.
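To make the bit arithmetic concrete, here is a minimal sketch that classifies a 32-bit pattern using integer math only, so no floating point register is involved. The helper names are made up for illustration, and it assumes the IEEE-754 binary32 layout with the common convention (x86, ARM) that a set most significant fraction bit means "quiet".

```cpp
#include <cstdint>

// IEEE-754 binary32 layout: 1 sign bit, 8 exponent bits, 23 fraction bits.
constexpr std::uint32_t kExponentMask = 0x7f800000u;
constexpr std::uint32_t kFractionMask = 0x007fffffu;
constexpr std::uint32_t kQuietBit     = 0x00400000u; // most significant fraction bit

// NaN: exponent all ones and a non-zero fraction.
constexpr bool isNaNPattern(std::uint32_t bits) {
    return (bits & kExponentMask) == kExponentMask && (bits & kFractionMask) != 0;
}

// Signaling NaN: a NaN whose quiet bit is clear.
constexpr bool isSignalingNaNPattern(std::uint32_t bits) {
    return isNaNPattern(bits) && (bits & kQuietBit) == 0;
}

// What quieting does to the pattern: set the quiet bit, leave everything else.
constexpr std::uint32_t quieted(std::uint32_t bits) {
    return bits | kQuietBit;
}

static_assert(isSignalingNaNPattern(0xff810000u), "the value from the question is an SNaN");
static_assert(quieted(0xff810000u) == 0xffc10000u, "quieting it gives the value that was printed");
```

This also accounts for the "A==ff AND R>80" condition in the question: those bytes fill the exponent with ones and leave a non-zero fraction. For comparison, 0xff800000 has a zero fraction, so it is negative infinity rather than a NaN.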

answered Jun 04 '19 at 00:49

1201ProgramAlarm

@Puddle If the value you're storing in `argb` happens to correspond to an SNaN bit pattern, the number will change when it is converted to a QNaN. That is one of the ways in which accessing a union member that was not the last one written to is Undefined Behavior. – 1201ProgramAlarm Jun 04 '19 at 01:02
@Puddle It depends on how the compiler moves the value. If it uses a general purpose register the value won't change. If it uses a floating point register the value will be converted from a SNaN to a QNaN. – 1201ProgramAlarm Jun 04 '19 at 01:24

score 3 · Accepted Answer · answered Jun 04 '19 at 01:08

Writing to the int and then reading from the float in the union causes UB. If you want to create a vector of mixed value types, make a struct to hold them. Also, don't use unsigned int when you need exactly 32 bits. Use uint32_t.

#include <cstdint>  // std::uint32_t
#include <iostream>
#include <vector>

struct gldata {
    std::uint32_t argb;
    float x;
    float y;
};

std::vector<gldata> v;

int main() {
    std::cout << std::hex; // print hexadecimal

    v.emplace_back(gldata{0xff810000, 0.0f, 0.0f});

    std::cout << v[0].argb << "\n"; // ff810000
}
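
If the struct is going to be uploaded as an interleaved buffer, it may be worth checking at compile time that its layout matches the stride of 12 and the offsets 0 and 4 from the `glColorPointer` / `glVertexPointer` calls quoted in the comments. A minimal sketch, assuming a 4-byte `float` and no padding; `Vertex` is a hypothetical name with the same members as `gldata`:

```cpp
#include <cstddef>  // offsetof
#include <cstdint>

struct Vertex {
    std::uint32_t argb; // 4 unsigned bytes of color
    float x;            // position
    float y;
};

// The build fails if the compiler lays the struct out differently.
static_assert(sizeof(float) == 4, "GL_FLOAT is expected to be 4 bytes");
static_assert(sizeof(Vertex) == 12, "stride passed to the gl*Pointer calls");
static_assert(offsetof(Vertex, argb) == 0, "color offset");
static_assert(offsetof(Vertex, x) == 4, "position offset");
static_assert(offsetof(Vertex, y) == 8, "second position component");
```

With those checks in place, the address of the first element of a `std::vector<Vertex>` can be handed to the upload call as-is.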

answered Jun 04 '19 at 01:08

Ted Lyngmo

that's a solution to achieving interleaved data. but avoids the problem i ran into. what if i ever just wanted to do the function `intBitsToFloat(0xff810000);` (which is a perfectly working equivalent i used in java) – Puddle Jun 04 '19 at 01:15
I don't see this problem mentioned in the question. What floating-point value will you get back from that function? – Ted Lyngmo Jun 04 '19 at 01:21
the same that'll get back from the union float. (it's just for some odd reasons, like printing, inserting 3 floats at once, sometimes it's reading the value wrong) – Puddle Jun 04 '19 at 01:24
I still dont get it. Why would you ever need to store the four channel values in a `float`? – Ted Lyngmo Jun 04 '19 at 01:47
i'm just gonna use this simple struct method for interleaving the data. but the obvious point was i were wondering how exactly one COULD use a float array. (like i could, and needed to in java) to put those exact int bits in the float bits. ints and floats are just imaginary. it's all about interpretation. the raw binary is all that matters. you could create an int and a float pointer to the same array. when reading/writing, the compiler will convert based on the imaginary type. – Puddle Jun 04 '19 at 01:56
sizeof(int) == sizeof(float) == 4. it doesn't even matter if i make the int unsigned. if i'm putting a value which uses the msb, again, the compiler will interpret it as a negative number. but to us the unsigned interpretation underneath is still the same. you can even use -1 to mean 0xffffffff. – Puddle Jun 04 '19 at 01:58
You can memcpy the `uint32_t` into the memory of a `float`, but if you copy in a signalling NaN or an otherwise invalid bit pattern for `float` you may get strange effects when using it (like that it changes to a Quiet NaN). It's better to use the correct type to avoid surprises later. – Ted Lyngmo Jun 04 '19 at 02:11
thanks for the info. again i'll just use the struct to allocate the memory. it'll be faster since i wont have to convert the ints anymore. they'll just go straight in. i'ma go ahead and accept this answer. – Puddle Jun 04 '19 at 02:16
Concerning `intBitsToFloat` -- Java requires IEEE-754 floating-point math, so the effects of that function are well defined. C++ does not require IEEE-754. If you know that your hardware provides IEEE-754 and that your compiler supports it (which is almost always the case), writing the function is fairly straightforward. – Pete Becker Jun 04 '19 at 03:13
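
A minimal sketch of such a function, under the assumptions named in the comment above: `float` is 32 bits wide and IEEE-754, which the `static_assert`s check. The names mirror the Java methods and are not part of any C++ library. Since C++20, `std::bit_cast` from `<bit>` does the same job without the explicit `memcpy`.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <limits>

static_assert(sizeof(float) == sizeof(std::uint32_t), "float must be 32 bits wide");
static_assert(std::numeric_limits<float>::is_iec559, "float must be IEEE-754");

// Reinterpret 32 integer bits as a float by copying the bytes.
inline float intBitsToFloat(std::uint32_t bits) {
    float f;
    std::memcpy(&f, &bits, sizeof f);
    return f;
}

// The inverse: expose a float's raw bits as an integer.
inline std::uint32_t floatToRawIntBits(float f) {
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);
    return bits;
}

// Usage sketch: an ordinary (non-NaN) pattern survives the round trip.
inline void roundTripExample() {
    const float one = intBitsToFloat(0x3f800000u); // bit pattern of 1.0f
    assert(one == 1.0f);
    assert(floatToRawIntBits(one) == 0x3f800000u);
}
```

The copy itself is well defined, but as noted in the other answer, a signaling NaN pattern may still be quieted if the resulting `float` passes through a floating point register, so this is only a safe round trip for patterns that are not signaling NaNs.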
hey again. i kinda realized making my vector with floats (4 bytes per data) were useful because i don't just use it for one definition of a vbo. sometimes i might fill it with {col,x,y} sometimes i may use {tx,ty,col,x,y}. i could make a union with each struct type, but then each element in the vector will be of the biggest struct size. so if i were putting {col,x,y} there'll be an extra 8 bytes unused. and when uploading it to the vbo, that'll be wasted space. so i'll have to just make a union of a float and int. and insert these 4 byte unions (to represent floats/ints) into the vector. – Puddle Jun 04 '19 at 22:47
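
One way to get that kind of mixed 4-byte buffer without a union is to keep raw bytes and copy each value's object representation in, so the color bits never exist as a `float`. A minimal sketch, not the poster's code; `VertexBytes` and its members are hypothetical names:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>
#include <vector>

// Hypothetical helper: a growable raw byte buffer for interleaved vertex data.
class VertexBytes {
public:
    // Append the object representation of one trivially copyable value.
    template <typename T>
    void append(const T& value) {
        static_assert(std::is_trivially_copyable<T>::value, "raw copy needs a trivially copyable type");
        const std::size_t old = bytes_.size();
        bytes_.resize(old + sizeof(T));
        std::memcpy(bytes_.data() + old, &value, sizeof(T));
    }

    // Read a value back from a byte offset (handy for inspection and tests).
    template <typename T>
    T read(std::size_t offset) const {
        static_assert(std::is_trivially_copyable<T>::value, "raw copy needs a trivially copyable type");
        assert(offset + sizeof(T) <= bytes_.size());
        T value;
        std::memcpy(&value, bytes_.data() + offset, sizeof(T));
        return value;
    }

    const void* data() const { return bytes_.data(); } // pointer to hand to the upload call
    std::size_t size() const { return bytes_.size(); } // size in bytes

private:
    std::vector<unsigned char> bytes_;
};
```

A {col,x,y} vertex is then three `append` calls (one `std::uint32_t`, two `float`s) and a {tx,ty,col,x,y} vertex is five, with no padding in either case.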
@Puddle Ouch ... But what do you need? Your queue may be challenged with a `double` anyday. What's the purpose of your queue? (you're on a very slippery slope) – Ted Lyngmo Jun 06 '19 at 09:52
@TedLyngmo my last comment said what i needed with the obvious solution. i need a buffer to hold interleaved vertex data (as you already know) but also be able to interleave different vertex data. (say for texture coords) and to not waste memory, we have to make the buffer essentially a float/int buffer. you say it may be challenged with a double? i'ma need more detail on that. purpose of my queue? what queue? i'm on a slippery slope? i'ma need more detail on that. – Puddle Jun 09 '19 at 00:52
Regarding the "slippery slope": It may compile and it may work for years and years. It's still undefined. Why push it? – Ted Lyngmo Jun 09 '19 at 01:17
