How to compare two different unions in C

Question

I'm wondering how I can compare a union with an integer. I'm writing a printf-like function and handling the simple conversions %d/%u/%i together with the length modifiers ll/l/hh/h/j/z, so I have the following union:

union all_integer
{
    char                        c;
    signed int                  nb;
    short int                   snb;
    long int                    lnb;
    long long int               llnb;
    size_t                      posnb;

    unsigned char               uc;
    unsigned int                unb;
    unsigned short int          usnb;
    unsigned long int           ulnb;
    unsigned long long int      ullnb;
};

because I can't know in advance which type I'll receive. Then, in the %d case, I do something like this:

union all_integer u_allint;
u_allint.nb = va_arg(ap, int);

I want to print the data stored in my union u_allint, so I pass the union to a simple function, for example putdata:

putdata(union all_integer u_allint)
{
     if (u_allint < 0)
     {
          return (ft_numlen_neg(u_allint));
     }
     if (u_allint > 9)
         return (1 + ft_numlen(u_allint / 10));
     if (u_allint > 0 && u_allint < 10)
          return (1);
     if (u_allint == 0)
        return (1);
     return (0);
}

Just suppose that this function is able to print my data correctly. The problem is that I can't do that, because I'm comparing a union with an int. Even if I create another union inside the function and set newunion.nb = 0, so that a union is compared with a union, it doesn't compile and I get this message: invalid operands to binary expression ('union my_union' and 'union my_union').

So I'm pretty sure I misunderstand something about unions, but I didn't find a similar problem in other topics. Am I misunderstanding something, or am I approaching the problem the wrong way?

Forgetting signedness for a second and considering only size, why not use `unsigned long long int` in all cases, instead of a union? — coredump, Jan 24 '18 at 12:15
Yep, I just hadn't thought of that. I thought unions were a smart way to deal with my problem because, even without knowing the type of the data in advance, I wasn't forced to declare every variable I might need. I'll rethink my approach using unsigned long long int, thanks for the help! — abt jeremie, Jan 24 '18 at 13:23

score 1 · Answer 1 · answered Jan 24 '18 at 11:45

Yes, you're confused about what a union is.

A union takes up as much memory as its largest member. When you try to compare it with >, the compiler rejects the expression: relational operators are not defined for union operands, and the compiler has no way of knowing which member you mean. The representation of an int holding 0 and of a long holding 0 may also differ inside the union, since storing the int need not zero the bytes after it.

Passing your union to printf would be interesting too, as %d tells printf to take the next argument and assume it's an int. A union is not an int, so the behavior is undefined. In practice the extra bytes may be read for the next conversion, leaving printf very confused and likely printing rubbish; it may not crash, but that is not guaranteed.
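As a minimal sketch of the point above (the operators apply to a member, never to the union as a whole), a length helper could take the active member instead of the union. The trimmed-down union and the names `numlen_signed` and `numlen_of_int_member` are assumptions made for illustration, not the asker's actual code:

```c
/* Trimmed-down version of the asker's union, for illustration only. */
union all_integer
{
    int                     nb;
    long long int           llnb;
    unsigned long long int  ullnb;
};

/* Number of characters needed to print v in decimal, including a leading '-'. */
static int numlen_signed(long long int v)
{
    int len = (v < 0) ? 1 : 0;

    do
    {
        len++;
        v /= 10; /* division truncates toward zero, so negatives also reach 0 */
    } while (v != 0);
    return len;
}

/* The caller must know which member was stored last; here it is nb. */
static int numlen_of_int_member(union all_integer u)
{
    return numlen_signed(u.nb); /* compare and divide the member, not the union */
}
```

With `u.nb = -548375`, `numlen_of_int_member(u)` counts six digits plus the sign.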

LucaG · Answer 2 · 2018-01-26T10:57:01.520

0

Your request isn't clear.

However, it's normal that the compiler gives you an error in both cases. To compare unions you have to write code like this:

#include <string.h> /* memset, memcmp */

union all_integer union1;
union all_integer union2;

/* To ensure that all the unused bytes of the unions are the same, set
both unions to the same value (0 in this case) before using them. */
memset(&union1, 0, sizeof(union all_integer));
memset(&union2, 0, sizeof(union all_integer));

...

if (!memcmp(&union1, &union2, sizeof(union all_integer)))
{
    //Unions are equal.
    ...
}
else
{
    //Unions aren't equal.
    ...
}

Edit: follow the suggestions in this link

edited Jan 26 '18 at 10:57

answered Jan 24 '18 at 11:49

LucaG


My question is: I store the data in a union because I can't know in advance what type it will be (maybe a long, a long long, or something else). That's why I used a union, but afterwards I obviously need to manipulate the data with comparisons such as <= 9 in order to print it (NB: I can't use printf because this is an exercise with restrictions). So how can I solve that? Do I have to write one print function per type (one for long, one for long long, one for unsigned int, etc.), or is there another way to manipulate my union that I don't see? – abt jeremie Jan 24 '18 at 12:16
Two unions that contain the same value in their last-stored members of the same type may have different bytes in them, so `memcpy` will report they are different. E.g., if two unions had 1234 and 5678 stored in their `int` members, and then 90 is stored in their `char` members, some of the old data from the `int` members may persist inside the union. – Eric Postpischil Jan 24 '18 at 12:52
@Eric Postpischil: I'm not sure; however, I think this depends on the compiler. – LucaG Jan 24 '18 at 13:23
(Correction: my `memcpy` above should be `memcmp`, of course.) I am sure; this is bad code and should never be used. The C standard does not even guarantee that equal values of ordinary objects have the same underlying bits (as noted in footnote 52 of C 2011 [N1570]), let alone unions. – Eric Postpischil Jan 24 '18 at 14:47
OK. To prevent this, it is sufficient to fill all unions with 0 before using them: `memset(&union1,0,sizeof(all_integer));` and `memset(&union2,0,sizeof(all_integer));` – LucaG Jan 24 '18 at 15:16
@LucaG: That is not a solution. Per C 2011 [N1570] 6.2.6.1 7: “When a value is stored in a member of an object of union type, the bytes of the object representation that do not correspond to that member but do correspond to other members take unspecified values.” (Among other things, this allows a compiler to optimize by using an efficient store instruction to store a “word” in the union when the source code only stores a `char`. This may be faster on some processors but puts “garbage” in the union along with the `char`.) – Eric Postpischil Jan 24 '18 at 17:28
I think the correct solution is explained in this [link](https://stackoverflow.com/questions/12195794/what-is-the-correct-way-to-check-equality-between-instances-of-a-union) – LucaG Jan 25 '18 at 07:51
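One common way to compare union instances without relying on `memcmp` is to store a tag next to the union and compare only the member the tag designates. The following is a sketch under that assumption; `int_kind`, `tagged_integer` and `tagged_integer_equal` are hypothetical names, and only three of the asker's members are kept:

```c
#include <stdbool.h>

/* Records which member of the union was stored last. */
enum int_kind { KIND_INT, KIND_LLONG, KIND_ULLONG };

struct tagged_integer
{
    enum int_kind kind;
    union
    {
        int                     nb;
        long long int           llnb;
        unsigned long long int  ullnb;
    } as;
};

/* Equal only if both hold the same kind and the active members match. */
static bool tagged_integer_equal(const struct tagged_integer *a,
                                 const struct tagged_integer *b)
{
    if (a->kind != b->kind)
        return false;
    switch (a->kind)
    {
    case KIND_INT:    return a->as.nb == b->as.nb;
    case KIND_LLONG:  return a->as.llnb == b->as.llnb;
    case KIND_ULLONG: return a->as.ullnb == b->as.ullnb;
    }
    return false; /* unknown tag */
}
```

Because only the active member is read, leftover bytes from earlier stores never take part in the comparison. Whether an `int` 5 and a `long long` 5 should count as equal is a design choice; this sketch treats them as different.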

score 0 · Accepted Answer · answered Jan 24 '18 at 12:05


I think the way you are trying to use a union is wrong. All types must be resolved at compile time, so you can't compare unions without knowing which member is active.

For me, the easiest (yet quite clean) solution here is to store the numbers you want to print in the largest possible integer type (unsigned long long int in your case). For instance, for your "%d":

#include <stdbool.h>

unsigned long long int mask = ~((unsigned long long int) 0);
unsigned long long int container;
bool is_signed = true; // "signed" is a C keyword, so it cannot be used as a variable name
...
int num1 = -548375;
// Convert through "long long int" rather than "unsigned long long int" so that the sign bit is propagated
container = (unsigned long long int) (((long long int) num1) & mask);

Then you can pass your container as an argument, along with the signedness flag:

void my_print(unsigned long long int container, bool is_signed) {
    if (is_signed) {
        // Print as long long int
    } else {
        // Print as unsigned long long int
    }
}

This way, the function my_print can be generic: you only need to manage the conversion to "unsigned long long int" and the value of the is_signed flag, depending on the type.
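A minimal sketch of what the two printing branches could look like, assuming the digits are written into a caller-supplied buffer instead of being sent to the output directly. `format_unsigned` and `format_signed` are hypothetical helper names. The signed helper takes a `long long int` and only ever converts signed to unsigned, which C defines as modulo arithmetic, so it also handles the most negative value:

```c
#include <stddef.h>

/* Writes value in decimal into buf (NUL-terminated) and returns the length.
   buf must have room for every digit plus the terminator. */
static size_t format_unsigned(unsigned long long int value, char *buf)
{
    char   tmp[64]; /* digits are produced in reverse order */
    size_t n = 0;
    size_t len = 0;

    do
    {
        tmp[n++] = (char)('0' + value % 10);
        value /= 10;
    } while (value != 0);
    while (n > 0)
        buf[len++] = tmp[--n];
    buf[len] = '\0';
    return len;
}

/* Same as above for a signed value; writes a leading '-' when negative. */
static size_t format_signed(long long int value, char *buf)
{
    if (value < 0)
    {
        /* Negate in unsigned arithmetic: no overflow, even for LLONG_MIN. */
        unsigned long long int magnitude = 0ULL - (unsigned long long int)value;

        buf[0] = '-';
        return 1 + format_unsigned(magnitude, buf + 1);
    }
    return format_unsigned((unsigned long long int)value, buf);
}
```

For example, calling `format_signed(-548375, buf)` with a 32-byte `buf` leaves the text `-548375` in the buffer and returns 7.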

answered Jan 24 '18 at 12:05

Benjamin Barrois


Yes, thanks, I think you've answered my question. I was trying to do that with a union because the definition of a union (all the variables share the same storage and take no more memory than the type currently in use) made sense in my mind for my problem, but your solution is cool, and I think I'll rethink the way I write my code. – abt jeremie Jan 24 '18 at 13:14
Unfortunately you don't have the choice, since types must be resolved at compile time. Of course, the "container" variable will often be much larger than needed, but having only one display function for unsigned long long int saves much more memory (smaller code). Don't forget to upvote my reply if you think the solution is good. – Benjamin Barrois Jan 24 '18 at 13:27
Using `unsigned long long int` as a container is problematic. The conversion of an `unsigned long long int` value to `long long int` is not defined by the C standard if the value is not representable in the new type. There is no guarantee it will wrap. – Eric Postpischil Jan 24 '18 at 14:51
Hmm OK, so maybe I can solve that because I know in advance whether I'll need a signed or an unsigned integer: %d and %i are signed and %u is unsigned. Then I write one more function for signed long long and it will be OK. A little less elegant, but functional in any case. – abt jeremie Jan 24 '18 at 15:58
@EricPostpischil Indeed, the behaviour could be unexpected. In C++ you'd use static_cast<> to solve it. In C it is different. However, it is doable with my method, with an extra check when converting from unsigned to signed (carefully managing the MSB). – Benjamin Barrois Jan 24 '18 at 16:08
@BenjaminBarrois: [Signed integer overflow is undefined in C++.](https://stackoverflow.com/questions/16188263/is-signed-integer-overflow-still-undefined-behavior-in-c) [`static_cast` does not fix it.](https://stackoverflow.com/questions/367633/what-are-all-the-common-undefined-behaviours-that-a-c-programmer-should-know-a/367662#367662) – Eric Postpischil Jan 24 '18 at 17:23
@EricPostpischil There is no notion of overflow while doing a `static_cast`. As both are the same wordlength, the binary value of the input is stored in the output, no matter what the bit vector represents. It is only if you try to read their interpreted values that they will be different: 11111111111...11111 will be -1 for the signed version, and the maximum representable value for the unsigned version. – Benjamin Barrois Jan 24 '18 at 18:17
@BenjaminBarrois: Per C++ [expr.static.cast], “The result of the expression `static_cast<T>(v)` is the result of converting the expression v to type T.” Per [conv.integral] “If the destination type is signed, the value is unchanged if it can be represented in the destination type (and bit-field width); otherwise, the value is implementation-defined.” (So not completely undefined as the Stack Overflow link I cited before suggests, but not portable.) It is **not** specified merely to copy or reinterpret the bits. – Eric Postpischil Jan 24 '18 at 18:25
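The approach the asker settles on in the comments (decide between signed and unsigned from the conversion specifier, then widen) can be sketched as follows for the signed case. This is only an illustration: `length_mod`, `fetch_signed` and `first_signed` are hypothetical names, and the j and z modifiers are left out for brevity. An unsigned counterpart would read `unsigned int`, `unsigned long int` and `unsigned long long int` in the same way.

```c
#include <stdarg.h>

/* Hypothetical tags for the length modifiers parsed from the format string. */
enum length_mod { LEN_NONE, LEN_HH, LEN_H, LEN_L, LEN_LL };

/* Reads the next argument for %d or %i and widens it to long long int.
   char and short arguments are promoted to int when passed through "...",
   so they are read as int and then narrowed back (that narrowing is
   implementation-defined if the value does not fit). */
static long long int fetch_signed(enum length_mod len, va_list *ap)
{
    switch (len)
    {
    case LEN_HH: return (signed char)va_arg(*ap, int);
    case LEN_H:  return (short int)va_arg(*ap, int);
    case LEN_L:  return va_arg(*ap, long int);
    case LEN_LL: return va_arg(*ap, long long int);
    default:     return va_arg(*ap, int);
    }
}

/* Small variadic wrapper, present only to exercise fetch_signed.
   The modifier is passed as a plain int to keep va_start portable. */
static long long int first_signed(int len, ...)
{
    va_list       ap;
    long long int value;

    va_start(ap, len);
    value = fetch_signed((enum length_mod)len, &ap);
    va_end(ap);
    return value;
}
```

Every signed value ends up in a single `long long int`, so one printing routine for that type is enough and no union is needed.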
