> I can't figure out why we need to declare the 'q' variable along with the union type name 'quantity'.
As presented in the question, `struct goods` is a structure type with two members: an array of 20 `char` identified by `name` and a `union quantity` identified by `q` (so yes, `quantity` is a union tag, not the name of a member). There is no need in any absolute sense to declare it that way, but such a declaration provides a few characteristics that the alternatives do not. Do understand, however, that as declared in the example, `count`, `weight`, and `volume` are not members of `struct goods`. Rather, they are members of `q`, a union that is a member of `struct goods`.
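To make the member relationships concrete, here is a minimal sketch. The question's code is not reproduced in this answer, so the `struct goods` declaration below is an assumed reconstruction based on the description above, and `make_counted` is a hypothetical helper introduced only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed reconstruction of the declaration from the question. */
struct goods {
    char name[20];
    union quantity {          /* `quantity` is the union's tag */
        int count;
        float weight, volume;
    } q;                      /* `q` is the name of the member */
};

/* Hypothetical helper: builds an item that is counted in whole units. */
struct goods make_counted(const char *name, int count) {
    assert(name != NULL);
    struct goods g = {0};
    /* Copy at most 19 characters so the zeroed last byte terminates the string. */
    strncpy(g.name, name, sizeof g.name - 1);
    g.q.count = count;        /* union members are reached through `q` */
    return g;
}
```

Writing `g.count` instead of `g.q.count` would be rejected by the compiler, because `count` belongs to the union, not to `struct goods`.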
> Why can't we get away with just 'quantity' and then access struct fields via dot?
Because that is not one of the alternatives that C syntax affords. In the member list of a structure type declaration, a union tag (`quantity` in this case) can appear only in the declaration of a named member, so if the tag is provided then you must also declare an identifier for the union, which is `q` in the example. And having declared the union as a named member, you must access its members via the union's identifier.
On the other hand, you may omit the tag, and if you do, then, optionally, you may also omit the union's identifier. If you do omit the identifier (and only in that case), you have an "anonymous union member" whose own members are accessed as if they were actually members of the containing structure. That's pretty close to what you ask.
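As a sketch of that alternative, the following declares an anonymous union member, which is standardized in C11 and later. The type name `goods3` and the helper `count_of` are hypothetical names chosen for this illustration, not part of the question.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical variant of the question's type; needs C11 or later. */
struct goods3 {
    char name[20];
    union {                   /* no tag and no identifier: anonymous union member */
        int count;
        float weight, volume;
    };
};

/* Hypothetical helper: reads `count` as if it were a direct member of the structure. */
int count_of(const struct goods3 *g) {
    assert(g != NULL);
    return g->count;          /* no intermediate `.q` is needed */
}
```

Here `g->count`, `g->weight`, and `g->volume` are all valid member accesses, but they still overlap one another in storage.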
Do note that, whichever form you use, the members of the union share storage with each other, so the union holds a value for only one of them at any given time. They do not share storage with the other members of the containing structure.
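The storage layout can be checked with `offsetof`. This is a sketch only: it reuses an assumed reconstruction of the question's `struct goods`, and the two helper functions are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed reconstruction of the declaration from the question. */
struct goods {
    char name[20];
    union quantity {
        int count;
        float weight, volume;
    } q;
};

/* The union is at least as large as each of its members. */
static_assert(sizeof(union quantity) >= sizeof(int), "union holds an int");
static_assert(sizeof(union quantity) >= sizeof(float), "union holds a float");

/* Hypothetical helper: returns 1 if every union member starts at offset 0. */
int union_members_overlap(void) {
    return offsetof(union quantity, count) == 0
        && offsetof(union quantity, weight) == 0
        && offsetof(union quantity, volume) == 0;
}

/* Hypothetical helper: returns 1 if `q` begins after the end of `name`. */
int union_is_separate_from_name(void) {
    return offsetof(struct goods, q) >= sizeof ((struct goods *)0)->name;
}
```

Both helpers return 1 on a conforming implementation: the union's members all begin at the union's own address, while `q` as a whole is laid out after `name`.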
With that said, the various options do have some differences in their characteristics. In the first place, do appreciate that all of these forms have twofold significance: they declare a union type, and they declare a structure member of that type. That's relevant because if you provide a tag then you can declare other objects of the same union type wherever the union declaration is in scope. Moreover, that scope is not limited to the structure type declaration that contains it, so with the declaration presented, one could do something like this:
    void set_quantity(struct goods *g, union quantity quant) {
        g->q = quant;
    }
That is impossible for untagged unions.
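For a fuller picture of what the tag buys you, here is a sketch in which `union quantity` is used outside the structure declaration: as a parameter type, a return type, and the type of a local variable. The `struct goods` declaration is again an assumed reconstruction of the question's code, and `quantity_from_count` is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed reconstruction of the declaration from the question. */
struct goods {
    char name[20];
    union quantity {
        int count;
        float weight, volume;
    } q;
};

/* The tag remains usable after the structure declaration ends. */
void set_quantity(struct goods *g, union quantity quant) {
    assert(g != NULL);
    g->q = quant;             /* assigns the whole union at once */
}

/* Hypothetical helper: builds a standalone union value holding a count. */
union quantity quantity_from_count(int count) {
    union quantity quant = { .count = count };
    return quant;
}
```

A call such as `set_quantity(&item, quantity_from_count(12))` then stores the count in `item.q`, where `item` is any `struct goods` object.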
There is also at least one important distinction between a named member with an untagged union type and an anonymous union member: you can access the union itself only if it is named. Consider this:
    struct goods2 {
        char name[20];
        union {
            int count;
            float weight, volume;
        } q;
    };

    void copy_quantity(struct goods2 *dest, struct goods2 *src) {
        dest->q = src->q;
    }
Not only can you not do that with an anonymous union member, you cannot do anything reliably equivalent. In particular, even if you were willing to suffer the inefficiency that would be associated with copying `src->count`, `src->weight`, and `src->volume` individually despite only one of them actually containing a value, C provides no promise that doing so in any order would reliably achieve the desired result.