I have some code that is very similar to the following:

struct T {
    union {
        unsigned int x;
        struct {
            unsigned short xhigh;
            unsigned short xlow;
        };
    } x;
    /* ...repeated a handful of times for different variables in T... */
};

This does exactly what you'd expect: it allows me to declare a variable of type struct T and access t.x.x, t.x.xhigh, or t.x.xlow. So far so good.

However, I would really like it if I could do just t.x in the common case of wanting to access the value of the union as an unsigned int quantity, but retain the ability to access the high- and low-order portions independently without resorting to bit masking and shifting, and without invoking undefined behavior.

Is that possible in C?

If it is possible, then what is the C syntax for the declaration?

When I try the naive approach of simply accessing t.x instead of t.x.x, I get warning messages like this one (from a printf() call):

cc -ansi -o test -Wall test.c
test.c: In function ‘my_function’:
test.c:13:2: warning: format ‘%X’ expects argument of type ‘unsigned int’, but argument 2 has type ‘const union <anonymous>’ [-Wformat]

Using -std=c11 instead of -ansi yields the same warning.

user
  • http://stackoverflow.com/a/22190086/2864275 looks like you need c11 – Iłya Bursov May 17 '17 at 04:44
  • The use of the compiler option `-ansi` is questionable. Why are you using 1980s style C programming? See http://stackoverflow.com/questions/17206568/what-is-the-difference-between-c-c99-ansi-c-and-gnu-c-a-general-confusion-reg/17209532#17209532 – Lundin May 17 '17 at 07:47
  • @Lundin In this case, `-ansi` or `-std=c11` doesn't matter; both yield the same result. (I also moved away from `-ansi` rather soon after I posted this, for other reasons specifically related to more recent language constructs.) I discussed that briefly in the comments to InternetAussie's answer, but I have now also updated the question accordingly. – user May 17 '17 at 07:54
  • I believe it depends on your GCC version. `-ansi` expands to "almost C90" on older versions of GCC and "almost C11" on newer versions. It is better to use "guaranteed fully compliant C11", which is `gcc -std=c11 -pedantic-errors`. Adding `-Wall -Wextra` doesn't hurt. – Lundin May 17 '17 at 08:02
  • @Lundin ...and even with all of those ( `gcc -std=c11 -Wall -Wextra -pedantic-errors`), my code now compiles cleanly except for warnings that I'm not doing anything with `argc` and `argv` in my `main()` (which at this point is really just a test routine, but will almost certainly grow later on). FWIW, that's with gcc 4.7.2; not the most recent version, but seemingly recent enough. – user May 17 '17 at 08:05

1 Answer

Anonymous unions are a thing, if you can use anonymous structs (both are C11 features, and compiler extensions before that).

Just as you've used a struct with no name to inject its members into the union's namespace, so you can also use a union with no name to inject its members into the enclosing namespace. Like so:

struct T {
    union {
        unsigned int x;
        struct {
            unsigned short xhigh;
            unsigned short xlow;
        };
    }; /* <-- no name here */

    /* ...repeated a handful of times for different variables in T... */
};

You just have to make sure that none of the injected names clash with other injected names or regular names that are there, otherwise it won't compile.
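
To make the access pattern concrete, here is a minimal sketch of how the declaration above might be used. The helpers t_set_x and t_print are hypothetical names made up for illustration; they are not from the question. It assumes a compiler that accepts anonymous structs and unions (C11, or GCC's extension).

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

struct T {
    union {
        unsigned int x;
        struct {
            unsigned short xhigh;
            unsigned short xlow;
        };
    }; /* anonymous union: x, xhigh and xlow become members of struct T */
};

/* Hypothetical helper: stores v through the unsigned int view. */
void t_set_x(struct T *t, unsigned int v)
{
    assert(t != NULL);
    t->x = v; /* t->x, not t->x.x */
}

/* Hypothetical helper: prints the same storage through all three names. */
void t_print(const struct T *t)
{
    assert(t != NULL);
    printf("x=%X xhigh=%X xlow=%X\n",
           t->x, (unsigned int)t->xhigh, (unsigned int)t->xlow);
}
```

Calling t_set_x(&t, 0x12345678u) followed by t_print(&t) shows one object seen three ways. Which half of the value ends up in xhigh depends on the platform's byte order, so no particular output is promised here.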


One concern, though: you seem to be relying on the "fact" that unsigned short is half the size of unsigned int, and that the machine is big-endian, so that xhigh (the first member) really overlays the high-order half of x. If that's what happens on your system, then that's fine. If not, I suggest you rethink the structure.
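
One way to stop relying on those assumptions silently is to use fixed-width types and check the layout. The following is only a sketch under stated assumptions: struct T32 and xhigh_is_high_order are hypothetical names, and it assumes C11 (for static_assert and anonymous members) plus a platform that provides uint16_t and uint32_t.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical fixed-width variant of the struct from the question. */
struct T32 {
    union {
        uint32_t x;
        struct {
            uint16_t xhigh;
            uint16_t xlow;
        };
    };
};

/* Build-time check: the two halves must exactly cover x,
   which also rules out padding inside the inner struct. */
static_assert(sizeof(struct T32) == sizeof(uint32_t),
              "xhigh/xlow do not exactly overlay x");

/* Run-time check: returns 1 if xhigh holds the high-order 16 bits
   of x (big-endian layout), 0 otherwise. */
int xhigh_is_high_order(void)
{
    struct T32 t;
    t.x = UINT32_C(0x12345678);
    return t.xhigh == 0x1234 && t.xlow == 0x5678;
}
```

On a little-endian machine the run-time check would be expected to return 0, in which case swapping the order of the two members in the inner struct is one possible fix.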

  • This looks like it's working nicely even with my GCC in `-ansi -Wall` mode, let alone `-std=c11 -Wall`, thank you! – user May 17 '17 at 04:56
  • As for your note on the relative sizes of `int` and `short`, I am aware that there is an underlying assumption hidden in plain sight there, but so far I'm just poking around. I will probably migrate this to specific types of a given size, but for now I'm just trying to see if I can get what I am doing to work at all. – user May 17 '17 at 04:58
  • Another concern is *padding*. Depending on the implementation, you may need to explicitly specify that no padding be inserted by the compiler in your inner struct. I'm not sure whether this will be handled automatically by your compiler as the natural alignment for `int` and `struct` may not be the same. You may want to look into what command line option your compiler provides or look into `#pragma pack` if you run into weird issues. – David C. Rankin May 17 '17 at 05:32
  • @DavidC.Rankin Thanks for the hint, I'll keep that in mind. For the time being, keep in mind that what I posted in the question isn't *exactly* the code I actually have; there are a few differences that would be relevant for the cases raised by both you and InternetAussie, but none that were relevant *to the question I was asking* which was about accessing a member of the type. – user May 17 '17 at 05:41
  • For gcc, unnamed unions were a compiler extension before C11. Note that there was a bug involving use of designated initializers with unnamed unions; [here is a bug report](https://gcc.gnu.org/bugzilla/show_bug.cgi?id=10676), and [here is an answer I gave to a related question](http://stackoverflow.com/a/43103933/6879826). In some cases, extra braces were needed to make designated initializers work, but this was finicky, and did not always work. The fix was supposed to have occurred in gcc 4.6. – ad absurdum May 17 '17 at 05:42