
Good morning. I am trying to come up with a data structure that can be used in different applications, yet passed to a transmit function as the same type. I am using NetBeans at the moment, but this will be transferred to a dsPIC30F (16-bit).

typedef union {

    union {

        struct {
            unsigned bit0 : 1;
            unsigned bit1 : 1;
            unsigned bit2 : 1;
            unsigned bit3 : 1;
            unsigned bit4 : 1;
            unsigned bit5 : 1;
            unsigned bit6 : 1;
            unsigned bit7 : 1;

            unsigned bit8 : 1;
            unsigned bit9 : 1;
            unsigned bit10 : 1;
            unsigned bit11 : 1;

            union {

                struct {
                    unsigned bit12 : 1;
                    unsigned bit13 : 1;
                    unsigned bit14 : 1;
                    unsigned bit15 : 1;
                };
                unsigned char value;
            } lastfour;

        };
        unsigned int value : 16;
    };

    union {

        struct {

            union {

                struct {
                    unsigned bit0 : 1;
                    unsigned bit1 : 1;
                    unsigned bit2 : 1;
                    unsigned bit3 : 1;
                };
                unsigned char value;
            } firstfour;

            unsigned bit4 : 1;
            unsigned bit5 : 1;
            unsigned bit6 : 1;
            unsigned bit7 : 1;

            unsigned bit8 : 1;
            unsigned bit9 : 1;
            unsigned bit10 : 1;
            unsigned bit11 : 1;
            unsigned bit12 : 1;
            unsigned bit13 : 1;
            unsigned bit14 : 1;
            unsigned bit15 : 1;


        };
        unsigned int value : 16;
    };

} foo;

I then use the following code to check the functionality.

int main(int argc, char** argv) {

     foo a;
     a.value =0;
     a.lastfour.value = 0xF;

     printf("%d", a.value);

     return (EXIT_SUCCESS);
 }

The printed value is 0. However, because of the union, I am under the impression that the two structures share the same memory (16 bits), so after setting 'lastfour' to 0xF, 'value' should now be 0xF000.

Could anyone give some guidance on what I am doing wrong, and why 'value' is not reading the same memory that contains 'lastfour'?

2 Answers


First, I'm surprised this even compiles for you. You have two anonymous unions in your foo type, and they have duplicate member names (bit4, bit5, etc.). Your code didn't compile for me. You should provide names for the two unions or rename the bits so they don't conflict.

Second, your unions firstfour and lastfour will likely end up being 8 bits, not 4, since the minimum size of a char is 8 bits. That's going to throw off all your other bits.

Third, lastfour will not start at bit 12 in memory, and the bits declared after firstfour will not start at bit 4. A union is not a bit-field, so it is aligned as necessary for your processor, likely at the next 2-byte or 4-byte offset. Try printing sizeof(foo) in your function. You will most likely see something like 4 or 8, not the 2 you're expecting.

Fourth, that larger size is why you're seeing the value "0" printed in your test code. The first 16 bits are all zero. The 0xF you set is either in the next 16 bits or possibly in the next 32 bits, depending how your compiler aligned things.

Here is a structure layout that should work for what you're trying to do. I tested it and it works for me. It packs everything into 2 bytes.

#include <stdint.h>  /* for uint16_t */

typedef struct {
    union {
        struct {
            uint16_t firstfour  : 4;
            uint16_t secondfour : 4;
            uint16_t thirdfour  : 4;
            uint16_t lastfour   : 4;
        };
        /* EDIT - Duplicate structure with different member names
           added, in response to a comment below. */
        struct {
            uint16_t nibble1    : 4;
            uint16_t nibble2    : 4;
            uint16_t nibble3    : 4;
            uint16_t nibble4    : 4;
        };
        struct {
            uint16_t bit0  : 1;
            uint16_t bit1  : 1;
            uint16_t bit2  : 1;
            uint16_t bit3  : 1;
            uint16_t bit4  : 1;
            uint16_t bit5  : 1;
            uint16_t bit6  : 1;
            uint16_t bit7  : 1;
            uint16_t bit8  : 1;
            uint16_t bit9  : 1;
            uint16_t bit10 : 1;
            uint16_t bit11 : 1;
            uint16_t bit12 : 1;
            uint16_t bit13 : 1;
            uint16_t bit14 : 1;
            uint16_t bit15 : 1;
        };
        uint16_t value;
    };
} foo;
Andrew Cottrell
  • Thanks, some really useful stuff there. However, what if I had different declarations for firstfour, secondfour, etc. (e.g. nibble1, nibble2, ...) that I still wanted to access the same memory space? – Chef Dan Burns Mar 24 '15 at 09:37
  • Not exactly sure what you're asking, but if you mean you want to access "firstfour" with the name "nibble1" you could add another structure within the union and use those names. Basically copy-paste the first structure and rename the fields. – Andrew Cottrell Mar 24 '15 at 18:15
  • Edited my answer to add the nibble* struct, which is what I believe you're asking for. – Andrew Cottrell Mar 24 '15 at 18:23
  • Brilliant, thank you. That is the conclusion I came to yesterday. I was about to update my post: as long as each structure is processor-word bits long, it'll align correctly and all will be good. – Chef Dan Burns Mar 26 '15 at 08:48
  • I've been corrected: instead of processor-word bits it should be byte, I believe. – Chef Dan Burns Mar 26 '15 at 11:09

It is implementation defined (depends upon the size of int-s, the processor, the endianness, the ABI, etc...). It certainly would be different on an Android tablet with an ARM processor and on an x86-64 desktop running some 64 bits flavor of Linux (distribution).

I believe you should avoid union-s with bitfields in struct unless you are thinking of a particular implementation.

I am not sure that your code enables you to call arbitrary functions (in particular, because pointers might have different sizes than int-s; you might want to use intptr_t), but this has not much in common with your code.

If you want to be able to call an arbitrary function of arbitrary signature consider using some library like libFFI (which is of course implementation specific).

Notice that bitfields are implementation specific and are not very efficient (in terms of access time). For software running on desktops or laptops they are almost always useless. They are more useful in implementation specific low-level embedded code (e.g. the microcontroller inside your washing machine), and then you should know what your implementation (including your compiler) is precisely doing.

BTW, your code is wrong since lastfour contains a char (generally an 8-bit byte), so it cannot take the same place as a 4-bit bitfield (bit12 ... bit15); maybe you should replace unsigned char value; in firstfour and lastfour with unsigned valfourbits : 4; etc...

To pass some dynamically typed data to some function, you might want to have some tagged union. The Glib GVariant type is a real-world example (and you might dive into the source code).

If you know a bit of assembler, you could try to look into the assembler code generated by your compiler. If compiling with GCC, try to compile your program using gcc -Wall -fverbose-asm -O -S your-main.c then look (with an editor or pager) into the generated assembler code in your-main.s

Notice that (assuming you don't use the register keyword, which has become obsolete) every piece of data (variable, aggregate field, or array component) is addressable (you can use the address-of unary prefix & operator) and in practice may sit in memory, as consecutive bytes, with some alignment constraint. However, bitfields (and register variables) are an exception. They are not addressable, and bitfields usually sit inside some addressable memory zone, in an implementation-specific way.

The general rule of thumb is to avoid bitfields. That rule has exceptions, but you should first learn a lot more about C.

Basile Starynkevitch
  • Sorry, I forgot to add this: it's for a 16-bit PIC. As I understand it, the ':4' defines the length of the char. Essentially I have function(dataPacket* a) { transmit(a->value); } but I want to fill the packet with different values used in different bit chunks. – Chef Dan Burns Mar 20 '15 at 12:02
  • On almost every 16-bit microcontroller I have heard about in this century, a `char` is an 8-bit byte (I never heard of 4-bit `char`-s). And please edit your question to improve it! – Basile Starynkevitch Mar 20 '15 at 12:05
  • I found [this](http://stackoverflow.com/questions/3305933/use-of-the-operator-in-c) to explain the 4 bit char – Chef Dan Burns Mar 20 '15 at 12:09
  • :4 defines a bitfield. It *doesn't* define the length of a char! – Brian Sidebotham Mar 20 '15 at 12:49
  • Sorry, I had been pulled away from this for a few days. My understanding is that the bit field is the length of memory allocated to the variable. Is that not correct? – Chef Dan Burns Mar 24 '15 at 08:50
  • It is not correct. In practice, every piece of data which is not a bitfield (or a `register` variable in C) is addressable and occupies some consecutive zone of bytes with some alignment constraint. A bitfield does not occupy a byte zone (but is *part* of some addressable memory zone) and is not addressable. – Basile Starynkevitch Mar 24 '15 at 08:52
  • I see, so no matter what I do a structure will take up 1 processor word worth of bits. – Chef Dan Burns Mar 25 '15 at 08:42
  • @ChefDanBurns: no, a `struct` can be smaller than a word, e.g. `struct short_st { char c; char d; }` is 2 bytes on my Linux/x86-64 machine (where a word is 8 bytes). – Basile Starynkevitch Mar 25 '15 at 08:44
  • But it will essentially use the same memory space as a word? So if you access it elsewhere without pre-defining the empty space, you'll look at the null space too, which will not be what you expect. I think I've got it, thank you. Please say if what I have just said is not correct. – Chef Dan Burns Mar 26 '15 at 08:54
  • @ChefDanBurns: you are wrong, since what you said is not true on every machine/OS. On my Linux/x86-64 a *machine word* is 64 bits i.e. 8 bytes. But I can have some `struct` with only 2 bytes (like `struct short_st { char c; char d; };`). Memory is *byte addressable* but `struct` and any other data have *alignment constraints* – Basile Starynkevitch Mar 26 '15 at 08:56
  • OK, so if you exchange 'word' with 'byte' where I use it, then it is true? I think I'm getting confused as I am using a 16-bit processor. – Chef Dan Burns Mar 26 '15 at 11:08
  • It is implementation specific. Some weird 16-bit processors are not able to address bytes, but only 16-bit words. Probably their `char` is 16 bits also. – Basile Starynkevitch Mar 26 '15 at 11:09