-2

My curiosity was sparked by code like this:

struct tree
{
    unsigned char apple, leaf;
};

int main(void)
{
    void* arr[2] = {(int*)1, (int*)2};
    struct tree* myStruct = (struct tree*)arr;

    return 1;
}

...which logically tries to convert an array to a structure and throws no warnings.

Is that the way to convert an array to a structure?

Corelation

3 Answers

4

Which logically tries to convert an array to a structure and throws no warnings.

There is nothing logical about that code. More accurately, the code nonsensically tries to convert an array to a structure. However, C allows a lot of nonsense, and when you invoke undefined behavior, anything can happen.

This is what happens behind the scenes:

  • void* arr[2] is an array of two pointers, each of which is typically as wide as an address on the given system. Let us assume that pointers are 32 bits.
  • {(int*)1, (int*)2}; then takes two integer literals and casts each of them to a pointer. C allows this, although the result is implementation-defined. So we have two int pointers with addresses 0x00000001 and 0x00000002 respectively.
  • The int pointers are then stored in the void pointer array without problems, since no explicit cast is needed between a void pointer and a pointer to any object type.
  • Then (struct tree*)arr wildly casts the array to a pointer to a struct type. Accessing the array through that pointer breaks the so-called "strict aliasing rule", which is undefined behavior. Here your program may very well crash and burn, because there are several potential problems.
  • The alignment of the data members of the struct is not necessarily compatible with the alignment of the pointer variables, and there may be padding bytes in the struct. Furthermore, nothing says that the struct is smaller than the 2 pointers. If it is larger, the program will access memory out of bounds as it reads past the end of the array.
  • Also, the representation of char on the given system may be anything. It may be 8 bits, it may be 16 bits, and it may or may not come with a sign bit.
  • With some luck, the undefined behavior on your given machine could turn out like this: Let's assume characters are 8 bits and unsigned, and again that pointers are 32 bits. Let's assume there are no padding or alignment issues. Let's assume that the program doesn't crash and burn when you execute the code. In that case, the struct would take the first 2 bytes of the pointer array and treat them as data. On a big-endian machine the array contains the data 0x0000000100000002, so the first two bytes are 0x00 and 0x00, and apple and leaf would contain the values 0 and 0 respectively. On a little-endian machine the first two bytes are 0x01 and 0x00 instead, so apple would be 1 and leaf would be 0. Which byte of the pointer address goes where depends on the endianness of your machine.

Is that the way to convert an array to a structure?

No, this is very bad code. There is nothing correct about it. Never write code like this.


So what is the correct way of converting an array to a struct?

Simply do:

unsigned char arr[2] = {1, 2};
struct tree myStruct;
myStruct.apple = arr[0];
myStruct.leaf  = arr[1];

That's the only 100% bullet-proof way. If you wish to use memcpy() or similar to reduce the number of manual assignments, you have to program defensively to protect yourself against struct padding:

static_assert(sizeof(arr) == sizeof(myStruct), "ERR: struct padding detected");

memcpy(&myStruct, arr, sizeof(myStruct));
Lundin
  • Can't I use `#pragma pack` to avoid padding? Also.. I am trying to return an array from a function, that could be then assigned to both - `struct` and `array` type variables. – Corelation Nov 13 '14 at 14:44
  • @Corelation Yes `#pragma pack` will work but it is not standard. `static_assert` is standard and therefore 100% portable, though you might need a compiler with C11 support. – Lundin Nov 13 '14 at 15:15
1

You defined the variable arr as an array of void *. In C, a void pointer is used as a generic pointer type, and you can explicitly cast it to any other object pointer type you want. That's why there is no warning.

However, keep in mind that a pointer on a modern 32/64 bit system has a size of 4 or 8 bytes. Your struct tree is smaller than the array arr.

I think you want something like this (works on 32 bit systems):

#include <stdio.h>

struct tree
{
    // use int instead of char because char is only 1 byte
    // if 32 bit system
    unsigned int apple, leaf;
    // if 64 bit system
    // unsigned long apple, leaf;
};

int main(void)
{
    void* arr[2] = {(int*)1, (int*)2};
    struct tree* myStruct = (struct tree*)arr;
    // if 32 bit system
    printf("%u\n", myStruct->apple);
    printf("%u\n", myStruct->leaf);
    // if 64 bit system
    //printf("%lu\n", myStruct->apple);
    //printf("%lu\n", myStruct->leaf);
    return 1;
}

This example does not work on every system and has undefined behaviour because of the struct member alignment. See Lundin's answer for more information.

code monkey
  • -1 for "works on 32 bit systems". This is completely undefined behavior and there are no guarantees it will work on any system. – Lundin Nov 13 '14 at 14:25
  • What's your problem? That's why I wrote that the code is platform dependent. If you want to make it totally platform independent, you have to use the preprocessor and check if `sizeof(void*)` is 2, 4, 8 or whatever. But that was not the question of the author and would bloat the answer. It's not undefined behaviour. – code monkey Nov 13 '14 at 14:32
  • Still the compiler is free to toss in any number of padding bytes into the struct. And the size of a pointer is not necessarily the size of an int, no matter how large the pointer is. This is undefined behavior, and not just implementation-defined behavior. – Lundin Nov 13 '14 at 14:37
  • `"Still the compiler is free to toss in any number of padding bytes into the struct"`. That's true and I have overlooked it. However I never wrote that a pointer has a size of an `int`... – code monkey Nov 13 '14 at 14:46
0
#include <stdio.h>

struct tree
{
    char apple, leaf;
};

int main(void)
{
    // an array of pointers is ugly here: use char or unsigned char instead,
    // and make the array size "sizeof(struct tree)"
    // because the struct may contain padding
    char arr[sizeof(struct tree)] = {'1', '2'};
    struct tree* myStruct = (struct tree*)arr;

    printf("%p\n", (void*)myStruct);
    printf("%c\n", myStruct->apple);
    printf("%c\n", myStruct->leaf);
    return 1;
}

It works fine.

Ivan Ivanovich