
In C, suppose I have a general-purpose pointer that is used to store all data. In this example, space for four integers is allocated for it.

void * gen_purpose_ptr = SOME_MEMORY_ADDRESS;
gen_purpose_ptr = malloc(4*sizeof(int));

And there are two structures that use this pointer to access their contents:

struct Struct_1 {
int data1, data2;
};

struct Struct_2 {
int data3, data4;
};

Now if I do

( (struct Struct_1 *)gen_purpose_ptr ) -> data1 = VALUE_A;
( (struct Struct_1 *)gen_purpose_ptr ) -> data2 = VALUE_B;
( (struct Struct_2 *)gen_purpose_ptr ) -> data3 = VALUE_C;
( (struct Struct_2 *)gen_purpose_ptr ) -> data4 = VALUE_D;

In this case, will VALUE_A, VALUE_B, VALUE_C, and VALUE_D all be stored properly, without overwriting each other?

When the cast and the assignment are performed, is the compiler aware that a specific memory location can only be accessed through a specific structure member?

Jens Gustedt
Allen Kuo
  • Please recap your C book's chapters about pointers, memory allocation and structures. It's not exactly clear what you want to accomplish, but that approach is definitely very wrong. – too honest for this site Aug 06 '18 at 19:35
  • No, your code will not work as you intend. – Christian Gibbons Aug 06 '18 at 19:36
  • `(struct Struct_1 *)gen_purpose_ptr` and `(struct Struct_2 *)gen_purpose_ptr` have the same address. Because of this `data1 == data3 == VALUE_C` and `data2 == data4 == VALUE_D`, which is not what you intend. To summarize, yes, data will be overwritten. – Fiddling Bits Aug 06 '18 at 19:42
  • I don't think the C specification guarantees that `offsetof(struct Struct_1, data2) == offsetof(struct Struct_2, data4)`. Each structure may have different padding, even though they have the same members (I wonder if such a compiler exists). However, we know that `offsetof(struct Struct_1, data1) == offsetof(struct Struct_2, data3) == 0` because the address of the first member is equal to the address of the struct. – KamilCuk Aug 06 '18 at 20:13
  • C lacks a truly general purpose pointer. A `void*` may be insufficient to encode a function pointer and vice-versa. – chux - Reinstate Monica Aug 06 '18 at 20:14
  • @KamilCuk You are right; however, since FiddlingBits is correct as well, this does not matter much: even when only looking at `data1` and `data3`, the two values will overwrite each other in the example. By the way, you cannot safely cast a pointer to a `struct A` to a pointer to `struct B` because this violates the strict aliasing rule ( https://stackoverflow.com/questions/98650/what-is-the-strict-aliasing-rule ) for pointers imposed by the standard. IMHO. – Michael Beer Aug 06 '18 at 20:19
  • Does this really violate strict aliasing? It's a void pointer and with luck `4 * sizeof(int) > sizeof(struct Struct_1)` and `4 * sizeof(int) >= sizeof(struct Struct_2)`. Does reusing the same memory for a different struct violate strict aliasing? It's not a pointer to `struct Struct_1`, it's a void pointer to some allocated storage. – KamilCuk Aug 06 '18 at 20:23
  • @KamilCuk, no this does not violate the effective type rule. Allocated storage (through `malloc`) has the effective type of the last write operation. Basically you are allowed to reuse a `malloc`ed region for a different purpose. But obviously the values will be overwritten. – Jens Gustedt Aug 06 '18 at 20:41
  • @KamilCuk You are right, it does not - now I am getting lazy in reading...sorry – Michael Beer Aug 06 '18 at 20:49
  • @KamilCuk So just to get this clear, if we reuse memory and not reinterpret it, it's not violating the strict aliasing rule? – Shahe Ansar Aug 06 '18 at 21:00
  • @JensGustedt: It violates the interpretation of the Effective Type rule used by gcc and clang. While one may debate whether that interpretation is conforming, I think it would be more helpful to treat such things as a Quality of Implementation issue. An implementation that's tailored for high-end number crunching may be totally unsuitable for systems programming; making it suitable for systems programming might make it less suitable for high-end number crunching. – supercat Aug 06 '18 at 21:56
  • @ShaheAnsar: Neither gcc nor clang can correctly handle all cases where storage gets reused within its lifetime. While it's vaguely conceivable that they might someday fix all aliasing bugs, I wouldn't hold my breath. Recognizing that a pointer or lvalue of one type which is freshly derived from another may be used to access the same storage as the other in the context where such derivation is visible, or in contexts where only the derived pointer is used, would eliminate the need for the "Effective Type" rules and all the tricky and unworkable corner cases associated therewith. – supercat Aug 06 '18 at 22:03

2 Answers


Let's see for ourselves!

#include <stdio.h>
#include <stdlib.h>
#include <assert.h>
#include <string.h>

void printhex(const void *pnt0, size_t s)
{
    const unsigned char *pnt = pnt0;
    while(s--) {
        printf("%02x", *pnt++);
    }
}

void printhexln(const char *pre, const void *pnt, size_t s, const char *post)
{
    printf("%s", pre);
    printhex(pnt, s);
    printf("%s", post);
    printf("\n");
}

struct Struct_1 {
    unsigned int data1, data2;
};

struct Struct_2 {
    unsigned int data3, data4;
};

int main(void)
{
    // let's grab memory for 4 ints
    size_t size = 4 * sizeof(int);
    void * ptr = malloc(size);
    assert(ptr != NULL);
    // let's zero that memory
    memset(ptr, 0, size);
    // this will print zeros
    printhexln("1: ", ptr, size, "");

    ( (struct Struct_1 *)ptr ) -> data1 = 0x1122;
    printhexln("2: ", ptr, size, "");
    ( (struct Struct_1 *)ptr ) -> data2 = 0x3344;
    printhexln("3: ", ptr, size, "");
    ( (struct Struct_2 *)ptr ) -> data3 = 0x5566;
    printhexln("4: ", ptr, size, "");
    ( (struct Struct_2 *)ptr ) -> data4 = 0x7788;
    printhexln("5: ", ptr, size, "");

    free(ptr);
    return 0;
}

On https://www.onlinegdb.com this program prints:

1: 00000000000000000000000000000000
2: 22110000000000000000000000000000
3: 22110000443300000000000000000000
4: 66550000443300000000000000000000
5: 66550000887700000000000000000000

printhex is a simple function that prints the memory behind a pointer as hexadecimal characters. We can see that:

  1. At first there are only zeros. Counting them (32 hex digits) shows that size = 16, so sizeof(int) = 4.
  2. We then cast the pointer to struct Struct_1 * and set data1 to 0x1122. The first 4 bytes of the allocation are overwritten with 22 11 00 00, because the machine is little endian. The value of ((struct Struct_1 *)ptr)->data1 is now 0x00001122.
  3. Writing 0x3344 to ((struct Struct_1 *)ptr)->data2 sets bytes 5 and 6 to 44 33 (bytes 7 and 8 are written as 00). The machine is little endian, sizeof(int) = 4, and we can see that offsetof(struct Struct_1, data2) = 4.
  4. Casting the pointer to struct Struct_2 * and writing to data3 overwrites the first 4 bytes, regardless of their previous contents. That's because offsetof(struct Struct_2, data3) = 0, so data3 starts at the beginning of the allocation.
  5. Likewise, 0x3344 is overwritten by 0x7788 when writing to the data4 member of Struct_2, which on this machine sits at the same offset as data2.
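
If the goal is to keep all four values, each struct needs its own region of the allocation. Here is a minimal sketch, assuming a hypothetical wrapper type struct Both and a hypothetical helper make_both (neither appears in the question), that lets the compiler place the two structs one after the other instead of on top of each other:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct Struct_1 { unsigned int data1, data2; };
struct Struct_2 { unsigned int data3, data4; };

/* Hypothetical wrapper, not part of the question:
   each struct gets its own, non-overlapping region. */
struct Both {
    struct Struct_1 s1;
    struct Struct_2 s2;
};

/* Hypothetical helper: allocates one block that holds both structs and
   stores all four values. Returns NULL if malloc fails; the caller must
   free the result. */
struct Both *make_both(unsigned int a, unsigned int b,
                       unsigned int c, unsigned int d)
{
    struct Both *both = malloc(sizeof *both);
    if (both == NULL)
        return NULL;

    /* s2 starts after s1 (plus any padding), so nothing overlaps. */
    assert(offsetof(struct Both, s2) >= sizeof(struct Struct_1));

    both->s1.data1 = a;
    both->s1.data2 = b;
    both->s2.data3 = c;
    both->s2.data4 = d;
    return both;
}
```

Calling make_both(0x1122, 0x3344, 0x5566, 0x7788) and then reading the members back should give the four values unchanged, because data3 and data4 no longer share offsets 0 and 4 with data1 and data2.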
KamilCuk

This question does not even arise if you follow a simple rule of style: never put a type cast on an lvalue. Put all casts on the right-hand side of assignments. Also, C compilers, in practice, will unpredictably generate buggy code if you type cast lvalues. C structures have alignment and padding properties associated with them, and this tends to break when objects and libraries are linked. "Objects" in this context refers to the output of a compiler (object files), not objects in the OOP sense.
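
One way to read this advice, sketched here under the assumption of a hypothetical helper store_pair (not part of the question): convert the void * to a typed pointer once, on the right-hand side of an initialization, and then use only the typed pointer. In C a void * converts implicitly to any object pointer type, so no cast is needed at all.

```c
#include <assert.h>
#include <stdlib.h>

struct Struct_1 { int data1, data2; };

/* Hypothetical helper, not from the question: stores two values through a
   typed pointer obtained once from the general-purpose pointer. The caller
   must pass storage of at least sizeof(struct Struct_1) bytes. */
void store_pair(void *gen_purpose_ptr, int a, int b)
{
    /* Conversion happens on the right-hand side; the left-hand sides
       below carry no casts. */
    struct Struct_1 *s1 = gen_purpose_ptr;
    assert(s1 != NULL);

    s1->data1 = a;
    s1->data2 = b;
}
```

This only tidies up the syntax. Writing through a struct Struct_2 * obtained from the same address would still overwrite the same bytes.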

sys101