
Context:

I was reviewing some code that reads data from an I/O descriptor into a character buffer, performs some checks on it, and then uses part of the received buffer to populate a struct, when I suddenly wondered whether a strict aliasing rule violation could be involved.

Here is a simplified version

#define BFSZ 1024
struct Elt {
   int id;
   ...
};

unsigned char buffer[BFSZ];
int sz = read(fd, buffer, sizeof(buffer)); // error checking omitted for brevity

// search the beginning of struct data in the buffer, and process crc control
unsigned char *addr = locate_and_valid(buffer, sz);

struct Elt elt;

memcpy(&elt, addr, sizeof(elt)); // populates the struct

// and use it
int id = elt.id;
...

So far, so good. Provided the buffer contains a valid representation of the struct (say it was produced on the same platform, so there are no endianness or padding problems), the memcpy call has populated the struct and it can safely be used.

Problem:

If the struct is dynamically allocated, it has no declared type. Let us replace the last lines with:

struct Elt *elt = malloc(sizeof(struct Elt)); // no declared type here

memcpy(elt, addr, sizeof(*elt)); // populates the newly allocated memory and copies the effective type

// and use it
int id = elt->id;  // strict aliasing rule violation?
...

Draft n1570 for C language says in 6.5 Expressions §6

The effective type of an object for an access to its stored value is the declared type of the object, if any.87) If a value is stored into an object having no declared type through an lvalue having a type that is not a character type, then the type of the lvalue becomes the effective type of the object for that access and for subsequent accesses that do not modify the stored value. If a value is copied into an object having no declared type using memcpy or memmove, or is copied as an array of character type, then the effective type of the modified object for that access and for subsequent accesses that do not modify the value is the effective type of the object from which the value is copied, if it has one.

buffer does have an effective type and even a declared type: it is an array of unsigned char. That is the reason why the code uses a memcpy instead of a mere aliasing like:

struct Elt *elt = (struct Elt *) addr;

which would indeed be a strict aliasing rule violation (and could additionally come with alignment problems). But if memcpy has given the effective type of an unsigned char array to the zone pointed to by elt, everything is lost.

Question:

Does memcpy from an array of character type to an object with no declared type give it the effective type of an array of character type?

Disclaimer:

I know that it works without a warning with all common compilers. I just want to know whether my understanding of the standard is correct.


In order to better show my problem, let us consider a different structure Elt2 with sizeof(struct Elt2) <= sizeof(struct Elt), and

struct Elt2 actual_elt2 = {...};

For static or automatic storage, I cannot reuse object memory:

struct Elt elt;
struct Elt2 *elt2 = &elt;
memcpy(elt2, &actual_elt2, sizeof(*elt2));
elt2->member = ...           // strict aliasing violation!

While it is fine for dynamically allocated storage (question about it there):

struct Elt *elt = malloc(sizeof(*elt));
// use elt
...
struct Elt2 *elt2 = elt;
memcpy(elt2, &actual_elt2, sizeof(*elt2));
// ok, memory now has struct Elt2 effective type, and using elt would violate strict aliasing rule
elt2->member = ...;        // fine
elt->id = ...;             // strict aliasing rule violation!

What could make copying from a char array different?

Serge Ballesta
  • The difference between `struct Elt elt;` and `struct Elt *elt = malloc(sizeof *elt);` is only the location of the memory for the structure. That's about it. It doesn't matter how the memory for the structure was allocated, or where it is located, both structures are equally valid. – Some programmer dude Feb 01 '18 at 10:01
  • @Someprogrammerdude: That's what I had thought for decades, but the declared type vs. effective type question scares me... – Serge Ballesta Feb 01 '18 at 10:04
  • Your edit doesn't make any sense. `struct Elt2 *elt2 = elt;` is not valid C. Your original question didn't have any constraint-violating, wild pointer conversions. I have no idea what you are asking any longer. Voting to close as unclear. – Lundin Feb 01 '18 at 10:54
  • @Lundin: You are right, it is a different question. I asked it [there](https://stackoverflow.com/q/48561541/3545273) – Serge Ballesta Feb 01 '18 at 11:24

2 Answers

The code is fine, no strict aliasing violation. The pointed-at data has an effective type, so the bold cited text does not apply. What applies here is the part you left out, last sentence of 6.5/6:

For all other accesses to an object having no declared type, the effective type of the object is simply the type of the lvalue used for the access.

So the effective type of the pointed-at object becomes struct Elt. The pointer returned by malloc does indeed point to an object with no declared type, but as soon as you access it through the struct pointer, the effective type becomes that of the struct. Otherwise C programs would not be able to use malloc at all.

What makes the code safe is also that you are copying data into that struct. Had you instead just assigned a struct Elt* to point at the same memory location as addr, then you would have a strict aliasing violation and UB.
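
A minimal sketch of the copy-based approach described above. It assumes a hypothetical two-member layout for `struct Elt` (the question elides the real members), and `elt_from_bytes` is an illustrative helper name, not something from the original code. The bytes are copied into allocated storage and then read through a `struct Elt` lvalue, instead of pointing a `struct Elt *` into the character buffer:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical layout: the question elides the real members. */
struct Elt {
    int id;
    int value;
};

/* Copy sizeof(struct Elt) bytes starting at addr into freshly allocated
 * storage. Returns NULL if fewer than sizeof(struct Elt) bytes are
 * available or if allocation fails. The caller must free() the result. */
struct Elt *elt_from_bytes(const unsigned char *addr, size_t len)
{
    assert(addr != NULL);

    if (len < sizeof(struct Elt))
        return NULL;

    struct Elt *elt = malloc(sizeof *elt);
    if (elt == NULL)
        return NULL;

    /* Copy the object representation instead of aliasing the buffer
     * through a cast pointer such as (struct Elt *)addr, which could
     * also be misaligned for struct Elt. */
    memcpy(elt, addr, sizeof *elt);
    return elt;
}
```

With the names from the question, this would be called on the pointer returned by `locate_and_valid`, passing the number of bytes remaining in `buffer` from that position.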

Lundin
  • The problem, is that if I write `struct Elt2 elt2 = elt; memcpy(elt2, true_elt2_here);` where `Elt2` is a different struct that fits in the allocated memory (`sizeof(struct Elt2)` <= `sizeof(struct Elt)`), the zone will actually contain an Elt2 object, and the original `elt` will be lost. That is the reason for allocated memory having no declared type. My question is indeed: is memcpy from a char array different? – Serge Ballesta Feb 01 '18 at 10:31
  • @SergeBallesta Your question is missing something as `struct Elt2 elt2 = elt;` would be invalid initialization. Better if you add clarifications to your question. – user694733 Feb 01 '18 at 10:44
  • @user694733: I have just edited my question with it. – Serge Ballesta Feb 01 '18 at 10:47
  • @SergeBallesta That example doesn't make any sense. You cannot assign objects of different types. Object having no declared type means that you somehow have a void pointer, pointing at raw data. In general, pointer aliasing does not apply at all whenever you make hard copies of data. – Lundin Feb 01 '18 at 10:51
  • Because malloc() returns a pointer to memory that is suitably aligned for any type, the memcpy of the struct into that buffer is quite safe as long as you treat it as whatever struct type you copied from. This might not be true of some char buffer on the stack however. In any case, strict aliasing is not a concern here. – jwdonahue Feb 06 '18 at 23:45

Lundin's answer is correct; what you are doing is fine (so long as the data is aligned and of the same endianness).

I want to note this is not so much a result of the C language specification as it is a result of how the hardware works. As such, there's not a single authoritative answer. The C language specification defines how the language works, not how the language is compiled or implemented on different systems.

Here is an interesting article about memory alignment and strict aliasing on a SPARC versus Intel processor (notice the exact same C code performs differently, and gives errors on one platform while working on another): https://askldjd.com/2009/12/07/memory-alignment-problems/

Fundamentally, two identical structs, on the same system with the same endianness and memory alignment, must work via memcpy. If they didn't, the computer wouldn't be able to do much of anything.
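
To illustrate that claim, here is a small sketch, again assuming a hypothetical two-member `struct Elt`; `roundtrip_at_offset` is an illustrative name. It copies a struct into a byte buffer at an arbitrary offset, which may be misaligned for `struct Elt`, and copies it back out. Because only `memcpy` touches the buffer, the struct is never accessed at a misaligned address:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BFSZ 1024

/* Hypothetical layout: the question elides the real members. */
struct Elt {
    int id;
    int value;
};

/* Copy *in into a byte buffer at the given offset, then copy those
 * bytes back into *out. Returns 0 on success, or -1 if the struct
 * would not fit in the buffer at that offset. */
int roundtrip_at_offset(const struct Elt *in, size_t offset, struct Elt *out)
{
    unsigned char buffer[BFSZ];

    assert(in != NULL && out != NULL);

    if (offset > sizeof buffer - sizeof *in)
        return -1;

    memcpy(buffer + offset, in, sizeof *in);   /* struct -> bytes */
    memcpy(out, buffer + offset, sizeof *out); /* bytes -> struct */
    return 0;
}
```

This only holds when producer and consumer agree on the struct layout, as the answer stipulates; data coming from a different platform would need explicit decoding instead.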

Finally, the following question explains more about memory alignment on systems, and the answer by joshperry should help explain why this is a hardware issue, not a language issue: Purpose of memory alignment

cegfault