C99 -- specifically section 6.2.6.1, paragraph 4 -- states that copying an object representation into an array of unsigned char is allowed:

struct {
    int foo;
    double bar;
} baz;
unsigned char bytes[sizeof baz];

// Do things with the baz structure.

memcpy(bytes, &baz, sizeof bytes);

// Do things with the bytes array.

My question: can we not avoid the extra memory allocation and copy operation by simply casting? For example:

struct {
    int foo;
    double bar;
} baz;
unsigned char *bytes = (void *)&baz;

// Do stuff with the baz structure.

// Do things with the bytes array.

Of course, the size would need to be tracked, but is this legal in the first place, or does it fall into the realm of implementation-defined or undefined behavior?

I ask because I'm implementing an algorithm similar to qsort, and I'd like it to work for any array regardless of type, just as qsort does.

Flexo
  • AFAIK, `char` aliases everything, so this should be OK. – EOF Jul 13 '14 at 21:13
  • It depends. Usually you would be fine, but I know there's a flag for GCC which causes better optimisation by assuming that pointers to different types cannot be the same memory space. With casting like this, you wouldn't be able to take advantage of that. Unless/until this is a performance bottleneck, I'd avoid it. – Dave Jul 13 '14 at 21:14
  • @Dave -fstrict-aliasing? Doesn't hurt `char`. – EOF Jul 13 '14 at 21:15
  • The other way around would be bad. This way is OK. – Deduplicator Jul 13 '14 at 21:16
  • @EOF that's the one I was thinking of. So it's OK for char then? I can't find any documentation to check one way or the other... – Dave Jul 13 '14 at 21:18
  • Only if the underlying object is either anonymous memory or the `struct` type is compatible with its declared type. – Deduplicator Jul 13 '14 at 21:27

1 Answer

6.5 Expressions

[...]
6 The effective type of an object for an access to its stored value is the declared type of the object, if any.87) If a value is stored into an object having no declared type through an lvalue having a type that is not a character type, then the type of the lvalue becomes the effective type of the object for that access and for subsequent accesses that do not modify the stored value. If a value is copied into an object having no declared type using memcpy or memmove, or is copied as an array of character type, then the effective type of the modified object for that access and for subsequent accesses that do not modify the value is the effective type of the object from which the value is copied, if it has one. For all other accesses to an object having no declared type, the effective type of the object is simply the type of the lvalue used for the access.
7 An object shall have its stored value accessed only by an lvalue expression that has one of the following types:88)

  • a type compatible with the effective type of the object,
  • a qualified version of a type compatible with the effective type of the object,
  • a type that is the signed or unsigned type corresponding to the effective type of the object,
  • a type that is the signed or unsigned type corresponding to a qualified version of the effective type of the object,
  • an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union), or
  • a character type.

Emphasis mine. The last item, "a character type", is the one that matters here: you are OK accessing an object of any type as an array of characters (`unsigned char[]`, `char[]` or `signed char[]`).

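I also quoted paragraph 6 because it explains why the reverse does not hold: an object declared as a character array keeps that declared type as its effective type, so it may not be accessed through an lvalue of a struct type.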

Deduplicator
  • In other words, so long as I can guarantee the data in a buffer of any character type matches exactly with the object representation of the given type I'm converting from/to, no copy operation is needed. Thank you for pointing out the relevant segments of the standard; you really clarified this point for me. –  Jul 13 '14 at 21:33
  • Not quite. If you put your data into an array with a declared type of `char[]`, you cannot access it as if it were a struct. But if you used anonymous memory, you can. – Deduplicator Jul 13 '14 at 21:39
  • Anonymous memory would be memory allocated via `malloc` or another such function, right? So `struct_ptr = (void *)char_buffer;` isn't right because the _effective type_ of `char_buffer` is a character type, and that isn't a type that is compatible with the structure type. Or did I miss something? –  Jul 13 '14 at 21:56
  • @ChronoKitsune: That's the ticket, yes. Though if you use anonymous memory, beware of misalignment as well (the start of any block returned by `malloc` and the like is properly aligned for any type). – Deduplicator Jul 13 '14 at 21:58