It's sometimes necessary to treat a data structure as a raw byte buffer so that the data can be sent over an interface, for example, or written out to some other stream. In these cases, I usually do something like this:

typedef struct {
  int    field1;
  char   field2;
} testStruct;

int main()
{
  char *buf;
  testStruct test;

  buf = (char *)&test;

  // write(buf, sizeof(test)) or whatever you need to do

  return 0;
}

Recently in some microprocessor code, however, I saw something similar to this:

typedef struct {
  int    field1;
  char   field2;
} testStruct;

int main()
{
  char buf[5];
  testStruct test;

  *(testStruct *)buf = test;

  // write(buf, sizeof(test)) or whatever you need to do

  return 0;
}

To me, the former feels a little safer. You just have one pointer, and you assign the address of the structure to the pointer.

In the latter case, it seems like if you allocate the wrong size to the array buf by accident, you'll end up with undefined behavior, or a segfault.

With optimizations on, I get a -Wstrict-aliasing warning from gcc. However, again, this code runs on a microprocessor, so is there something I might be missing there?

There are no pointers in the structures or anything like that; it's very straightforward.

Steve Summit
justynnuff
  • Two questions before yours there was this one: https://stackoverflow.com/questions/48571295/difference-between-memcpy-and-copy-by-assignment – Eugene Sh. Feb 01 '18 at 21:43
  • This question has nothing to do with `memcpy` or deep copies in general. It's more about the correct way to cast a data structure to a buffer. – justynnuff Feb 01 '18 at 21:45
  • If you read the question, answer and comments carefully, you will see that *there is no correct way to cast*. – Eugene Sh. Feb 01 '18 at 21:46
  • `(testStruct *)buf` may generate a mis-aligned address for a `testStruct` leading to a bus fault. Do not use. A `union` is better. – chux - Reinstate Monica Feb 01 '18 at 21:47
  • @EugeneSh. I don't think you understand the question, answer, or what you're even saying in relation to that other post. doing `buf = (char *)&testStruct;` is perfectly valid. Saying "there's no correct way to cast" is really an ignorant statement. – justynnuff Feb 01 '18 at 21:57
  • @chux If you can expand upon that in an answer, I can look into it and accept it. – justynnuff Feb 01 '18 at 21:58
  • `write` and friends use a `void *` as a buffer argument so no cast is needed. `write(fd, &test, sizeof(test))` is perfectly OK, as long as you accounted for platform differences. `*(testStruct *)buf = test;` wouldn't pass a code review. – user58697 Feb 01 '18 at 22:00
  • You are not asking about `buf = (char *)&testStruct` but about `*(testStruct *)buf = test;` which is a violation of strict aliasing rule, which might lead to undefined behavior. If you come here for answers and get some that you don't like, it is your problem and not of the answer. – Eugene Sh. Feb 01 '18 at 22:02
  • Besides running the reasonably grave risk of misaligned data accesses, `*(testStruct *)buf = test` is a bad idea (actually a pretty stupid idea, I'd say) because it needlessly copies data. If you have `test`, but you just want to temporarily treat it as a blob o' bytes, then the equivalent of `buf = (unsigned char *)&test` is just what you want. (Or if you really want to copy data, call `memcpy`. But a cast and a pretend struct assignment is just a bad idea, a holdover from the rough, roguish early days of C.) – Steve Summit Feb 01 '18 at 22:05

1 Answer

(testStruct *)buf may generate a mis-aligned address for a testStruct leading to a bus fault. Do not use.
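To make the alignment concern concrete, here is a minimal sketch (not part of the original answer) that compares the alignment requirements at compile time. It assumes a C11 compiler; the helper `is_aligned_for_testStruct` is a hypothetical name, and its pointer-to-integer check assumes a flat address space where `uintptr_t` is available. Note also that on a typical platform with a 4-byte `int`, padding is likely to make `sizeof(testStruct)` 8, so the `char buf[5]` in the question would probably be too small as well.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdalign.h> /* alignof, alignas (C11) */
#include <stdint.h>   /* uintptr_t (assumed available) */

typedef struct {
  int  field1;
  char field2;
} testStruct;

/* A char array may start at any address: char has alignment 1. */
static_assert(alignof(char) == 1, "char has alignment 1");

/* The struct is at least as strictly aligned as its int member. */
static_assert(alignof(testStruct) >= alignof(int),
              "struct alignment covers its int member");

/* Hypothetical helper: returns 1 if p is suitably aligned for a testStruct.
   Assumes converting a pointer to uintptr_t yields a plain address. */
static int is_aligned_for_testStruct(const void *p) {
  return ((uintptr_t)p % alignof(testStruct)) == 0;
}
```

A `char buf[N]` declared without `alignas(testStruct)` carries no guarantee that this check passes, which is why `(testStruct *)buf` can fault on targets that trap on misaligned accesses.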

A union is better. It helps cope with aliasing issues as well as alignment ones.

Also see @Steve Summit's good comment.
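The `memcpy` route mentioned in that comment can be sketched as follows. This is an illustrative example rather than code from the original post; the helper names `testStruct_to_bytes` and `testStruct_from_bytes` are hypothetical. Because `memcpy` works byte by byte, the buffer needs no particular alignment and no strict-aliasing rule is violated.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct {
  int  field1;
  char field2;
} testStruct;

/* Hypothetical helper: copy the object representation of *src into out.
   The caller must provide at least sizeof(testStruct) bytes. */
static void testStruct_to_bytes(const testStruct *src, unsigned char *out) {
  assert(src != NULL && out != NULL);
  memcpy(out, src, sizeof *src);
}

/* Hypothetical helper: rebuild a testStruct from bytes produced above. */
static void testStruct_from_bytes(testStruct *dst, const unsigned char *in) {
  assert(dst != NULL && in != NULL);
  memcpy(dst, in, sizeof *dst);
}
```

Sizing the buffer as `unsigned char buf[sizeof(testStruct)]` instead of a hard-coded length keeps it in step with the struct, padding included.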

Consider a master type like testStruct_all.

typedef struct {  // OP's structure
  int    field1;
  char   field2;
} testStruct1;

typedef struct {  // Perhaps another structure to send
  double field1;
  char   field2;
} testStruct2;

// A union of all possible structures used in this app
typedef union {
  testStruct1 tS1;
  testStruct2 tS2;
  char buf[1]; 
} testStruct_all;

int main(void) {
  testStruct_all ux; 
  foo(&ux.tS1);  // populate ux.tSn of choice.

  write(ux.buf, sizeof ux.tS1);

  read(ux.buf, sizeof ux.tS1);
  // the union ensures alignment and avoids aliasing (AA) issues
  bar(&ux.tS1);
  return 0;
}

As @user58697 notes, write() usually accepts a void *, so the code could drop the buf member and use:

  write(&ux, sizeof ux.tS1);  //  or whatever you need to do
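
For readers who want something they can compile, here is a self-contained variation on the union idea. It is a sketch under stated assumptions, not the code above: the byte member is sized to the largest struct instead of `char buf[1]`, and `fake_send` / `fake_recv` are hypothetical stand-ins for whatever transport (a `write()` call, a SPI register, and so on) is really in use.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct { int    field1; char field2; } testStruct1;
typedef struct { double field1; char field2; } testStruct2;

/* Union of every structure that may be sent, plus a byte view of it. */
typedef union {
  testStruct1 tS1;
  testStruct2 tS2;
  unsigned char bytes[sizeof(testStruct2) > sizeof(testStruct1)
                          ? sizeof(testStruct2)
                          : sizeof(testStruct1)];
} testStruct_all;

/* Hypothetical transport: an in-memory "wire" standing in for real I/O. */
static unsigned char fake_wire[sizeof(testStruct_all)];

static void fake_send(const void *data, size_t n) {
  assert(n <= sizeof fake_wire);
  memcpy(fake_wire, data, n);
}

static void fake_recv(void *data, size_t n) {
  assert(n <= sizeof fake_wire);
  memcpy(data, fake_wire, n);
}
```

A caller would populate `ux.tS1`, pass `ux.bytes` and `sizeof ux.tS1` to the send routine, and on the receiving side read into another union's `bytes` before accessing its `tS1` member.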
chux - Reinstate Monica
  • `sizeof(double) != sizeof(int)` on most systems. Did you mean something like `int64_t` and `double`? Or were you intentionally pointing out how this approach resolves references to `struct`s with different sizes? – Patrick Roberts Feb 01 '18 at 22:22
  • Ignore the write() call. It's a microprocessor, what if I assign the pointer to a SPI tx register and latch it out with interrupts or something? It was more about the deserialization/casting than the call it's going in to. – justynnuff Feb 01 '18 at 22:22
  • @PatrickRoberts Yes, the idea was to show how the `struct`s and `char[]` may differ in size and alignment needs, yet with a union all is aligned, and a union by-passes AA issues. – chux - Reinstate Monica Feb 01 '18 at 22:24
  • @justynnuff `(testStruct *)buf` can easily fail on a uP that requires an `int` on even boundaries, yet allows a `char[]` to exist on even/odd boundaries. – chux - Reinstate Monica Feb 01 '18 at 22:28
  • Your answer is a correct answer. But I keep thinking that whoever wrote this code wasn't that stupid. By doing `*(testStruct *)buf = test;` into a buf that has been allocated memory, I wonder if that was a way that they were doing memcpy on a uP/compiler that maybe didn't have memcpy at the time? – justynnuff Feb 01 '18 at 22:38
  • @justynnuff Code like `*(testStruct *)buf = test;` is _usually_ OK (and fast) for select conditions and older compilers vs. using `memcpy()`. Newer, amazingly sneaky compilers make very good code yet take advantage of UB. If code that "worked" in the past relied on a beneficial UB, that code is brittle going forward. New compilers often analyze standard function usage like `memcpy()` and replace it with optimal inline code. Gone are the days where loops like `for (int i=0; i>=0; i++);` can be assured to terminate. Avoid UB. – chux - Reinstate Monica Feb 01 '18 at 22:46