1

A friend of mine posted this problem on Facebook the other day, and for the life of me I can't figure it out. He's writing a client and server using the cubesat protocol. For some reason when he casts the data member of the protocol struct to a pointer, his data appears to be mangled.

Client code snippet:

uint32_t data[3] = { 1234U, 5678U, 9101U };
memcpy(packet->data32, data, sizeof(data));
packet->length = sizeof(data);
csp_send(connection, packet, 1000);

Server code snippet:

uint32_t *data = (uint32_t *)(packet->data32);
printf("Packet received on %i: %u\r\n", PORT, data[0]);
printf("Packet received on %i: %u\r\n", PORT, data[1]);
printf("Packet received on %i: %u\r\n", PORT, data[2]);
printf("Packet received on %i: %u, %u, %u\r\n", PORT, data[0], data[1], data[2]);

Output this code produces:

Packet received on 15: 2182284498
Packet received on 15: 5678
Packet received on 15: 9101
Packet received on 15: 80904723, 372113408, 596443136

Output a casual reader of this code would expect:

Packet received on 15: 1234
Packet received on 15: 5678
Packet received on 15: 9101
Packet received on 15: 1234, 5678, 9101

After some fiddling he's told me that he gets the correct output if he doesn't cast the data32 member of the struct to a uint32_t*.

From my own research, packet is of type csp_packet_t, which is defined as:

typedef struct __attribute__((__packed__)) {
        uint8_t padding[CSP_PADDING_BYTES];     // Interface dependent padding
        uint16_t length;                        // Length field must be just before CSP ID
        csp_id_t id;                            // CSP id must be just before data
        union {
                uint8_t data[0];                // This just points to the rest of the buffer, without a size indication.
                uint16_t data16[0];             // The data 16 and 32 types makes it easy to reference an integer (properly aligned)
                uint32_t data32[0];             // - without the compiler warning about strict aliasing rules.
        };
} csp_packet_t;

The full header file is here.

This is GNU C, so zero-length arrays are allowed.

I do not know the word size or endianness of the architecture on either side.

So, simply put - what's going on here? Why does the cast matter?

Ben Burns
  • 14,978
  • 4
  • 35
  • 56
  • Just clarify. If you just replace your original first line of the server with `uint32_t *data = packet->data32;` then it starts working?? – Adrian Ratnapala Mar 27 '14 at 09:44
  • No, he deletes that line altogether and replaces `data[...]` with `packet->data32[...]` which, now that I say that, makes me almost certain that this is because the union is misaligned. – Ben Burns Mar 27 '14 at 09:47
  • 1
    You mean that the compiler (and Standard C?) guarantees unaligned access to `struct->arr[k]` but not directly to pointer-indirects. That would be reasonable behaviour. – Adrian Ratnapala Mar 27 '14 at 09:52
  • Yes, exactly that -- see my own answer below. – Ben Burns Mar 27 '14 at 09:59
  • Standard C never guarantees unaligned access in any circumstance – M.M Jul 24 '14 at 05:21

3 Answers3

2

2182284498 is 0x821304D2 where 0x04d2 is 1234 and the rest possibly is the packet data. More, without knowing how csp_send and the corresponding receiver looks (ie: show more code) is not possible to tell.

And this line: memcpy(packet->data32, data, sizeof(data)); is actually a buffer overflow error... since you do not allocate enough bytes for packet->data32

Ferenc Deak
  • 34,348
  • 17
  • 99
  • 167
  • the idea of the zero-length arrays is that the user has already allocated extra memory. So if we assume he got that right, `sizeof(data)` is OK (also I assume `csp_send` want `packet->length` to be the size of the body and not body + header). But none of These guesses explains why getting rid of the typecast helps. – Adrian Ratnapala Mar 27 '14 at 09:42
  • I agree that this isn't a complete set of code, and that is annoying, but I think there's enough here to answer it. Regarding the buffer overflow, there's a proper malloc happening which is not shown. – Ben Burns Mar 27 '14 at 09:42
  • We have too many complete sets of code on this site. I am happy for the executive summary, we can always ask for clarification in comments. – Adrian Ratnapala Mar 27 '14 at 09:53
1

I think this is an alignment issue.

Quoting fritzone:

2182284498 is 0x821304D2 where 0x04d2 is 1234 and the rest possibly is the packet data

This is because the union is a member of a packed struct, and it's 16 bits out of alignment. When accessing from the union member in the packed struct, the compiler works some magic to guarantee that the packed (misaligned) data is retrieved correctly. However, when casting to a uint32_t* the compiler loses the packing detail, and I believe that it assumes it's accessing data which is properly aligned.

Ben Burns
  • 14,978
  • 4
  • 35
  • 56
  • Probably the address itself isn't word aligned, but the CPU fetch operation does something wicked. For exanoke x86 there might be a fetch `%ax` even though `%eax` ends up being used. Then if the high bytes somehow get cleared, the rest of the array appear to work (at least of the Contents is smaller than 2^16). – Adrian Ratnapala Mar 27 '14 at 11:38
  • I figured I'd leave my answer unaccepted for a couple of days just in case somebody comes along and wishes to write a more detailed explanation. I'm confident that it's correct, but I don't think saying "alignment issue" and "undefined behaviour" is as good as if someone can say in detail what the compiler/machine actually do in this circumstance. – Ben Burns Mar 28 '14 at 00:24
-1

Arrays cannot have size zero. This code is using some non-standard C to even make the union compile. So what happens internally in that union, nobody knows. It is undefined behavior, as far as the C language is concerned.

Since the program does "anything" when encountering undefined behavior, it works as expected.

Community
  • 1
  • 1
Lundin
  • 195,001
  • 40
  • 254
  • 396
  • I believe they can as of C99? – Ben Burns Mar 27 '14 at 09:31
  • @BenBurns No. C99 introduced a well-defined, flexible array member as last element of a struct/union, but they are not declared in this way. – Lundin Mar 27 '14 at 09:32
  • Apologies, I was thinking of a flexible array. http://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html – Ben Burns Mar 27 '14 at 09:32
  • I should mention that gcc is the compiler - I'll update the question and add a link to the full header file. – Ben Burns Mar 27 '14 at 09:33
  • @BenBurns [Some info](http://stackoverflow.com/questions/17344745/how-to-use-flexible-array-in-c-to-keep-several-values/17345123#17345123) regarding flexible array members and GCC non-standard extensions. And it seems very likely that this is the source of the oddities. – Lundin Mar 27 '14 at 09:33
  • @Lundin it's true he should probably Change his code, but sincee gcc does support zero-length arrays, this doesn't look like it is causing his results. – Adrian Ratnapala Mar 27 '14 at 09:36
  • @Lundin - I'm willing to buy that this is weird, but it's definitely not undefined behaviour with this compiler. – Ben Burns Mar 27 '14 at 09:38
  • Or to put it in more friendly and actionable terms - if you can edit to provide a brief example which isolates this and causes gcc to exhibit similar behaviour, I'll happily accept your answer. – Ben Burns Mar 27 '14 at 09:44