0

So I'm working on creating a ICMPv4 echo request and decided to roll my own struct to hold the packet. To make identifying the packet easy to identify in wireshark, I decided to put abcde into the data field.

struct icmpPacket{
    u_int8_t icmp_type:8, icmp_code:8;
    u_int16_t icmp_checksum:16, icmp_id:16, icmp_seqnum:16;
    char icmp_data[6]; //cheat a little bit, set the field just large enough to store "abcde";
    } __attribute__((aligned (16))) icmppckt; // icmp has an 8 byte header + 6 bytes of data

What I'm getting stuck on is how to make the compiler read the struct out as a series of 16 bit word

songyuanyao
  • 169,198
  • 16
  • 310
  • 405
Wusiji
  • 599
  • 1
  • 7
  • 24
  • _"how to make the compiler read the struct out as a series of 16 bit word"_. What do you mean? You may use a `union` but I don't know if it's what you mean. – Adriano Repetti Jul 10 '14 at 08:49
  • I'd like to pass the struct into a function that accepts u_int16_t* as parameter – Wusiji Jul 10 '14 at 08:50
  • You may use a `union` to have two _views_ of that struct. Otherwise a simple `reinterpret_cast(&yourStruct)` (in C++ or `((u_int16_t*)&yourStruct)` in C) will do its job. – Adriano Repetti Jul 10 '14 at 08:53
  • 2
    @AdrianoRepetti Using the result of a `reinterpret_cast(&yourStruct)` to access the object is undefined behavior as it breaks strict aliasing rules. – T.C. Jul 10 '14 at 09:18
  • @AdrianoRepetti The precondition is "If a standard-layout union contains several standard-layout structs that share a common initial sequence". An array isn't even a standard-layout struct. – T.C. Jul 10 '14 at 13:44
  • @T.C. an array will share as-is struct layout (if to be initialized first is array). structs can't be shared if their members doesn't match... – Adriano Repetti Jul 10 '14 at 13:49
  • @AdrianoRepetti The only the C++ standard guarantees w/r/t accessing via nonactive member of unions is that if you have 1) a standard-layout union that 2) has several *standard-layout structs* as members, who 3) share a common initial sequence, and 4) the union current contains one of those structs, then 5) it is permitted to inspect the common initial part of any of them. An array is not a struct, much less a standard-layout struct, and therefore this provision doesn't apply. – T.C. Jul 10 '14 at 13:54
  • @T.C. no, in C according to footnote **78a** of **§6.5.2.3#3**: _"If the member used to access the contents of a union object is not the same as the member last used to store a value in the object, the appropriate part of the object representation of the value is reinterpreted as an object representation in the new type as described in 6.2.6 (a process sometimes called "type punning"). This might be a trap representation."_. In C++ it's still undefined (question in this case should be tagged C **or** C++). – Adriano Repetti Jul 10 '14 at 14:40

2 Answers2

1

The standard-compliant way to do this is via memcpy:

icmpPacket packet = { /* ... */ };
uint16_t buf[sizeof(icmpPacket) / sizeof(uint16_t)];
memcpy(buf, &packet, sizeof(icmpPacket));
/* Now use buf */

Modern compilers are clever enough to optimize this appropriately, without actually doing a function call. See examples with clang and g++).

A common compiler extension allows you to use unions, though this is undefined behavior under the C++ standard:

union packet_view{
    icmpPacket packet;
    uint16_t buf[sizeof(icmpPacket) / sizeof(uint16_t)];
};
icmpPacket packet = { /* ... */ };
packet_view view;
view.packet = packet;
/* Now read from view.buf. This is technically UB in C++ but most compilers define it. */

Using a reinterpret_cast<uint16_t*>(&packet) or its C equivalent would break strict aliasing rules and result in undefined behavior. §3.10 [basic.lval]/p10 of the C++ standard:

If a program attempts to access the stored value of an object through a glvalue of other than one of the following types the behavior is undefined:

  • the dynamic type of the object,
  • a cv-qualified version of the dynamic type of the object,
  • a type similar (as defined in 4.4) to the dynamic type of the object,
  • a type that is the signed or unsigned type corresponding to the dynamic type of the object,
  • a type that is the signed or unsigned type corresponding to a cv-qualified version of the dynamic type of the object,
  • an aggregate or union type that includes one of the aforementioned types among its elements or nonstatic data members (including, recursively, an element or non-static data member of a subaggregate or contained union),
  • a type that is a (possibly cv-qualified) base class type of the dynamic type of the object,
  • a char or unsigned char type.

Similarly, §6.5/p7 of C11 says:

An object shall have its stored value accessed only by an lvalue expression that has one of the following types:

  • a type compatible with the effective type of the object,
  • a qualified version of a type compatible with the effective type of the object,
  • a type that is the signed or unsigned type corresponding to the effective type of the object,
  • a type that is the signed or unsigned type corresponding to a qualified version of the effective type of the object,
  • an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union), or
  • a character type.
Community
  • 1
  • 1
T.C.
  • 133,968
  • 17
  • 288
  • 421
  • -1 casting is not undefined behavior: structure's memory layout is defined and also array's layout (otherwise even memcpy wouldn't work). It's disallowed by a _compiler optimization_ (that can be disabled locally). t.p. for unions is allowed in some circumstances (see other comment). – Adriano Repetti Jul 10 '14 at 10:55
  • @AdrianoRepetti No, strict aliasing is mandated by the standard. – T.C. Jul 10 '14 at 13:41
  • read last point: "a character type". Then: char* can "access" whatever you want. short* can "access" char* (and vice-versa). Then I see no reason short* can't "access" struct*... – Adriano Repetti Jul 10 '14 at 13:53
  • @AdrianoRepetti Yes, it's not UB if you cast to a `char *` (signed/unsigned) and dereference that. I'm not quite sure where you got the idea that the standard guarantees that you can legally cast a `char *` to a `short *` and deference it - you obviously can't, because the `char *` may not even be correctly aligned. By your logic, the strict aliasing rule would lost all meaning. – T.C. Jul 10 '14 at 14:02
  • arrays are _"contiguously allocated element subobjects"_ because **§8.3.4**. Array alignment is given by struct alignment. You're right about one point: it's part of standard, not of implementations. You may disable strict aliasing or...pass through a double casting to void* (allowed, as answered by Spektre). Part about union is still half wrong: in C it's allowed (but it's not in C++, see comment on question). – Adriano Repetti Jul 10 '14 at 14:46
  • @AdrianoRepetti I didn't say type punning via unions is illegal in C. I said it's UB in C++. And double cast via `void *` is simply an obscure way of writing `reinterpret_cast`, and doesn't fix any aliasing problems. – T.C. Jul 10 '14 at 14:48
  • my bad, I didn't see you limited discussion about union to C++. Double casting (ugly, I have to admit) avoid the problem simply because void* aliasing is allowed and compiler will assume rule isn't valid. Less elegant than an explicit (but non portable) __attribute__(may_alias). – Adriano Repetti Jul 10 '14 at 15:10
  • @AdrianoRepetti `reinterpret_cast` between two pointer types is defined as `static_cast(static_cast(expression))` - i.e., via a double cast through `void`. Perhaps some implementations assume aliasing when you do an explicit double cast, but that's not guaranteed by the standard. – T.C. Jul 10 '14 at 15:19
  • They have to assume aliasing with first cast to void* (or to char*). – Adriano Repetti Jul 10 '14 at 15:31
  • @AdrianoRepetti Not really. [g++](http://coliru.stacked-crooked.com/a/8701c998b9a1b59e), for example, still issues a strict aliasing warning even with a cast to `void *`. – T.C. Jul 10 '14 at 15:55
  • So bad for g++! Let me rephrase "they SHOULD - according to specification - an aliasing". – Adriano Repetti Jul 10 '14 at 15:57
  • @AdrianoRepetti No, they don't have to. They must assume that a `void *` or `char *` may alias anything, but when you cast a `void *` back to a pointer to a (non-char) type, then the compiler may assume that that pointer doesn't alias a pointer of a different type. – T.C. Jul 10 '14 at 16:12
  • Why? First cast will make first pointer to be considered aliased. A subsequent second cast shouldn't change this. – Adriano Repetti Jul 10 '14 at 16:21
  • @AdrianoRepetti Because the standard doesn't care how you got your pointer. The strict aliasing rule is defined in terms of accessing the stored value of an object of one type via a glvalue of another type. It's based on types, not some sort of "tainting" rule. Whether you insert a cast to `void *` or not, dereferencing the final pointer still gives you a glvalue of a different type, and the standard says that with the exceptions listed it's undefined behavior to access the stored value through it. – T.C. Jul 10 '14 at 16:39
  • @AdrianoRepetti This is why both gcc and clang will compile [this code](http://goo.gl/jdKP3F) into a `memset` even though the code as written has `P` aliasing `&P`. – T.C. Jul 10 '14 at 16:41
  • You convinced me, I agree about casting! (even if actually GCC doesn't complain if I do it...) – Adriano Repetti Jul 10 '14 at 20:07
0

you can use 16 bit pointers for that

  • but yout need to add aligning to 1 Byte of the structure elements !!!
  • in C++ you can do it like this:

    #pragma pack(1)
    struct icmpPacket
        {
        u_int8_t icmp_type:8, icmp_code:8;
        u_int16_t icmp_checksum:16, icmp_id:16, icmp_seqnum:16;
        char icmp_data[6]; //cheat a little bit, set the field just large enough to store "abcde";
        } icmppckt; // icmp has an 8 byte header + 6 bytes of data
    WORD *picmppckt16=(WORD*)((void*)&icmppckt);
    #pragma pack()
    
  • change WORD to 16 bit data type your compiler knows ...

Spektre
  • 49,595
  • 11
  • 110
  • 380