Let's say I have an array of unsigned chars that represents a bunch of POD objects (e.g. read from a socket or via mmap). Which types they represent and at what position is determined at runtime, but we assume that each is already properly aligned.

What is the best way to "cast" those bytes into the respective POD type?

A solution should either be compliant with the C++ standard (let's say >= C++11) or at least be guaranteed to work with g++ >= 4.9, clang++ >= 3.5 and MSVC >= 2015U3. EDIT: On Linux and Windows, running on x86/x64 or 32/64-bit ARM.

Ideally I'd like to do something like this:

uint8_t buffer[100]; //filled e.g. from network

switch(buffer[0]) {
    case 0: process(*reinterpret_cast<Pod1*>(&buffer[4])); break;
    case 1: process(*reinterpret_cast<Pod2*>(&buffer[8+buffer[1]*4])); break;
    //...
}

or

switch(buffer[0]) {
    case 0: {
         auto* ptr = new(&buffer[4]) Pod1; 
         process(*ptr); 
    }break;
    case 1: {
         auto* ptr = new(&buffer[8+buffer[1]*4]) Pod2; 
         process(*ptr); 
    }break;
    //...
}

Both seem to work, but both are AFAIK undefined behavior in C++ (see footnote 1 below). And just for completeness: I'm aware of the "usual" solution of copying the data into an appropriate local variable:

 Pod1 tmp;
 std::copy_n(&buffer[4],sizeof(tmp), reinterpret_cast<uint8_t*>(&tmp));             
 process(tmp); 

In some situations this has no overhead, in others it does, and in some it might even be faster. Performance aside, I can no longer modify the data in place, and to be honest: it just annoys me to know that I have the right bits at an appropriate location in memory but I just can't use them.
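
For the in-place modification case, the copy approach can at least be wrapped so that the write-back is not forgotten. This is only a minimal sketch: `Pod1`'s layout and the helper name `update_pod` are assumptions made for illustration, and whether the compiler elides the two copies is up to the optimizer and not guaranteed.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical POD type, used only for illustration.
struct Pod1 {
    std::uint32_t id;
    std::uint32_t value;
};

// Copy the object out of the byte buffer, let `f` modify the copy,
// then copy the result back. Only memcpy touches the buffer, so no
// T object is ever accessed through the buffer's storage.
template <class T, class F>
void update_pod(std::uint8_t* bytes, F f) {
    static_assert(std::is_trivial<T>::value && std::is_standard_layout<T>::value,
                  "update_pod is only meant for POD types");
    assert(bytes != nullptr);
    T tmp;
    std::memcpy(&tmp, bytes, sizeof tmp);   // bytes -> object
    f(tmp);                                 // modify the local copy
    std::memcpy(bytes, &tmp, sizeof tmp);   // object -> bytes
}
```

A call such as `update_pod<Pod1>(&buffer[4], [](Pod1& p) { p.value += 1; });` then changes the bytes in the buffer without ever forming a `Pod1*` into it.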


A somewhat crazy solution I came up with is this:

template<class T>
T* inplace_cast(uint8_t* data) {
    //checks omitted for brevity
    T tmp;
    std::memmove(&tmp, data, sizeof(tmp));  // save the bytes
    auto ptr = new(data) T;                 // create a T in the buffer
    std::memmove(ptr, &tmp, sizeof(tmp));   // restore the bytes into the new object
    return ptr;
}

g++ and clang++ seem to be able to optimize away those copies, but I think this puts a lot of burden on the optimizer and might cause other optimizations to fail. It also doesn't work with const uint8_t* (although I don't actually want to modify the data) and just looks horrible (I don't think you would get that past code review).


1) The first one is UB because it breaks strict aliasing. The second one is probably UB (discussed here) because the standard just says that the resulting object is not initialized and has an indeterminate value (instead of guaranteeing that the underlying memory is untouched). I believe the equivalent C code for the first one is well defined, so compilers might allow this for compatibility with C headers, but I'm unsure of this.

Community
MikeMB
  • 1
    Didn't you answer your own question? `What is the best way to "cast" those bytes into the respective POD type?` `I'm aware of the "usual" solution to just copy the stuff into an appropriate local variable` – deviantfan Dec 07 '16 at 11:42
  • If the real question is how to solve `it just annoys me to know that I have the right bits at an appropriate location in memory but I just can't use them.` then, maybe, C++ isn't the right language for you, or at least objects are not the right thing. If you want bits, why use structs/classes for the data at all? Just take the byte array and modify it like you want. – deviantfan Dec 07 '16 at 11:42
  • 1
    @deviantfan: That solution isn't a "cast", and I also listed some objective reasons why I'm not happy with it (overhead and in-place modification). The reason I'm using C++ is that it allows me (for the most part) to use powerful abstractions on the one hand, but to go down to the metal where needed (I'm doing a lot of microcontroller programming). This particular situation is the only one I have encountered where C++ doesn't give me enough control. – MikeMB Dec 07 '16 at 11:52
  • 1
    Or to be more precise: It gives me enough control (I could go and manually modify individual bytes as you said) but it would force me to work on a much lower level of abstraction than should be necessary or pay the price for the overhead of copying the data. – MikeMB Dec 07 '16 at 11:55
  • `It gives me enough control... pay the price for the overhead of copying the data.` I understand, but ... sometimes, we just can't have everything. The restrictions in the standard are real, and other than a) getting the standard modified and/or b) ensuring that a specific platform and compiler won't ever have problems with this kind of UB, I'm pretty sure there is no magic-bullet-solution. – deviantfan Dec 07 '16 at 12:21
  • @deviantfan: `there is no magic-bullet-solution` That may be true, and I asked this question precisely to find that out. It never ceases to amaze me what you can do in C++ that probably wasn't intended by the designers of this or that feature. That being said, I showed one way to achieve pretty much what I want in (what I believe to be) standards-compliant C++ code. So it is possible - the question is now how to do it best. – MikeMB Dec 07 '16 at 12:42
  • 1
    @deviantfan: *"The restrictions in the standard are real"* Some are, some aren't (it does have bugs), some are there because someone thought it would help optimizers, some are there because the language was invented 20+ years ago, and some are there because C++ has to work on machines where a char has 11 bits, that don't use two's complement and that have non-contiguous memory. That's why I said I'm also happy with something that works with the compilers I use (type punning through unions is e.g. a related feature that is not allowed by the standard, but supported by most compilers) – MikeMB Dec 07 '16 at 12:50
  • Does it really have to be portable to crazy unknown/hypothetical compilers? Relevant compilers implement -fno-strict-aliasing or don't need it. – harold Dec 07 '16 at 14:04
  • 1
    @harold: No, I explicitly mentioned the compilers that should be supported and the platforms I'm interested in. So yes, `-fno-strict-aliasing` is a possibility, but I'd prefer something that works with the default compiler settings, because 3 years from now someone will probably copy the code to another project and forget that the specific settings are necessary (of course the same problem might apply if one relies on a compiler extension). – MikeMB Dec 07 '16 at 15:26
  • @Dan: That is a common misconception: At least in C++ (I don't know about C) You may use `char*` to refer to the memory of any object, but you can't use any pointer to refer to an array of chars. But thanks for having a look anyways. – MikeMB Dec 07 '16 at 15:44
  • @harold: Sorry, I'm just seeing, that my previous comment was pretty garbled so it's not quite clear what my point was. – MikeMB Dec 07 '16 at 15:45
  • I am not sure if I am correct. And even if I am, it might not be applicable here. If you can declare the buffer to be `void *`, then you can later `static_cast` it to the desired type. Basically you tell the compiler that the memory location is "typeless" before the cast. This should be safe. – Yan Zhou Dec 07 '16 at 18:18
  • 1
    I seem to recall that at least some versions of GCC will consider your second approach to clobber the memory and optimize accordingly. – T.C. Dec 08 '16 at 04:02
  • @T.C.: What do you mean by optimize accordingly? – MikeMB Dec 08 '16 at 22:09
  • Previous writes not otherwise read from are considered dead stores and optimized away, for instance. (Check `-flifetime-dse`) – T.C. Dec 09 '16 at 00:41
  • @T.C.: I see. IIRC it wasn't a problem when I tried it, but I don't remember which compilers I tested it with and what the program looked like exactly. – MikeMB Dec 09 '16 at 07:04

2 Answers

1

The most correct way is to create a (temporary) variable of the desired POD class, and to use memcpy() to copy data from the buffer into that variable:

switch(buffer[0]) {
    case 0: {
        Pod1 var;
        std::memcpy(&var, &buffer[4], sizeof var);
        process(var);
        break;
    }
    case 1: {
        Pod2 var;
        std::memcpy(&var, &buffer[8 + buffer[1] * 4], sizeof var);
        process(var);
        break;
    }
    //...
}

The main reason for doing this is alignment: the data in the buffer may not be aligned correctly for the POD type you are using. Making a copy eliminates this problem. It also allows you to keep using the variable even if the network buffer is no longer available.
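
The copy can be factored into a small helper so that each `case` stays a one-liner. The following is a sketch rather than a definitive implementation: the name `load_pod` and the layout of `Pod2` are assumptions made for illustration only.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical POD type, used only for illustration.
struct Pod2 {
    std::uint16_t kind;
    std::uint16_t flags;
    std::uint32_t payload;
};

// Returns a copy of the T whose bytes start at `bytes`.
// memcpy places no alignment requirement on `bytes`, so the source
// may sit at any offset inside the buffer.
template <class T>
T load_pod(const std::uint8_t* bytes) {
    static_assert(std::is_trivial<T>::value && std::is_standard_layout<T>::value,
                  "load_pod is only meant for POD types");
    assert(bytes != nullptr);
    T value;
    std::memcpy(&value, bytes, sizeof value);
    return value;
}
```

With this, the second case above could be written as `process(load_pod<Pod2>(&buffer[8 + buffer[1] * 4]));`. It also accepts a `const` buffer, since the source bytes are only read.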

Only if you are absolutely sure that the data is properly aligned can you use the first solution you gave.

(If you are reading in data from the network, you should always check that the data is valid first, and that you won't read outside of your buffer. For example with &buffer[8 + buffer[1] * 4], you should check that the start of that address plus the size of Pod2 does not exceed the buffer length. Luckily you are using uint8_t, otherwise you'd also have to check that buffer[1] is not negative.)
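
One way to write that check is sketched below. The function names `fits` and `pod2_offset` are hypothetical, and the offset rule is simply the `8 + buffer[1] * 4` expression from the example; a real protocol would need its own rules.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Returns true if an object of `size` bytes starting at `offset`
// lies entirely inside a buffer of `length` bytes. The comparison is
// arranged so that the arithmetic cannot wrap around.
inline bool fits(std::size_t offset, std::size_t size, std::size_t length) {
    return size <= length && offset <= length - size;
}

// Computes the offset 8 + buffer[1] * 4 and reports whether an object
// of `pod_size` bytes at that offset stays inside the buffer.
// `offset` is only meaningful when the function returns true.
inline bool pod2_offset(const std::uint8_t* buffer, std::size_t length,
                        std::size_t pod_size, std::size_t& offset) {
    assert(buffer != nullptr);
    if (length < 2) {
        return false;  // buffer[1] itself would be out of range
    }
    offset = 8 + static_cast<std::size_t>(buffer[1]) * 4;
    return fits(offset, pod_size, length);
}
```

The copy would then only be made when `pod2_offset(buffer, sizeof buffer, sizeof(Pod2), offset)` returns true.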

G. Sliepen
  • I already said that I'm assuming aligned data, and according to the standard you can't use the first solution even if the data is aligned. If you are saying that a particular compiler officially supports this, I'd ask you to name it. And yes, I omitted any checks for the sake of brevity. Sorry for that. – MikeMB Dec 08 '16 at 00:18
  • 1
    memcpy is the correct solution here. See: https://gist.github.com/socantre/3472964 – user673679 Dec 26 '17 at 16:53
-1

Using a union lets you escape the strict-aliasing rule. In fact, that is what unions are for. So casting pointers to a union type from a type that is part of the union is explicitly allowed in the C++ standard (Clause 3.10.10.6). The same thing is allowed in the C standard (6.5.7).

Therefore, depending on the other properties, a conforming equivalent of your sample can be as follows.

union to_pod {
    uint8_t buffer[100];
    Pod1 pod1;
    Pod2 pod2;
    //...
};

uint8_t buffer[100]; //filled e.g. from network

switch(buffer[0]) {
    case 0: process(reinterpret_cast<to_pod*>(buffer + 4)->pod1); break;
    case 1: process(reinterpret_cast<to_pod*>(buffer + 8 + buffer[1]*4)->pod2); break;
    //...
}
Alan Milton
  • 2
    Casting around pointer types might be allowed, but reading from a union member that was not previously written to is UB in c++. – MikeMB Sep 04 '17 at 02:11
  • Where is it written? – Alan Milton Sep 08 '17 at 21:24
  • In the c++ standard ;). Sorry, I'm currently on my mobile, so looking up the precise location would be somewhat painful. That is - by the way - a difference between c and c++. You will also find plenty of answers here on SO saying the same. – MikeMB Sep 09 '17 at 06:28
  • I would prefer the specific location, because the references I mentioned contradict what you just said. – Alan Milton Sep 13 '17 at 16:22
  • This seems more controversial than I remembered. Have a look here, where this is discussed in depth (albeit for C++11): https://stackoverflow.com/questions/11373203/accessing-inactive-union-member-and-undefined-behavior – MikeMB Sep 13 '17 at 22:52
  • So from what I can see in the discussion and from my own reading and understanding of the standard, the behavior is perfectly defined for objects with trivial constructors and destructors. POD objects conform to the limitation. – Alan Milton Sep 14 '17 at 20:22
  • Have you read the last sentence of the accepted answer? *"That is, although we can legitimately form an lvalue to a non-active union member (which is why assigning to a non-active member without construction is ok) **it is considered to be uninitialized**."*. Also, I wonder why the only thing the standard explicitly permits in 9.5.1 is to inspect the common initial sequence when - according to your interpretation - you'd be allowed to reinterpret any bits as any POD member type. – MikeMB Sep 14 '17 at 21:47
  • Have you read the previous sentence? Also you claimed the undefined behavior is in the standard. Could you please give me a link to the standard where it is written? – Alan Milton Sep 14 '17 at 23:08
  • Which one? The one ending with *"[...] access without one of the above interpositions is undefined behaviour."*? None of those "interpositions" is happening here. And no, other than the quotes provided in the linked Q/A and the conclusion drawn from them I can't give you a reference. As I said, this is less clear-cut than I remembered it. Just one additional point regarding clause 3.10.10: Being on this list is a necessary, but NOT SUFFICIENT condition for an access not being UB. E.g. reading from an uninitialized int variable is generally UB (8.5.12) even though it satisfies 3.10.10.1. – MikeMB Sep 15 '17 at 11:40
  • 3.10.10.1 is not applied here because of 3.10.10.6 See this answer https://stackoverflow.com/a/45449172/2102000 – Alan Milton Sep 26 '17 at 19:35
  • Your example is wrong for multiple reasons. Type-punning via unions exists only in C (not C++), but not the way you suggest. Please see this document: https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html#Type-punning *"Similarly, access by taking the address, casting the resulting pointer and dereferencing the result has undefined behavior, even if the cast uses a union type, e.g.: `int f() { double d = 3.0; return ((union a_union *) &d)->i; }`"* So it explicitly states that your example is invalid even in C. – stsp Feb 07 '19 at 22:53
  • If so, could you please point to the relevant piece of the standards? So far the author of the question could not provide any. – Alan Milton Mar 04 '19 at 17:14