11

According to the standard, it is always undefined behavior in C++ to make, for example, a float* point to the same memory location as an int*, and then read from or write to them.

In my application, a buffer filled with 32-bit integer elements can be overwritten with 32-bit floating-point elements. (It actually contains a representation of an image that gets transformed in multiple stages by GPU kernels, but there should also be a host implementation that does the same processing, for verification.)

The program basically does this (not actual source code):

void* buffer = allocate_buffer(); // properly aligned buffer

static_assert(sizeof(std::int32_t) == sizeof(float), "must have same size");
const std::int32_t* in = reinterpret_cast<const std::int32_t*>(buffer); 
float* out = reinterpret_cast<float*>(buffer); 
for(int i = 0; i < num_items; ++i)
   out[i] = transform(in[i]);

Is there a way to make the reinterpret_cast pointer casts well-defined, within the C++ standard, without doing additional memory copies of the whole buffer, or additional per-element copies (for example with std::bit_cast)?

tmlen
  • 1
    The standard knows nothing about GPUs. So you are already in implementation-defined territory. Why not just rely on your implementation (possibly with compiler switches) *making it* well defined? – StoryTeller - Unslander Monica Aug 20 '18 at 12:12
  • 2
    Use the `-fno-strict-aliasing` flag. For std::bit_cast you will have to wait until at least C++20. There is no standard-conforming way without using memcpy. –  Aug 20 '18 at 12:15
  • Why not work out what type you actually want to be working with, i.e. ints or floats, and then have in and out be the same type? Your transform then deals with the conversion of float to int or vice versa. – UKMonkey Aug 20 '18 at 12:17
  • 1
    The first paragraph is wrong. It is OK to have pointers of different types pointing to the same location. What you aren't allowed to do is to read or write the memory as the 'wrong' type. – M.M Aug 20 '18 at 12:40
  • You may find it useful to read [What is strict aliasing?](https://stackoverflow.com/a/51228315/1708801) Compilers will treat memcpy used for type punning as a no-op, or at least quality implementations will. As I note in the answer I link to, there is an implementation of bit_cast you can use, although you obviously can't get constexpr without implementation magic. – Shafik Yaghmour Aug 20 '18 at 13:07
  • Oops, just realized I forgot to include the link to the bit_cast proposal, fixed! – Shafik Yaghmour Aug 20 '18 at 13:15
  • @Pi as I pointed out [in my comment](https://stackoverflow.com/questions/51930334/buffer-filled-with-different-types-of-data-and-strict-aliasing#comment90812731_51930334), the implementation of bit_cast is available but basically just wraps `memcpy`; the constexpr magic requires compiler support. – Shafik Yaghmour Aug 20 '18 at 16:33
  • @Shafik Yaghmour Possible implementations using memcpy are also stated in the links. Thanks for the heads up. –  Aug 21 '18 at 08:53
  • @Shafik Yaghmour Your implementation of bit_cast is very nice! Thanks for mentioning. –  Aug 21 '18 at 09:22
  • @tmlen See the answers to [this follow-up question](https://stackoverflow.com/questions/51931979/is-it-legal-to-reuse-memory-from-a-fundamental-type-array-for-a-different-yet-s/52006488#52006488). – Yuki Aug 24 '18 at 17:15

2 Answers

7

Even though I have always wished there were a nice way, currently there is none. You will have to use the no-strict-aliasing option of the compiler of your choice (for example, -fno-strict-aliasing with GCC or Clang).

For std::bit_cast you will have to wait until C++20. There is no standard-conforming way without using memcpy, as far as I know.
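
As a rough illustration of the memcpy route mentioned above, the loop from the question could be written so that every load and store goes through std::memcpy on the raw bytes. This is only a sketch: transform() here is a made-up stand-in for the question's function, and transform_in_place is a hypothetical name. Optimizing compilers typically turn these fixed-size memcpy calls into plain register moves, but that is a quality-of-implementation matter, not a guarantee.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for the question's transform(); the real one is not shown.
float transform(std::int32_t v) { return static_cast<float>(v) * 0.5f; }

static_assert(sizeof(std::int32_t) == sizeof(float), "must have same size");

// Overwrites num_items 32-bit integers in buffer with floats, element by element.
// The buffer is only ever touched as raw bytes, so no int32_t or float lvalue
// aliases the storage.
void transform_in_place(void* buffer, std::size_t num_items) {
    auto* bytes = static_cast<unsigned char*>(buffer);
    for (std::size_t i = 0; i < num_items; ++i) {
        std::int32_t in;
        std::memcpy(&in, bytes + i * sizeof in, sizeof in);    // load as int32_t
        const float out = transform(in);
        std::memcpy(bytes + i * sizeof out, &out, sizeof out); // store as float
    }
}
```

Whether this counts as an "additional per-element copy" in the sense of the question depends on what the optimizer does with it; inspecting the generated assembly is the only way to be sure for a given compiler.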

Also have a look at this bit_cast proposal and this website.
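
For compilers without C++20, a bit_cast-like helper can be approximated with memcpy, as the comments under the question point out. The following is a minimal sketch, not the implementation from the proposal: the name bit_cast_sketch is made up here, it is not constexpr, and it additionally requires the destination type to be default-constructible.

```cpp
#include <cstdint>
#include <cstring>
#include <type_traits>

// Pre-C++20 approximation of std::bit_cast (hypothetical helper name).
template <class To, class From>
To bit_cast_sketch(const From& src) noexcept {
    static_assert(sizeof(To) == sizeof(From), "sizes must match");
    static_assert(std::is_trivially_copyable<From>::value, "From must be trivially copyable");
    static_assert(std::is_trivially_copyable<To>::value, "To must be trivially copyable");
    static_assert(std::is_default_constructible<To>::value, "To must be default-constructible");
    To dst;
    std::memcpy(&dst, &src, sizeof(To)); // copy the object representation
    return dst;
}
```

For example, bit_cast_sketch<std::uint32_t>(1.0f) yields the bit pattern of the float, and casting that value back with bit_cast_sketch<float> recovers the original number.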

  • 1
    There is also a proposal [p0593r2](http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p0593r2.html) which introduces `std::bless` that could be used in conjunction with `std::launder` to achieve similar functionality. – eerorika Aug 20 '18 at 13:59
  • I wonder what practical (as opposed to political) difficulty there would be with saying that the "aliasing rules" only apply in cases which *actually involve aliasing*, recognizing that aliasing requires that a region of storage which is modified during some particular execution of a function or loop is--within said execution--accessed via two pointers or references, *neither of which is visibly freshly derived [within that context] from the other*. – supercat Aug 20 '18 at 19:21
  • I also wonder if there would be any difficulty with saying that the result of applying `reinterpret_cast` to a reference will yield a reference that may, throughout its lifetime, be used to access objects of either the old or new type, provided that within that lifetime either (1) the object isn't modified by any means (but may be read via any means), or (2) the object is accessed exclusively via that reference and references/pointers that are derived from it. That should be easy to implement, and shouldn't interfere with any *otherwise-useful* optimizations. – supercat Aug 20 '18 at 21:26
  • @supercat Why not ask a new question? –  Aug 21 '18 at 08:53
  • @user2079303 Very interesting, but the link is dead. Can you provide an alternative? –  Aug 21 '18 at 09:58
  • @Pi: Effective discussion on the issue has been crushed by politics. For years, discussion has been dominated by arguments over what the Standard requires. Such arguments fail to recognize that the Standard allows implementations that are of such low quality as to be useless, and makes no effort to fully describe what a *high-quality* implementation must do to be suitable for any particular purpose. I doubt the authors of gcc would be willing to admit that they've been fighting for the right to label a deliberately-inferior compiler as "conforming", but they've become... – supercat Aug 21 '18 at 14:31
  • ...heavily invested in an optimizer design that really isn't suitable for low-level programming, and insist that any code which isn't suitable for use with it is "broken". – supercat Aug 21 '18 at 14:32
0

How about using a union? For example:

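union T {
    std::int32_t i;
    float f;
};

// allocate_buffer() returns void* in the question, so C++ needs an explicit cast.
// Caveat: ISO C++ leaves reading a union member other than the one last written
// undefined; GCC documents this form of type punning as supported.
T* buffer = static_cast<T*>(allocate_buffer());
for(int i = 0; i < num_items; ++i)
    buffer[i].f = transform(buffer[i].i);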
Andrea Corbellini