-1

I am building a buffer that will be used in a class and wanted to know if the following is valid according to the C++ standard:

#include <iostream>
#include <cstdint>
#include <new>  // required for placement new

int main() {
  alignas(std::int32_t) char A[sizeof(std::int32_t)] = { 1, 0, 0, 0 };
  std::int32_t* pA = new (&A) std::int32_t;
  std::cout << *pA << std::endl;
  return 0;
}

The buffer has been initialized as a char array of 4 bytes. Would doing a placement new on top of that array allow me to access the bits underneath as an int32_t? Can I now access the memory there either as 4 chars (through the buffer object) or as 1 int32_t (through pA) without violating the standard? If not, is there some other way to do it?

NOTE: Yes, I am aware of endianness, but in this context, endianness doesn't matter. That's a different discussion.

Adrian
  • No, now you can't access `A` directly (its lifetime is over) and reading `*pA` reads an uninitialized `int` which is also Undefined Behavior. – François Andrieux Nov 23 '21 at 03:35
  • @FrançoisAndrieux, Is there a way to get the compiler to see it as initialized? There must be as there are buffers that read from files and streams all the time. – Adrian Nov 23 '21 at 03:37
  • No, you can't. There is no way for two built-in types like this to share the same memory representation and also be simultaneously accessible. – François Andrieux Nov 23 '21 at 03:58
  • `memcpy` is the trick. – Eljay Nov 23 '21 at 04:04
  • Use std::bit_cast for type punning. – doug Nov 23 '21 at 04:59
  • This is *not* how placement `new` is meant to be used. As for accessing multiple bytes as an integer, check [my “answer” here](https://stackoverflow.com/a/69808921/8584929), it contains an example of that. Also, you need to carefully consider how things will differ on big-endian and little-endian architectures — is your integer 1 or 2²⁴? – Andrej Podzimek Nov 23 '21 at 06:26
  • Do you really want to depend on endianness? `int n = A[0] | A[1] << 8 | A[2] << 16 | A[3] << 24;` (using `std::byte` or `unsigned char`) would be independent of endianness. – Jarod42 Nov 23 '21 at 09:52
  • Yes, I am aware of endianness, but in this context, endianness doesn't matter. That's a different discussion. – Adrian Nov 23 '21 at 19:28
  • Thx @JaMiT. Typo fixed. – Adrian Nov 23 '21 at 19:45

1 Answer

5

(Assuming, for brevity, that int and std::int32_t are the same type.)

In C++20, since A is an array of characters, an object of type int can be implicitly created in A, so only the following is needed:

alignas(int) char A[sizeof(int)] = { 1, 0, 0, 0 };
int * pA = reinterpret_cast<int*>(&A[0]);
std::cout << *pA << std::endl;

The problem with your original new (&A) int is that it ends the lifetime of the original char[sizeof(int)] object that was initialised, so its value can no longer be read. What remains is a default-initialized int with an indeterminate value, and reading it is undefined behavior. Your original code is thus equivalent to:

int A;
std::cout << A << std::endl;

If you can't rely on C++20 implicit object creation, you can use type-punning methods, which create an object of a different type with the same "bit pattern" (value representation). std::bit_cast can be used; note that it is declared in <bit> and is itself a C++20 addition, so on older standards you need a version implemented with std::memcpy:

char A[sizeof(int)] = { 1, 0, 0, 0 };
int B = std::bit_cast<int>(A);
std::cout << B << std::endl;

Or use std::memcpy (declared in <cstring>) to copy the bytes directly:

char A[sizeof(int)] = { 1, 0, 0, 0 };
int B;
std::memcpy(&B, A, sizeof(int));
std::cout << B << std::endl;
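
For compilers without std::bit_cast, the memcpy-based version mentioned earlier could look roughly like the sketch below. The names bit_cast_compat and round_trips are made up for illustration and are not standard functions. Unlike the real std::bit_cast, this sketch is not constexpr and requires the destination type to be default-constructible.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

// Hypothetical stand-in for std::bit_cast on pre-C++20 compilers.
// The name `bit_cast_compat` is an assumption, not a standard function.
template <class To, class From>
To bit_cast_compat(const From& src) noexcept {
    static_assert(sizeof(To) == sizeof(From), "sizes must match");
    static_assert(std::is_trivially_copyable<To>::value,
                  "To must be trivially copyable");
    static_assert(std::is_trivially_copyable<From>::value,
                  "From must be trivially copyable");
    static_assert(std::is_default_constructible<To>::value,
                  "this sketch needs a default-constructible To");
    To dst;
    std::memcpy(&dst, &src, sizeof(To));  // copy the object representation
    return dst;
}

// Illustrative check: serialise an int into a char array, then read it back.
inline bool round_trips(int value) {
    char bytes[sizeof(int)];
    std::memcpy(bytes, &value, sizeof(int));
    return bit_cast_compat<int>(bytes) == value;
}
```

Because the source is taken by const reference, a whole char array can be passed directly, just as with std::bit_cast<int>(A) in the snippet above.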

You can use placement new to change the effective type of A from char[sizeof(int)] to int, like so:

alignas(int) char A[sizeof(int)] = { 1, 0, 0, 0 };
// `A` has effective type `char[sizeof(int)]`; cannot be accessed through `int*`

int * pA = new (&A) int(std::bit_cast<int>(A));
// `A` now has effective type `int`. This line should be optimised to do nothing to `A` at runtime.

std::cout << *pA << std::endl;

// Or using `A` directly
std::cout << *std::launder(reinterpret_cast<int*>(A)) << std::endl;
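
For the buffer class the question has in mind, one way to sidestep the lifetime questions entirely is to keep the storage as plain bytes and only ever copy values in and out with std::memcpy. The following is a minimal sketch under that assumption; ByteBuffer, read, and write are hypothetical names chosen for illustration, not anything from the question or from a library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical fixed-size buffer; the name and interface are assumptions.
template <std::size_t N>
class ByteBuffer {
public:
    // Copy sizeof(T) bytes starting at `offset` into a fresh T.
    template <class T>
    T read(std::size_t offset = 0) const {
        static_assert(std::is_trivially_copyable<T>::value,
                      "T must be trivially copyable");
        static_assert(sizeof(T) <= N, "T does not fit in the buffer");
        assert(offset <= N - sizeof(T));
        T value;
        std::memcpy(&value, bytes_ + offset, sizeof(T));
        return value;
    }

    // Copy the bytes of `value` into the buffer starting at `offset`.
    template <class T>
    void write(const T& value, std::size_t offset = 0) {
        static_assert(std::is_trivially_copyable<T>::value,
                      "T must be trivially copyable");
        static_assert(sizeof(T) <= N, "T does not fit in the buffer");
        assert(offset <= N - sizeof(T));
        std::memcpy(bytes_ + offset, &value, sizeof(T));
    }

    // Byte-wise access stays valid at all times, since the array is never replaced.
    unsigned char& operator[](std::size_t i) { return bytes_[i]; }
    const unsigned char& operator[](std::size_t i) const { return bytes_[i]; }

private:
    unsigned char bytes_[N] = {};  // zero-initialised storage
};
```

With this layout, buf.read<std::int32_t>() and buf[0] can be used interchangeably, because no int object ever lives in the array. Compilers typically reduce a small fixed-size memcpy to a plain load or store, though that is an optimisation rather than a guarantee.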
Artyer