
I have a block of memory allocated as a char buffer. Is it legal to view it through a pointer to another type?

char* buffer = new char[1000];
int64_t* int64_view = static_cast<int64_t*>(static_cast<void*>(buffer));

Is int64_view[0] guaranteed to correspond to the first 8 bytes of buffer? I am a bit concerned about aliasing: if the char buffer is only 1-byte aligned and int64_t must be 8-byte aligned, how does the compiler handle it?

Piotr Dabkowski

2 Answers


Your example violates the strict aliasing rule. int64_view will point to the first byte of the buffer, but accessing memory through it can be an unaligned access. Some platforms allow that, some do not. Either way, in C++ it is UB.

For example:

#include <cstdint>
#include <cstddef>
#include <iostream>
#include <iomanip>

#define COUNT 8

struct alignas(1) S
{
    char _pad;
    char buf[COUNT * sizeof(int64_t)];
};

int main()
{
    S s;
    int64_t* int64_view alignas(8) = static_cast<int64_t*>(static_cast<void*>(&s.buf));

    std::cout << std::hex << "s._pad     at " << (void*)(&s._pad) << " aligned as " << alignof(s._pad)     << std::endl;
    std::cout << std::hex << "s.buf      at " << (void*)(s.buf)   << " aligned as " << alignof(s.buf)      << std::endl;
    std::cout << std::hex << "int64_view at " << int64_view       << " aligned as " << alignof(int64_view) << std::endl;

    for(std::size_t i = 0; i < COUNT; ++i)
    {
        int64_view[i] = i;
    }

    for(std::size_t i = 0; i < COUNT; ++i)
    {
        std::cout << std::dec << std::setw(2) << i << std::hex << " " << int64_view + i << " : " << int64_view[i] << std::endl;
    }
}

Now compile and run it with -fsanitize=undefined:

$ g++ -fsanitize=undefined -Wall -Wextra -std=c++20 test.cpp -o test

$ ./test
s._pad     at 0x7ffffeb42300 aligned as 1
s.buf      at 0x7ffffeb42301 aligned as 1
int64_view at 0x7ffffeb42301 aligned as 8
test.cpp:26:23: runtime error: store to misaligned address 0x7ffffeb42301 for type 'int64_t', which requires 8 byte alignment
0x7ffffeb42301: note: pointer points here
 7f 00 00  bf 11 00 00 00 00 00 00  ff ff 00 00 01 00 00 00  20 23 b4 fe ff 7f 00 00  7c a4 9d 2b 98
              ^ 
test.cpp:31:113: runtime error: load of misaligned address 0x7ffffeb42301 for type 'int64_t', which requires 8 byte alignment
0x7ffffeb42301: note: pointer points here
 7f 00 00  bf 00 00 00 00 00 00 00  00 01 00 00 00 00 00 00  00 02 00 00 00 00 00 00  00 03 00 00 00
              ^ 
 0 0x7ffffeb42301 : 0
 1 0x7ffffeb42309 : 1
 2 0x7ffffeb42311 : 2
 3 0x7ffffeb42319 : 3
 4 0x7ffffeb42321 : 4
 5 0x7ffffeb42329 : 5
 6 0x7ffffeb42331 : 6
 7 0x7ffffeb42339 : 7

It works on x86_64, but the behavior is undefined, and the unaligned accesses can cost execution speed.

This example on godbolt

In C++20 there is std::bit_cast. It will not help with the unaligned access in this example, but it can resolve some aliasing issues.
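A common way to read and write integers in a raw char buffer without creating an int64_t* into it is to copy the bytes with std::memcpy. This is a minimal sketch, not something the answer above proposes: std::memcpy is a swapped-in technique, and the helper names load_i64 and store_i64 are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: reads the index-th int64_t stored in a raw char buffer.
// std::memcpy copies bytes, so no int64_t* ever points into the buffer and
// the buffer may have any alignment.
inline std::int64_t load_i64(const char* buffer, std::size_t index)
{
    assert(buffer != nullptr);
    std::int64_t value;
    std::memcpy(&value, buffer + index * sizeof(value), sizeof(value));
    return value;
}

// Hypothetical helper: writes value as the index-th int64_t in the buffer.
inline void store_i64(char* buffer, std::size_t index, std::int64_t value)
{
    assert(buffer != nullptr);
    std::memcpy(buffer + index * sizeof(value), &value, sizeof(value));
}
```

The caller is still responsible for staying inside the buffer. Compilers commonly turn such fixed-size copies into plain loads and stores, but that is an optimization, not a guarantee.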

UPDATE: Some x86_64 instructions require aligned access. For example, aligned SSE loads and stores (such as movaps) require 16-byte alignment. If you use these instructions on an unaligned address, the application will crash with a "general protection fault".
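Whether a given address is suitably aligned can be checked at run time. The sketch below is illustrative only: the helper names is_aligned_for_i64 and first_aligned_for_i64 are hypothetical, and it assumes the platform provides std::uintptr_t. It uses std::align from <memory> to find the first suitably aligned position inside a buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>

// Hypothetical helper: true if p is suitably aligned for int64_t.
inline bool is_aligned_for_i64(const void* p)
{
    return reinterpret_cast<std::uintptr_t>(p) % alignof(std::int64_t) == 0;
}

// Hypothetical helper: returns the first address in [buffer, buffer + size)
// that is aligned for int64_t and leaves room for one int64_t, or nullptr
// if no such address exists.
inline char* first_aligned_for_i64(char* buffer, std::size_t size)
{
    void* p = buffer;
    std::size_t space = size;
    void* aligned = std::align(alignof(std::int64_t), sizeof(std::int64_t), p, space);
    assert(aligned == nullptr || is_aligned_for_i64(aligned));
    return static_cast<char*>(aligned);
}
```

Passing this check only rules out the misaligned-access problem. It does not by itself make access through an int64_t* valid with respect to the aliasing rule.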

  • Awesome, thanks a lot! Just one follow-up: if I happen to ensure that char buffer is aligned with a multiple of 8 then there should be no issues, right (for int64_t)? – Piotr Dabkowski May 23 '20 at 13:08
  • You are welcome.) Yes, in this example. You can test it by removing _pad from the S struct and setting the alignment to 8; the sanitizer will not complain. In more complex code, the compiler can fail to determine that two variables point to the same memory, and it can reorder load/store instructions if you access the memory through two pointers at the same time. I will try to make an example. I suggest using -Wstrict-aliasing (for gcc and clang), which can (there is no guarantee!) warn you about aliasing, and not mixing reads and writes to the same memory through pointers of different types in adjacent lines. – Dmitrii Zabotlin May 23 '20 at 13:24

Going through void* will definitely lead to UB. static_cast loses its value when you first cast your type to the most generic pointer type, void*, because you can cast anything to and from void*. It is no different from using reinterpret_cast to cast straight from your type to any other pointer type.

Consider the following example:

int64_t* int64_view = reinterpret_cast<int64_t*>(buffer);

It might work, and it might not - UB.
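One way to avoid relying on the cast alone is to create real int64_t objects inside the char storage with placement new and use the pointer it returns. This is a hedged sketch rather than part of the answer above: the helper name make_i64_array is hypothetical, it assumes the storage is suitably aligned and large enough (both are asserted), and it assumes array placement new adds no bookkeeping overhead for int64_t, which holds in C++20 and on common implementations.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>

// Hypothetical helper: starts the lifetime of count zero-initialized int64_t
// objects inside storage and returns a pointer to the first one.
inline std::int64_t* make_i64_array(char* storage, std::size_t bytes, std::size_t count)
{
    // The storage must be large enough and aligned for int64_t.
    assert(count * sizeof(std::int64_t) <= bytes);
    assert(reinterpret_cast<std::uintptr_t>(storage) % alignof(std::int64_t) == 0);
    return ::new (static_cast<void*>(storage)) std::int64_t[count]();
}
```

With a buffer obtained from new char[1000], a call such as make_i64_array(buffer, 1000, 125) would yield a view whose first element occupies the first 8 bytes of the buffer. The memory is still released through the original pointer with delete[] buffer.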

Coral Kashri
  • Thanks, but why exactly doesn't it work? Is it due to lack of alignment guarantees? If int64_view points to the start of char buffer then I do not see a problem. If it is being offset by some value to guarantee 8-alignment then it will not work. – Piotr Dabkowski May 23 '20 at 10:48
  • Specifically, alignof(buffer) is 8, so this should always work as long as alignment is 8. Is that true? – Piotr Dabkowski May 23 '20 at 10:58
  • @PiotrDabkowski UB means that the behavior for this kind of action is not specified, though it could be defined one day in the future. So it might be defined the way you think it works now, or it might do anything else. This kind of cast is strongly discouraged. As long as it is UB, there are no guarantees at all. – Coral Kashri May 23 '20 at 12:21
  • Well, if the buffer pointer is aligned for int64_t (alignof(buffer) is a multiple of 8) then the behaviour is defined. It turns out it is guaranteed to be aligned when new is used: https://stackoverflow.com/questions/506518/is-there-any-guarantee-of-alignment-of-address-return-by-cs-new-operation. I have run tests and the view always works as expected (the first element of the view corresponds to the first 8 elements of the buffer). I'm still not sure in what ways the UB would manifest when the buffer was unaligned? – Piotr Dabkowski May 23 '20 at 12:48
  • "Im still not sure in what ways the UB would manifest when the buffer was unaligned?" Technically, nobody is. There might not even be any manifestations, in practical terms. The behavior is _undefined_, which has to be taken literally. Theoretically any sort of problem **could** occur, depending on the whims of the compiler, architecture, etc. The standard doesn't concern itself at all with what _might_ happen, only in declaring that you've entered a world where they no longer make any guarantees about what **will** happen. – FeRD May 10 '21 at 02:35