When an overloaded operator new offsets a pointer by an additional prefix, what does the class of problematic cases look like?

Question

In a legacy project I am maintaining in my free time, operators delete/new and delete[]/new[] were overloaded to spot mismatches (an object allocated with new[] being released with delete, and vice versa).

The original prefix had a length of 9 bytes. It has not led to issues since at least VS2010, and possibly even since VS6.

I have recently tackled rewriting this piece of code and to that end asked a question at codereview.stackexchange. The old prefixes had ended with one identical character which I removed, so my prefix was only 8 bytes long. Two people noted that this might break alignment and one referred to the C++ standard paragraph 6.11 Alignment... Unfortunately I fail to grasp the issue from reading it.

The second sentence there reads as follows:

An alignment is an implementation-defined integer value representing the number of bytes between successive addresses at which a given object can be allocated.

... And as far as I understand this definition means that all is well:

1. If I allocate a single object, the OS has to handle distance to the previous and next objects in dynamic memory. Distance to the previous object will only increase by length of the prefix. Such distances are presumably not part of alignment. So, okay.
2. If I allocate an array, alignment between its elements has been handled before operator new[] gets its size parameter. I do not change this, so okay.
   - For begin and end of the array, the considerations at 1) apply.

All seems to be perfectly fine. Yet, questions such as this one clearly signal that special alignment handling must be necessary in some cases.

What characterises these cases?
How can I generate such cases?
What invalidates my assumption that all is fine?

Or am I right and this is a perfectly harmless state of affairs?

Here is a code snippet illustrating the principle behind the overloads in question. Refer to my original question for a complete (and safer) example.

#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

constexpr char PREFIX[] = "str1med";
constexpr std::size_t OFFSET = sizeof(PREFIX);

void * operator new(std::size_t size)
{
  void * pointer = std::malloc(size + OFFSET);
  std::memcpy(pointer, PREFIX, OFFSET);
  return reinterpret_cast<std::byte*>(pointer) + OFFSET;
}

void operator delete(void * untypedPointer)
{
  std::byte * pointer = reinterpret_cast<std::byte*>(untypedPointer);
  pointer -= OFFSET;
  assert(std::memcmp(pointer, PREFIX, OFFSET) == 0);
  std::free(pointer);
}

"Such distances are presumably not part of alignment. So, okay." Not so. The alignment requirement is a property of the type (known to the caller), just like its size (which is passed to `operator new()`). Applying an arbitrary offset (a fixed value you have chosen) will mean the address returned from your `operator new()` probably does not meet the actual type's alignment requirement. — Peter, Feb 11 '18 at 00:39
@Peter : This is an important point. I can see that one way of reading $6.11 would lead to this, whereas the way I have read it before would not. I was under the impression that alignment was a _minimal_ requirement (therefore, prefixing an allocation must be valid by default - minimal alignment has been handled outside of the operator, the prefix only adds to that). Using this interpretation, it is now easy to find relevant points of the standard, e.g. $6.8.1.1 or $6.11.9, that depend on alignment being correct in _both ways_. — Zsar, Feb 11 '18 at 11:53
A prefix of 8 bytes was fine, alignment was never better than 8. But you need to get ready for C++17, operator new acquired overloads that take an `std::align_val_t` argument. — Hans Passant, Feb 17 '18 at 13:04
@HansPassant : But if I replace one of those, will it be automatically called from `Object* x = new Object;` instead of the "simpler" overload? — Zsar, Feb 17 '18 at 13:28
No, only if the compiler deems it necessary due to an `alignas` specifier. Typical for variables that were optimized to work well with SSE or AVX code generation. That is a C++11 keyword, but without a decent way to implement it universally for the past 6 years :) — Hans Passant, Feb 17 '18 at 14:05
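A minimal sketch of the case described in the last comment, assuming a C++17 compiler. The type `Wide` and the helpers `is_aligned_to` and `wide_new_is_aligned` are made up for illustration. A type whose `alignas` value exceeds `__STDCPP_DEFAULT_NEW_ALIGNMENT__` (commonly 16) is routed by the compiler to the `std::align_val_t` overloads of `operator new`; ordinary types keep using the plain overload.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>

// Hypothetical over-aligned type, e.g. a block meant for AVX loads.
struct alignas(32) Wide {
    float lanes[8];
};

// True if `p` is a multiple of `alignment` (which must be a power of two).
inline bool is_aligned_to(const void* p, std::size_t alignment) {
    return (reinterpret_cast<std::uintptr_t>(p) & (alignment - 1)) == 0;
}

// Allocates one Wide with a plain new-expression and reports whether the
// result honours alignof(Wide). In C++17 the compiler itself picks
// operator new(std::size_t, std::align_val_t) for over-aligned types.
inline bool wide_new_is_aligned() {
    Wide* w = new Wide;
    const bool ok = is_aligned_to(w, alignof(Wide));
    delete w;
    return ok;
}
```

A replacement of only the plain `operator new(std::size_t)` would therefore not see allocations of `Wide` at all; the aligned overloads would need their own replacement.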

Ben Voigt · Answer 1 · 2018-02-11T01:56:12.983

You can generally infer the alignment requirement as being the largest power of two which is a factor of the requested size (this may be overly pessimistic at times). The occasions I'm aware of that require alignment better than 8 bytes on Windows are SIMD types and pointers used with FILE_FLAG_NO_BUFFERING.

Example:

auto required_alignment = size & (size ^ (size-1));
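As a quick sanity check of that expression, here is a small sketch. The function name `inferred_alignment` is made up for illustration; the body is the expression above, which isolates the lowest set bit of `size`.

```cpp
#include <cassert>
#include <cstddef>

// Largest power of two that divides `size`, i.e. the lowest set bit of `size`.
// Returns 0 for size == 0.
constexpr std::size_t inferred_alignment(std::size_t size) {
    return size & (size ^ (size - 1));
}

static_assert(inferred_alignment(4) == 4, "a 4-byte request");
static_assert(inferred_alignment(12) == 4, "12 = 4 * 3");
static_assert(inferred_alignment(16) == 16, "a 16-byte request, e.g. one SSE vector");
static_assert(inferred_alignment(24) == 8, "24 = 8 * 3");
static_assert(inferred_alignment(9) == 1, "odd sizes cannot need alignment");
```

The pessimism shows in the 16-byte case: a struct of two doubles has size 16 but typically only needs 8-byte alignment.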

Unfortunately, the deallocator in most cases doesn't receive the size parameter, so you can't retrieve the offset using the same rule. However, if you encode the actual offset used during allocation in some fashion in the canary bytes immediately before the object, you can check the minimum size canary first, and from that recover the actual offset, original pointer, and check the full canary.

In your case, probably support for 16 byte alignment will suffice. Then you just need

auto align_16b = !(size & 0x0F);

and have a different canary for 16 byte aligned allocations. operator delete() then tests the preceding 8 bytes against both the 8 byte canary and the latter half of the 16 byte canary.
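A rough sketch of that two-canary scheme follows. It is written as free functions (`checked_alloc`, `checked_free`) rather than as replacement operators, and the canary contents `CANARY8` and `CANARY16` are invented for illustration. It assumes `std::malloc` returns memory aligned to at least 16 bytes, which is typical on 64-bit platforms but not guaranteed everywhere.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>

// Hypothetical canaries. The latter half of CANARY16 must differ from CANARY8,
// otherwise checked_free cannot tell the two layouts apart.
constexpr char CANARY8[]  = "str1med";          // 7 chars + NUL = 8 bytes
constexpr char CANARY16[] = "str1med-align16";  // 15 chars + NUL = 16 bytes
static_assert(sizeof(CANARY8) == 8, "short canary must be 8 bytes");
static_assert(sizeof(CANARY16) == 16, "long canary must be 16 bytes");

void* checked_alloc(std::size_t size) {
    // Sizes that are multiples of 16 might need 16-byte alignment.
    const bool align_16b = !(size & 0x0F);
    const std::size_t offset = align_16b ? sizeof(CANARY16) : sizeof(CANARY8);
    const char* canary = align_16b ? CANARY16 : CANARY8;
    char* base = static_cast<char*>(std::malloc(size + offset));
    if (!base) return nullptr;
    std::memcpy(base, canary, offset);
    return base + offset;
}

void checked_free(void* p) {
    if (!p) return;
    char* user = static_cast<char*>(p);
    char* tail = user - 8;  // the 8 bytes right before the object
    if (std::memcmp(tail, CANARY8, 8) == 0) {
        std::free(tail);    // short layout: the block starts 8 bytes back
        return;
    }
    // Otherwise it has to be the long layout: check its latter half first,
    // then the full canary, before touching the block start.
    assert(std::memcmp(tail, CANARY16 + 8, 8) == 0);
    char* base = user - 16;
    assert(std::memcmp(base, CANARY16, 16) == 0);
    std::free(base);
}
```

The order of the checks matters: the 8 bytes before the object always exist, whereas reading 16 bytes back is only valid once the short canary has been ruled out.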

Important note: When the alignment requirement is greater than the alignment the underlying allocator provides, the offset may end up being different from the alignment. In this case operator delete() only needs to figure out the offset and doesn't care about the alignment requirement.
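To illustrate that note, here is a sketch of the usual over-allocate-and-round-up approach on top of plain `std::malloc`. The names `offset_alloc` and `offset_free` are hypothetical, the canary is left out for brevity, and the offset is stored in a single byte, so this version only handles alignments up to 128.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Returns `size` bytes aligned to `alignment` (a power of two, at most 128).
// The byte just before the returned pointer records how far the pointer was
// moved from the start of the malloc'ed block.
void* offset_alloc(std::size_t size, std::size_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    assert(alignment <= 128);  // the offset has to fit into one byte
    unsigned char* base =
        static_cast<unsigned char*>(std::malloc(size + alignment));
    if (!base) return nullptr;
    const std::uintptr_t raw = reinterpret_cast<std::uintptr_t>(base);
    // Distance to the next aligned address, always in 1..alignment so that
    // there is room for the offset byte. It depends on what malloc returned.
    const std::size_t offset = alignment - (raw & (alignment - 1));
    unsigned char* user = base + offset;
    user[-1] = static_cast<unsigned char>(offset);
    return user;
}

// Needs neither the size nor the alignment: the stored offset is enough.
void offset_free(void* p) {
    if (!p) return;
    unsigned char* user = static_cast<unsigned char*>(p);
    std::free(user - user[-1]);
}
```

Two allocations with the same alignment request can end up with different offsets here, which is why the deallocation side reads the offset back instead of recomputing it.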

Mmh. This answer seems to be better suited to the motivating question at codereview than to this one? It seems to show how to provide a valid alignment in absence of alignment information (as is the case when implementing a replacement `operator new` (1-4)) but does not seem to contain any information about _why one should care_ to do so. — Zsar, Feb 17 '18 at 12:43
@Zsar: Nope, questions and answers about fixing code are off-topic on code review. Further, this question doesn't ask why alignment is important, it asks in what cases the naive implementation doesn't yield sufficient alignment. Which I answered in my first paragraph. — Ben Voigt, Feb 17 '18 at 15:31

score 1 · Answer 2 · answered Feb 11 '18 at 00:48

Check it out:

#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstdlib>
#include <cstring>

#define PREFIX "str1med"  // canary size = 8

constexpr std::size_t OFFSET = sizeof(PREFIX);

void *operator new(std::size_t size) {
  void * pointer = std::malloc(size + OFFSET);
  std::memcpy(pointer, PREFIX, OFFSET);
  return reinterpret_cast<char*>(pointer) + OFFSET;
}

void operator delete(void * untypedPointer) {
  char * pointer = reinterpret_cast<char*>(untypedPointer);
  pointer -= OFFSET;
  assert(std::memcmp(pointer, PREFIX, OFFSET) == 0);
  std::free(pointer);
}

int main() {
    int *p = new int;
    printf("%p\n", static_cast<void*>(p));
}

This should print an 8-byte-aligned pointer value, which is suitably aligned for an int. But now change the canary string value from "str1med" to, let's say, "str10med".

int main() {
    int *p = new int;
    printf("%p\n", static_cast<void*>(p));
}

Now this prints a pointer value whose low-order four bits are 0x9. It's not suitably aligned for an int anymore!

The problem with your original code is that if a maintainer or refactorer changes the length of the string PREFIX, it breaks operator new. This is highly unexpected behavior to most programmers; we're not used to thinking of the length of a string as "significant" unless it's called out explicitly. You could mitigate this problem in your code by calling out the dependency explicitly:

constexpr std::size_t OFFSET = sizeof(PREFIX);
static_assert(OFFSET % 8 == 0, "preserve 8-byte alignment of new'ed chunks");

(This would also tell the reader that 16-byte alignment of new'ed chunks is explicitly not one of your goals.)
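The same check can be written for any alignment, which makes it easy to see which canary lengths are harmless for which requirement. This is only a sketch; the helper name `offset_preserves_alignment` is made up, and it assumes the pointer coming out of `malloc` is itself aligned to the value being tested.

```cpp
#include <cassert>
#include <cstddef>

// True if adding `offset` bytes to a pointer that is aligned to `alignment`
// yields a pointer that is still aligned to `alignment`.
constexpr bool offset_preserves_alignment(std::size_t offset,
                                          std::size_t alignment) {
    return offset % alignment == 0;
}

static_assert(offset_preserves_alignment(sizeof("str1med"), 8),
              "the 8-byte canary keeps 8-byte alignment");
static_assert(!offset_preserves_alignment(sizeof("str10med"), alignof(int)),
              "the 9-byte canary misaligns even an int");
static_assert(!offset_preserves_alignment(sizeof("str1med"), 16),
              "the 8-byte canary loses 16-byte alignment");
```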

This demonstrates the behaviour nicely, but it fails to convey how said behaviour is a problem: If I assign a value via `*p = -1;` and then `printf("%d\n", *p);` the result is still -1. If I allocate a `struct test { int i = -1;}` and then `printf("%d\n", p->i);` the result is still -1. How does it break anything and _what_ breaks if the least significant byte is 0x9 instead of 0x8? ... I mean, the original canary _was_ `"str1med0"`, so it _did_ have a length of 9 and yet, no undesired behaviour was ever encountered. — Zsar, Feb 11 '18 at 11:04
You'd have to use a non-x86 platform (such as PowerPC or ARM) to start seeing real frequent problems; on x86, misaligned loads and stores usually Just Work. Googling "misaligned loads on x86" turned up [this](https://www.kernel.org/doc/Documentation/unaligned-memory-access.txt) and [this](http://pzemtsov.github.io/2016/11/06/bug-story-alignment-on-x86.html). — Quuxplusone, Feb 11 '18 at 17:57

score 0 · Answer 3 · answered Feb 17 '18 at 12:34

[Note: This answer references Standard draft n4659.]

Given the quote of $6.11 from the question:

An alignment is an implementation-defined integer value representing the number of bytes between successive addresses at which a given object can be allocated.

it is easy to find that, should alignment requirements be violated, the most basic concept of object lifetime becomes undefined: Under $6.8.1.1 we can read

[The lifetime of an object of type T begins when] storage with the proper alignment and size for type T is obtained, [...]

An incorrectly aligned object therefore never begins its lifetime. It then stands to reason that the following behaviours would be "okay" for a Standard-compliant compiler:

- never calling the constructor
- immediately calling the destructor
- Nasal Demons

Thus, alignment must be correct for the program to be well-formed.

But when is alignment incorrect?

As seen in the question, one possible interpretation of $6.11 is the following:

Let A be the alignment required of class C.
- then objects Oi of class C must have at least A bytes between them
- if Oi have exactly A between them, they form an array

Under this interpretation, adding a prefix of arbitrary size to the memory to be occupied by a single object is always okay. Similarly, adding such a prefix to the memory to be occupied by an array of objects is always okay.

There is however a more strict interpretation of the same paragraph:

Let A be the alignment required of class C.
- then objects Oi of class C must have exactly a multiple of A between them
- if Oi have exactly A between them, they form an array

This interpretation would mean that alignment A describes exactly the memory addresses where an object O may reside, via the relation &Oi = 0 + A * j for i, j in N. Given that memory is virtual and, even if allocated, may or may not physically exist, this is at least somewhat surprising.

However, @Peter has commented that this is indeed the correct interpretation. It immediately follows that prefixing the memory allocated for any object must ensure that (A + sizeof(prefix)) % A = 0, that is, sizeof(prefix) must be a multiple of A, or else Nasal Demons.

[Note: This answer could be improved by providing a quote from the standard which supports the latter interpretation. Sadly, even after a token amount of scanning, I have not yet been able to come up with a fitting passage.]

You have to realize that (all real-world implementations of) the virtual memory mapping translates an entire page at a time, so it has no effect on the low group of bits which affect alignment. — Ben Voigt, Feb 17 '18 at 15:34
Also, due to the round-trip requirements on pointer casts, at a minimum the definition of alignment needs to include the entire set of types with the same alignment `A = alignof(C)` — Ben Voigt, Feb 17 '18 at 15:37
@BenVoigt : "the low group of bits which affect alignment" - Err, I fail to parse this, I fear. Could you rephrase, please? In any case (any possible grouping) I am not sure what you mean to say. Is it that alignment requirements do not skip page boundaries, therefore ` + A * j = &Oi for i, j in N` is a valid relaxation of the stricter interpretation? So at every whole multiple of `std::hardware_constructive_interference_size` the requirement resets? — Zsar, Feb 18 '18 at 00:56
It is more that alignment requirements are powers of two, and page sizes are larger powers of two. So that check whether `p % A == 0` can be rewritten as `(p & (A-1)) == 0` which not only is immensely easier to implement in hardware, it actually means the alignment is related to the value of the low order bits. **And this is no accident, the whole reason that alignment requirements exist is the way address lines are connected to memory chips on physical computing systems.** Maybe this old answer of mine will help: https://stackoverflow.com/a/3903577/103167 — Ben Voigt, Feb 26 '18 at 19:43
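The equivalence from the last comment can be checked with a small sketch. The helper names `aligned_mod`, `aligned_mask` and `checks_agree` are made up for illustration, and addresses are modelled as plain integers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Alignment test written with the remainder operator.
inline bool aligned_mod(std::uintptr_t p, std::uintptr_t a) {
    return p % a == 0;
}

// The same test on the low-order bits only; valid when `a` is a power of two.
inline bool aligned_mask(std::uintptr_t p, std::uintptr_t a) {
    return (p & (a - 1)) == 0;
}

// Compares both tests for every address below `limit` and every
// power-of-two alignment from 1 to 64.
inline bool checks_agree(std::uintptr_t limit) {
    for (std::uintptr_t a = 1; a <= 64; a *= 2) {
        for (std::uintptr_t p = 0; p < limit; ++p) {
            if (aligned_mod(p, a) != aligned_mask(p, a)) return false;
        }
    }
    return true;
}
```

Because only the bits below the alignment are inspected, adding any multiple of a larger power of two, such as a 4096-byte page size, leaves the result unchanged.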
