
The C++ standard very clearly and explicitly states that using delete or delete[] on a void-pointer is undefined behavior, as quoted in this answer:

This implies that an object cannot be deleted using a pointer of type void* because there are no objects of type void.

However, as I understand it, delete and delete[] do just two things:

  • Call the appropriate destructor(s)
  • Invoke the appropriate operator delete function, typically the global one
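
To make those two steps concrete, here is a minimal sketch that performs them by hand. The names `Tracked`, `destroy_and_free`, and `demo_two_steps` are hypothetical and exist only for illustration; the storage is obtained directly from `::operator new` so that passing it to `::operator delete` is well-defined.

```cpp
#include <cassert>
#include <new>

// Hypothetical type whose only job is to count destructor calls.
struct Tracked {
    static int destroyed;
    ~Tracked() { ++destroyed; }
};
int Tracked::destroyed = 0;

// Performs by hand the two steps that a delete-expression combines.
// Precondition: p points to a Tracked built in storage from ::operator new.
void destroy_and_free(Tracked* p) {
    assert(p != nullptr);
    p->~Tracked();         // step 1: run the destructor
    ::operator delete(p);  // step 2: return the storage to the deallocation function
}

// Builds one object in raw storage, tears it down, and reports how many
// destructors ran.
int demo_two_steps() {
    Tracked::destroyed = 0;
    void* raw = ::operator new(sizeof(Tracked));
    Tracked* obj = ::new (raw) Tracked;
    destroy_and_free(obj);
    return Tracked::destroyed;
}
```

With a `void*` operand the first step has no type to work from, which is the part of the operation the question is about.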

There is a single-argument operator delete (as well as operator delete[]), and that single argument is void* ptr.

So, when the compiler encounters a delete-expression with a void* operand, it of course could maliciously do some completely unrelated operation, or simply output no code for that expression. Better yet, it could emit a diagnostic message and refuse to compile, though the versions of MSVS, Clang, and GCC I've tested don't do this. (The latter two emit a warning with -Wall; MSVS with /W3 does not.)

But there's really only one sensible way to deal with each of the above steps in the delete operation:

  • void* specifies no destructor, so no destructors are invoked.
  • void is not a type and therefore cannot have a specific corresponding operator delete, so the global operator delete (or the [] version) must be invoked. Since the argument to the function is void*, no type conversion is necessary, and the operator function must behave correctly.

So, can common compiler implementations (which, presumably, are not malicious, or else we could not even trust them to adhere to the standard anyway) be relied on to follow the above steps (freeing memory without invoking destructors) when encountering such delete expressions? If not, why not? If so, is it safe to use delete this way when the actual type of the data has no destructors (e.g. it's an array of primitives, like long[64])?

Can the global delete operator, void operator delete(void* ptr) (and the corresponding array version), be safely invoked directly for void* data (assuming, again, that no destructors ought to be called)?

Kyle Strand
  • I wouldn't take "this answer", which I wrote a long time ago, as normative. –  Aug 01 '18 at 00:12
  • @NeilButterworth Well, it does quote the standard, does it not? Are you implying that a more recent standard might have changed the status of this operation? – Kyle Strand Aug 01 '18 at 00:15
  • Yes, it's entirely possible. I don't think it has changed, but I no longer track the standard. –  Aug 01 '18 at 00:17
  • Sure, why not? The language specification does not impose any requirements (that’s what “undefined behavior” means), so go ahead and guess what your implementation might do. What could go wrong? – Pete Becker Aug 01 '18 at 00:17
  • The standard says it is UB. UB can't happen in conforming code. The optimizer can take advantage of this to remove the entire code path that contains UB. See examples: https://en.cppreference.com/w/cpp/language/ub and http://blog.llvm.org/2011/05/what-every-c-programmer-should-know.html – Richard Critten Aug 01 '18 at 00:24
  • @RichardCritten And you wouldn't consider that to be malicious compliance? – Kyle Strand Aug 01 '18 at 00:28
  • No. Writing an optimizer is hard. Writing an optimizer that does something sensible with undiagnosed invalid code (definition of UB) is impossible. – Richard Critten Aug 01 '18 at 00:35
  • @PeteBecker To counter snark with snark: was *anything* safe in C++ before 1998? More to the point, has there ever been, or do you think there ever will be, a *perfectly* conforming implementation of the standard? Less snarkily: I am not asking whether this is a *good* idea. But it's still worth knowing; in my case, I am working with legacy code that uses this antipattern, and need to know how urgent it is that we fix it. – Kyle Strand Aug 01 '18 at 00:36
  • The question seems to boil down to "Can I trust this particular compiler to know what I meant?" . And I'm not sure how anyone can really help with that. – M.M Aug 01 '18 at 00:54
  • @M.M. I did not mention a particular compiler (except in my edit, which is tangential). And I don't think the question has anything to do with what's "meant" in a mind-reading sense. When I say I can't think of any non-malicious way for a compiler to generate code from such an expression that would be unsafe, I mean that literally. – Kyle Strand Aug 01 '18 at 00:58
  • fwiw, i worked with this anti-pattern (against my will) for several years using MSVC. On Windows CE devices I fought heap corruption issues the entire time, while the desktop client seemed to work fine. Of course, that won't tell you much since our CE OS was customized by raving lunatics and the general quality of the code was so-so. I ended up embedding checksums in the larger structures :( Awful. – zzxyz Aug 01 '18 at 01:09
  • @KyleStrand — “working with legacy code” makes this an entirely different question. The question is not whether this is theoretically sound; it’s how can you best ensure that your application will work correctly. You have two choices: fix it, or write a bunch of tests and cross your fingers for luck. The latter is scary, but may be necessary. Good luck! – Pete Becker Aug 01 '18 at 02:05
  • I made a further comment in the [chat](https://chat.stackoverflow.com/rooms/177200/c-deleting-a-void-pointer-is-ub). – Arne Vogel Aug 01 '18 at 13:49
  • @RichardCritten: Writing an optimizer that can handle code written in dialects that supplement the behaviors defined by the Standard with those which are commonly used in fields like embedded and systems programming isn't particularly difficult if one makes any bona fide effort whatsoever to do so. – supercat Aug 02 '18 at 22:53
  • @M.M: No mind reading is required. If one part of the Standard or an implementation's documentation describes the behavior of a piece of code, but it falls in a general category of actions which another part of the Standard says is undefined, quality implementations targeting a particular platform and field should give precedence to the part that treats it as defined in cases where doing so is likely to be useful and practical for code targeting that platform and field. – supercat Aug 02 '18 at 23:00
  • @supercat Thank you for that last comment--it really gets at the heart of why I asked this. It's not a logical *inconsistency* in the standard per se, but it really seems like (at least for arrays, where allocation-size metadata is required) the requirements that *are* imposed by the standard would make it trickier *not* to provide a reasonable behavior than to just do the deallocation. – Kyle Strand Aug 02 '18 at 23:04
  • @KyleStrand: The authors of C89 and all C or C++ standards since have regarded as equivalent actions whose behavior is actually defined, and actions whose behavior obviously should be defined but actually isn't. In all of them, for example, `struct S {int i;} s; s.i=1;` violates a runtime constraint since it uses an lvalue or glvalue of type `int` to access a `struct S`, but such behavior would be sufficiently obviously absurd that even the authors of gcc and clang would recognize it as stupid. – supercat Aug 02 '18 at 23:15

5 Answers


A void* is a pointer to an object of unknown type. If you do not know the type of something, you cannot possibly know how that something is to be destroyed. So I would argue that, no, there is not "really only one sensible way to deal with such a delete operation". The only sensible way to deal with such a delete operation is to not deal with it, because there is simply no way you could possibly deal with it correctly.

Therefore, as the original answer you linked to said: deleting a void* is undefined behavior ([expr.delete] §2). The footnote mentioned in that answer remains essentially unchanged to this day. I'm honestly a bit astonished that this is simply specified as undefined behavior rather than making it ill-formed, since I cannot think of any situation in which this could not be detected at compile time.

Note that, starting with C++14, a new expression does not necessarily imply a call to an allocation function. And neither does a delete expression necessarily imply a call to a deallocation function. The compiler may call an allocation function to obtain storage for an object created with a new expression. In some cases, the compiler is allowed to omit such a call and use storage allocated in other ways. This, e.g., enables the compiler to sometimes pack multiple objects created with new into one allocation.

Is it safe to call the global deallocation function on a void* instead of using a delete expression? Only if the storage was allocated with the corresponding global allocation function. In general, you can't know that for sure unless you called the allocation function yourself. If you got your pointer from a new expression, you generally don't know if that pointer would even be a valid argument to a deallocation function, since it may not even point to storage obtained from calling an allocation function. Note that knowing which allocation function must've been used by a new expression is basically equivalent to knowing the dynamic type of whatever your void* points to. And if you knew that, you could also just static_cast<> to the actual type and delete it…
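
For legacy code where the real type is known at the point of deletion, the cast mentioned above could look like the following sketch. The helper names `make_longs`, `free_longs`, and `demo_round_trip` are hypothetical, and the example assumes the buffer really was created with `new long[64]`.

```cpp
#include <cassert>

// Hypothetical legacy-style factory: the element type is erased on return.
void* make_longs() {
    return new long[64];
}

// Restores the original pointer type before deleting, so the
// delete-expression sees the same type the new-expression created.
void free_longs(void* erased) {
    delete[] static_cast<long*>(erased);
}

// Writes through the erased pointer, then releases it the well-defined way.
bool demo_round_trip() {
    void* p = make_longs();
    long* data = static_cast<long*>(p);
    data[63] = 7;
    bool ok = (data[63] == 7);
    assert(ok);
    free_longs(p);
    return ok;
}
```

The cast costs nothing at run time; it only gives the compiler the type information that the `void*` threw away.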

Is it safe to deallocate the storage of an object with trivial destructor without explicitly calling the destructor first? Based on [basic.life] §1.4, I would say yes. Note that, if that object is an array, you might still have to call the destructors of any array elements first, unless they are also trivial.
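
If you do rely on that, one way to keep the assumption honest is to let the compiler check it. The sketch below uses `std::is_trivially_destructible` from `<type_traits>`; the helper `release_trivial`, the type `Owner`, and `demo_release_trivial` are hypothetical names, and the storage is assumed to come straight from `::operator new`.

```cpp
#include <cassert>
#include <new>
#include <type_traits>

// Hypothetical helper: frees storage obtained from ::operator new without
// running a destructor, and refuses to compile for types that need one.
template <typename T>
void release_trivial(T* p) {
    static_assert(std::is_trivially_destructible<T>::value,
                  "T has a non-trivial destructor; destroy it properly instead");
    ::operator delete(static_cast<void*>(p));
}

// Example of a type the helper would reject at compile time.
struct Owner {
    ~Owner() {}  // user-provided, hence non-trivial
};

// Creates a long in raw storage, reads it, and frees the storage
// without a destructor call.
long demo_release_trivial() {
    void* raw = ::operator new(sizeof(long));
    long* value = ::new (raw) long(42);
    long copy = *value;
    assert(copy == 42);
    release_trivial(value);
    return copy;
}
```

Calling `release_trivial` on an `Owner*` would fail the `static_assert`, which is the point: the check moves the "no destructors ought to be called" assumption from a comment into the type system.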

Can you rely on common compiler implementations to produce the behavior you deem reasonable? No. Having a formal definition of what exactly you can rely on is literally the whole point of having a standard in the first place. Assuming you have a standard-conforming implementation, you can rely on the guarantees the standard gives you. You can also rely on any additional guarantees the documentation of a particular compiler may give you, so long as you use that particular version of that particular compiler to compile your code. Beyond that, all bets are off…

Michael Kenzel
  • You might add to your first and second sentences, "From the point of view of the compiler" or the like, to avoid the impression you are misunderstanding the OP's stipulations. I think this is an excellent answer. We understand this is a subset of `void *`. The compiler does not. – zzxyz Aug 01 '18 at 01:58
  • "Calling no destructor would be just as good as calling any random destructor" is a stretch. I struggle to find any case where I'd rather call a random destructor over no destructor. – zneak Aug 01 '18 at 02:02
  • Second, when considering whether it's safe to call the global deallocation function, it's useful to keep in mind that while you don't really know, the compiler doesn't really know, either. I haven't thought it through, but my intuition is that this determination is uncomputable, and the optimization benefits of knowing are negligible, so it's extremely unlikely that the compiler will care. Of course, it's important that you don't screw up, but most environments are controlled enough that there aren't that many deallocation functions to choose from. – zneak Aug 01 '18 at 02:03
  • " a bit astonished that this is simply specified as undefined behavior rather than making it ill-formed" - a reasonable guess would be because Undefined Behavior does not require a diagnostic, and as this question points out there is a reasonable behavior. – MSalters Aug 01 '18 at 10:44
  • It would not be difficult for a compiler's `new` operator to include within the allocation information about what destructors, if any, the object has, and for `delete` to make use of that information regardless of the pointer type fed to it. I would not be at all surprised if some compilers actually do that. If some compilers support a construct in useful fashion but others don't, the Standard will usually allow compilers to support the behavior or not at their leisure, hopefully based upon what will benefit their customers. – supercat Aug 01 '18 at 14:50
  • I like most of this answer, but I completely agree with @zneak's criticisms, and I still hold by my assertion in my question (and supercat's comment and answer) that the "reasonable" behavior would in no way be "magical". – Kyle Strand Aug 02 '18 at 16:14
  • The statement that calling no destructor would be "just as good as calling any random one" was meant to be hyperbole. I just wanted to emphasize that what has simply been declared as "the only reasonable behavior" is not necessarily all that reasonable once you stop to think about it. The compiler cannot know which destructor or which deallocation function to call. Not calling a destructor and calling a likely wrong deallocation function is not something I would agree to call a reasonable choice. But I can see why someone might take issue with that statement, so I removed those sentences. – Michael Kenzel Aug 02 '18 at 17:42
  • @zneak As has already been pointed out by supercat, the compiler could simply store information about which destructor and which deallocation function to call for each allocation, so it doesn't have to be uncomputable. Doing so would, however, certainly go against the "you don't pay for what you don't use" philosophy. Concerning the "controlled environment", consider that C++ supports user-defined allocation/deallocation functions, so the set of deallocation functions is potentially arbitrarily large… – Michael Kenzel Aug 02 '18 at 17:51
  • @KyleStrand calling it "magical" was indeed unnecessary. I got carried away. I removed that bit… – Michael Kenzel Aug 02 '18 at 18:09
  • @MichaelKenzel, in this context, "uncomputable" means that the *compiler* cannot make that analysis, and that therefore it can't perform optimizations based on it. The compiler can defer verification work to the runtime, where every value is concretized, but if it did that, then we wouldn't have this problem in the first place: you'd always get the right destructor and the right deallocator. – zneak Aug 02 '18 at 18:46
  • Thanks. I think the answer is much better now. – Kyle Strand Aug 02 '18 at 19:37
  • ....though I would quibble with how much of a problem it would be in practice to just assume that the allocated memory came from the default allocator. As far as I know, having multiple allocators in a program is quite niche. – Kyle Strand Aug 02 '18 at 19:38
  • @zneak ok, I misunderstood that, I thought you meant uncomputable in general. Just for the compiler, I would also think it's uncomputable. – Michael Kenzel Aug 02 '18 at 22:13
  • @MichaelKenzel: The cost need not be very great. A single pointer attached to the allocation would easily suffice, and in some memory-management implementations the cost could be zero for types without destructors. – supercat Aug 03 '18 at 02:18

If you want to invoke the deallocation function, then just call the deallocation function.

This is good:

void* p = ::operator new(size);

::operator delete(p);  // only requires that p was returned by ::operator new()

This is not:

void* p = new long(42);

delete p;  // forbidden: static and dynamic type of *p do not match, and static type is not polymorphic

But note, this also is not safe:

void* p = new long[42];

::operator delete(p); // p was not obtained from allocator ::operator new()
Ben Voigt
  • Why is the last bit of code not safe? Also, I'd really like a concrete explanation or example of how undesirable behavior could *actually be triggered in practice*. – Kyle Strand Aug 01 '18 at 00:40
  • @KyleStrand - Cell-based allocation could trigger undesirable behavior pretty easily if you decided not to bother storing the allocation size. (You look up the cell size based on the object size, and that's how you know what allocation table to go to). It's a little bit of a stretch, for sure..but I've seen some fairly bizarre behavior from memory allocation routines. (Granted, none in commercial compilers) – zzxyz Aug 01 '18 at 00:44
  • @KyleStrand: Array new can put metadata (Standardese: *supplemental information*) at the beginning of the allocation, in front of the content. Then the `operator delete[](void*)` call needs the address of the allocation, and passing the address of the content, which is different, will fail. – Ben Voigt Aug 01 '18 at 00:52
  • @KyleStrand: Even when not dealing with an array, there is potentially trouble, because the Standard allows `new long(42)` to invoke either of two allocators, with or without an extra alignment argument, and the deallocator is required to match. – Ben Voigt Aug 01 '18 at 00:56
  • Here's the quote from the Standard which is important: " When a delete-expression is executed, the selected deallocation function shall be called with the address of the most-derived object in the delete object case, or the address of the object **suitably adjusted for the array allocation overhead** (8.3.4) in the delete array case, as its first argument." – Ben Voigt Aug 01 '18 at 01:00
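
The array allocation overhead discussed in these comments can be observed with a class-specific `operator new[]`. This is only a sketch: `Elem`, `Report`, and `measure_array_overhead` are hypothetical names, and whether any overhead exists at all, and how large it is, depends on the implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical element type that records what its operator new[] was asked for.
struct Elem {
    static std::size_t requested;  // bytes requested from operator new[]
    static void* block;            // address the allocation function returned
    int value;
    ~Elem() {}  // non-trivial, so delete[] must be able to recover the element count

    static void* operator new[](std::size_t bytes) {
        requested = bytes;
        block = ::operator new(bytes);
        return block;
    }
    static void operator delete[](void* p) { ::operator delete(p); }
};
std::size_t Elem::requested = 0;
void* Elem::block = nullptr;

struct Report {
    std::size_t overhead;   // bytes requested beyond 8 * sizeof(Elem)
    std::ptrdiff_t offset;  // distance from the raw block to the first element
};

// Allocates eight elements and compares the raw block with the array itself.
Report measure_array_overhead() {
    Elem* arr = new Elem[8];
    assert(Elem::requested >= 8 * sizeof(Elem));
    Report r;
    r.overhead = Elem::requested - 8 * sizeof(Elem);
    r.offset = reinterpret_cast<unsigned char*>(arr) -
               static_cast<unsigned char*>(Elem::block);
    delete[] arr;  // hands the adjusted address back to Elem::operator delete[]
    return r;
}
```

On implementations that store an element count in front of the array, `offset` is non-zero, and that is exactly the case in which passing the array pointer straight to a deallocation function hands it the wrong address.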

While the Standard would allow an implementation to use the type passed to delete to decide how to clean up the object in question, it does not require that implementations do so. The Standard would also allow an alternative (and arguably superior) approach based on having the memory-allocating new store cleanup information in the space immediately preceding the returned address, and having delete implemented as a call to something like:

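// Type of the cleanup routine that new stores just before each object.
typedef void (*__cleanup_function)(void*);
void __delete(void* p)
{
  // Fetch the routine from the slot preceding the object and call it on the object.
  (*((__cleanup_function*)p)[-1])(p);
}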

In most cases, the cost of implementing new/delete in such fashion would be relatively trivial, and the approach would offer some semantic benefit. The only significant downside of such an approach is that implementations which document the inner workings of their new/delete, and whose documented design can't support a type-agnostic delete, would have to break any code that relies upon those documented inner workings.
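
The same idea can be sketched in user code, without touching the implementation's own new/delete. Everything here is hypothetical (`cleanup_fn`, `kPrefix`, `run_destructor`, `erased_new`, `erased_delete`, `Probe`), and the sketch assumes types with no extended alignment and a prefix of `alignof(std::max_align_t)` bytes in front of each object.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <new>
#include <utility>

using cleanup_fn = void (*)(void*);

// Bytes reserved in front of every object; sized so the object stays max-aligned.
constexpr std::size_t kPrefix = alignof(std::max_align_t);
static_assert(kPrefix >= sizeof(cleanup_fn), "prefix must hold a function pointer");

// Destroys the T that lives at obj.
template <typename T>
void run_destructor(void* obj) { static_cast<T*>(obj)->~T(); }

// Allocates prefix + object, records how to destroy T, and constructs T.
template <typename T, typename... Args>
void* erased_new(Args&&... args) {
    static_assert(alignof(T) <= alignof(std::max_align_t),
                  "over-aligned types are not handled by this sketch");
    unsigned char* base =
        static_cast<unsigned char*>(::operator new(kPrefix + sizeof(T)));
    cleanup_fn fn = &run_destructor<T>;
    std::memcpy(base, &fn, sizeof fn);
    try {
        return ::new (static_cast<void*>(base + kPrefix)) T(std::forward<Args>(args)...);
    } catch (...) {
        ::operator delete(base);  // construction failed: release the raw block
        throw;
    }
}

// Type-agnostic delete: reads the stored routine, destroys, then frees.
void erased_delete(void* obj) {
    if (obj == nullptr) return;
    unsigned char* base = static_cast<unsigned char*>(obj) - kPrefix;
    cleanup_fn fn;
    std::memcpy(&fn, base, sizeof fn);
    fn(obj);
    ::operator delete(base);
}

// Hypothetical probe type that counts destructor calls.
struct Probe {
    static int destroyed;
    ~Probe() { ++destroyed; }
};
int Probe::destroyed = 0;

int demo_erased_delete() {
    Probe::destroyed = 0;
    void* p = erased_new<Probe>();
    erased_delete(p);  // no type named here, yet ~Probe runs
    assert(Probe::destroyed == 1);
    return Probe::destroyed;
}
```

Pointers produced by `erased_new` must only ever go to `erased_delete`, never to a plain delete-expression, since the address handed out is not the one the allocation function returned.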

Note that if passing a void* to delete were a constraint violation, that would forbid implementations from providing a type-agnostic delete even if they would be easily capable of doing so, and even if some code written for them would rely upon such ability. The fact that code relies upon such an ability would make it portable only to implementations that can provide it, of course, but allowing implementations to support such abilities if they choose to do so is more useful than making it a constraint violation.

Personally, I would have liked to see the Standard offer implementations two specific choices:

  1. Allow passing a void* to delete and delete the object using whatever type had been passed to new, and define a macro indicating support for such a construct.

  2. Issue a diagnostic if a void* is passed to delete, and define a macro indicating it does not support such a construct.

Programmers whose implementations supported type-agnostic delete could then decide whether the benefit they could receive from such feature would justify the portability limitations imposed by using it, and implementers could decide whether the benefits of supporting a wider range of programs would be sufficient to justify the small cost of supporting the feature.

supercat
  • I think storing cleanup information in an allocation is, in general, not possible for a conforming implementation. [expr.new] [§11](http://eel.is/c++draft/expr.new#11) and [§12](http://eel.is/c++draft/expr.new#12) are, unfortunately, quite specific when it comes to the size of allocations made by a new expression. Arrays are basically the only exception where the compiler is allowed to request additional storage to what would be needed just to hold the created objects. – Michael Kenzel Aug 02 '18 at 22:22
  • @MichaelKenzel: If a user-supplied allocation function has not been installed, an implementation would so far as I can tell be free to allocate things as it sees fit (arguably it would be free to do so even when such a function has been called, though quality implementations should typically call a user-provided function in preference to requesting their own heap allocation). It would seem, though, that the intent of the Standard is to forbid what had previously been a useful approach. – supercat Aug 02 '18 at 22:50
  • Note that these guarantees concerning the size of allocations made by `new` were already present in the C++03 standard, so that's not really a new addition. I'm not aware of any implementations that would actually have been doing anything like this "previously". If there were, I would argue that they could only have been doing so in violation of the standard. An implementation could always simply keep its own datastructures like, e.g., a map to track cleanup info for all active allocations. That would introduce a quite significant overhead of course… – Michael Kenzel Aug 02 '18 at 23:55
  • @MichaelKenzel: Many implementations of C++ predated the first published Standard. Implementing `delete` by using information stored by `new` for all types would not have been difficult, and pre-standard implementations did quite a few interesting things that have since fallen by the wayside. – supercat Aug 03 '18 at 02:14

void* specifies no destructor, so no destructors are invoked.

That is most likely one of the reasons it's not permitted. Deallocating the memory that backs a class instance without calling the destructor for said class is just all around a really really bad idea.

Suppose, for example, the class contains a std::map that has a few hundred thousand elements in it. That represents a significant amount of memory. Doing what you're proposing would leak all of that memory.

dgnuff
  • My question explicitly specifies that I'm only interested in the case where no destructors would be involved even in the correct `delete` expression. This means no non-POD classes. – Kyle Strand Aug 01 '18 at 00:37
  • Note, though, that you're correct; this is indeed the stated rationale for that footnote in the standard (and for the GCC and Clang warnings). – Kyle Strand Aug 01 '18 at 00:39

A void doesn't have a size, so the compiler has no way of knowing how much memory to deallocate.

How should the compiler handle the following?

struct s
{
    int arr[100];
};

void* p1 = new int;
void* p2 = new s;
delete p1;
delete p2;
Mark Ransom
  • As I noted in my question, the deallocator function (operator delete) takes `void*`, so the size data is stored in memory at runtime rather than inferred from the type system. – Kyle Strand Aug 01 '18 at 03:09
  • @KyleStrand then why does the standard require both `delete` and `delete[]`? Surely if there's runtime information recorded then the difference is redundant. – Mark Ransom Aug 01 '18 at 03:27
  • I suppose I can imagine an implementation relying on type information to delete single items, but I'm not sure how it would handle a pointer to a base class, which can legally be used to delete an instance of a derived class. – Kyle Strand Aug 01 '18 at 03:52
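
As an illustration of the point made in the comments that block sizes can be tracked at run time rather than derived from the pointer's type, here is a toy allocator that keeps the size in a header in front of each block. The names `kHeader`, `toy_alloc`, and `toy_free` are hypothetical, and nothing here claims that any particular standard library works this way.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Bytes reserved in front of each block; chosen so the payload stays max-aligned.
constexpr std::size_t kHeader = alignof(std::max_align_t);
static_assert(kHeader >= sizeof(std::size_t), "header must hold a size");

// Allocates `bytes` of payload and records the size just before it.
void* toy_alloc(std::size_t bytes) {
    unsigned char* base =
        static_cast<unsigned char*>(std::malloc(kHeader + bytes));
    if (base == nullptr) return nullptr;
    std::memcpy(base, &bytes, sizeof bytes);
    return base + kHeader;
}

// Frees a block given only a void*, and reports the size it found in the header.
std::size_t toy_free(void* payload) {
    assert(payload != nullptr);
    unsigned char* base = static_cast<unsigned char*>(payload) - kHeader;
    std::size_t bytes = 0;
    std::memcpy(&bytes, base, sizeof bytes);
    std::free(base);
    return bytes;
}
```

For example, `toy_free(toy_alloc(sizeof(int)))` and `toy_free(toy_alloc(400))` each recover the right size from nothing but the address, which is the sense in which a deallocation function taking `void*` does not need the static type to know how much to release.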