4

Preface: I'm optimizing an old codebase.

I have a class named Token for which I added caching to for a subset of the tokens (but not all). Cached tokens may not be deleted as their pointers are stored in a permanent collection in memory for the whole lifetime of the program.

Unfortunately, the codebase has delete token all over the place. So what I did was add a bool cached member that is checked from inside Token::operator delete and destructor ~Token(), which returns from these respective functions immediately if cached is true.

My question is, is this undefined behavior, or is it okay for me to do? Is it okay to execute the delete operator on something that doesn't get deleted? Or will this bite me in the future?

class Token
{
    bool cached;
    void* data;

public:
    ~Token()
    {
        if (this->cached) return;
        free(data);
    }

    void operator delete(void* p)
    {
        if (((Token*)p)->cached) return;
        ::operator delete(p);
    }

    // operator new, constructor etc.

}
korri123
  • 347
  • 4
  • 10
  • add a minimal example of your `Token` class ... – OrenIshShalom Jun 04 '21 at 02:52
  • 1
    Checking `cached` inside of `~Token()` is useless, as the object is already in the process of being destroyed by that time. – Remy Lebeau Jun 04 '21 at 03:04
  • I added an example to clarify things. Though it just occurred to me now that if the class contains objects which have their own destructors, they would get destroyed and there's nothing I can do about that (not the case here but still). @RemyLebeau If I'm not mistaken the destructor is called before operator delete? – korri123 Jun 04 '21 at 03:12
  • So, the real goal is not preventing the destruction of `Token` objects, but the destruction of the `data` they contain internally? That is a completely different issue than what you originally presented. If that is the case, you probably don't need to overload the operators at all. It would help to see a [mcve] demonstrating how the tokens are actually being used. – Remy Lebeau Jun 04 '21 at 03:25
  • No, you misunderstand; I want neither Token nor data destroyed when cached is true. Token does not get freed if operator delete is configured to not free the underlying memory behind it. Neither does data if the destructor returns before freeing it. – korri123 Jun 04 '21 at 03:39
  • @korri123 your sentiment is you overload `operator delete`, then call `delete token` would not trigger the base delete but called your one , only do real delete on checking of `cache`. I think in `~Token` the `if` is not needed, so the token has `cached=false` already – nhatnq Jun 04 '21 at 03:40
  • Using `std::shared_ptr` would be cleaner IMO, but require to propagate th changes. – Jarod42 Jun 04 '21 at 07:49

1 Answers1

5

This is not OK. It is not UB to not free memory in operator delete, but it is UB to access anything inside the deleted object. delete token will first call the destructor for token, and then it will call operator delete. Even if you yourself do nothing inside the body of the destructor, once the body returns it will continue destroy all subobjects of the object, and once the whole destructor returns the whole object is considered to no longer exist. Even if your data members are scalars like bools or void*s, which implementations don't usually touch when destroying, the language considers them to be made inaccessible by the destructor. In your example, operator delete cannot access cached, and later, if you still have a "cached" pointer to the destroyed object around, using that pointer is also UB. This is why you receive a void* and not a Token* in operator delete, and also why operator delete is implicitly static (no this): it signifies that your object is already gone.

If you're in C++20 land, you can use destroying operator delete instead:

struct Token {
   // ...
   bool cached;
   void operator delete(Token *thiz, std::destroying_delete_t) {
       if(thiz->cached) return;
       thiz->~Token();
       ::operator delete(thiz);
   }
   ~Token() { /* clean up without worrying about cached */ }
};

When such an overload of operator delete exists, delete token will call the overload without calling the destructor itself, so you can safely choose to simply not destroy the object. Of course, now you're responsible for calling the destructor yourself.

If you can't do that, you're somewhat out of luck. You might try void Token::operator delete(void*) = delete; to find all uses of delete on Tokens so you can replace them. Preferably you'd replace your (I presume) Token*s with some kind of smart pointer (whether std::shared_ptr or something you write yourself), so that you no longer need to have as many deletes at all.

HTNW
  • 27,182
  • 1
  • 32
  • 60
  • Fantastic answer, thank you. I feel it might be worth it for me to do create a method named Delete which checks for cached before executing `operator delete`, and then do a search and replace. C++20 option looks good, if there's no downside to that (other than having to manually call destructor) then I might have a go at that first. – korri123 Jun 04 '21 at 03:55
  • Followup question: is it bad for me to treat the returned `void*` in `Token::operator new` as `Token*` and set the member cached to `true` there if `operator new` has [an extra parameter cached](https://stackoverflow.com/questions/8685469/c-overload-operator-new-and-provide-additional-arguments) as true? (provided the constructor does not alter the member at all) – korri123 Jun 04 '21 at 14:35
  • @korri123 Yes, that's also bad. `operator new` is called before the object is constructed; non-destroying `operator delete` is called after it has been destroyed. Neither can access any members of the object because the object does not exist. It's a `void*` for a *reason*. – HTNW Jun 04 '21 at 17:51