11

I have a collection of Creature objects that are created and owned in one part of my application using std::make_shared and std::shared_ptr.

I also keep track of a selection of zero or one Creature in a World object using a std::weak_ptr<Creature>.

void World::SetSelection(const std::shared_ptr<Creature>& creature) {
    selection = creature;
}

std::shared_ptr<Creature> World::GetSelection() const {
    return selection.lock();
}

The caller of GetSelection is responsible for checking if the pointer is empty. If it is, that means there is currently no selection.

This all works perfectly to my liking: when the selected Creature dies of natural causes (elsewhere in the application), GetSelection starts returning a nullptr again as if nothing was ever selected.

However in that case the World::selection member still points to the std::shared_ptr's control block. This can be quite big because I use std::make_shared to create my Creature objects (I realize that the Creature object was correctly destroyed at the right time but the memory for it is still allocated). I'm considering changing GetSelection to this:

std::shared_ptr<Creature> World::GetSelection() {
    const auto ret = selection.lock();
    if (!ret)
        selection.reset();

    return ret;
}

This frees up the memory as soon as I notice it's not needed anymore. Annoyingly, this version of GetSelection cannot be const.

My questions:

  1. Which version of GetSelection would be considered best practice in this situation?

  2. Does the answer change if something similar happens in templated code, where sizeof(T) is unknown and could be huge? Or in C++14 where std::make_shared<T[]> could be involved?

  3. If the second version is always best, what is the rationale for std::weak_ptr<T>::expired and lock not doing it themselves?

slajoie
  • 440
  • 8
  • 20
  • I will post my own (not completely satisfactory) answer in a little while but I would really like to see what others have to say. – slajoie Aug 12 '14 at 22:08
  • 1
    You can always make the `weak_ptr` member `mutable` in the second case if you want `GetSelection` to remain `const`. – T.C. Aug 12 '14 at 22:49
  • I don't quite understand your writing. You're saying that after `selection.lock()` has returned an empty `std::shared_ptr`, somehow the previously managed object is *not* deallocated? – Felix Glas Aug 12 '14 at 22:58
  • @T.C. Correct, but then I would say "Annoyingly, this version requires my selection member to be mutable". The point is that I'm not actually changing the observable state of the `weak_ptr` object at all, I'm just trying for a little optimization that arguably could've been done inside `weak_ptr`. This ties directly into my third question (which I think I know the answer to, but I'm hoping for better informed opinions than mine). – slajoie Aug 12 '14 at 23:00
  • @Snps I'm saying the _memory_ for it is not deallocated; it is well destroyed though. This is because of an optimization in `make_shared`. See the notes section of [http://en.cppreference.com/w/cpp/memory/shared_ptr/make_shared]. – slajoie Aug 12 '14 at 23:06
  • @slajoie Ok, I had not realized this before. – Felix Glas Aug 12 '14 at 23:47
  • 2
    Related info: [Do std::weak_ptrs affect when the memory allocated by std::make_shared is deallocated?](http://stackoverflow.com/q/13535827/873025) and [How does weak_ptr work?](http://stackoverflow.com/a/5671308/873025) – Felix Glas Aug 12 '14 at 23:48
  • @Snps Thanks for the links. I thought I had read all the questions on the subject but I guess I relied on tags too much. – slajoie Aug 13 '14 at 02:39

2 Answers2

4

It should be noted, first off, that the emplacement strategy of std::make_shared is optional, i.e. the standard does not mandate that implementations perform this optimization. It's a non-binding requirement, which means that perfectly conforming implementations may elect to forego it.

To answer your questions:

  1. Given that you seem to have only one selection (and you're not therefore bloating your memory use by keeping many of these control blocks around), I'd argue for keeping it simple. Is memory a bottleneck? This screams micro-optimization to me. You should write the simpler code, where you can apply const, and then go back and optimize later if the need arises.

  2. The answer doesn't unconditionally change, it changes conditional upon the problem domain and what your bottleneck is. If you're allocating one object that's "huge" (say a hundred kilobytes), and the space for that object is kicking around in a relatively unused control block until replaced, that probably isn't your bottleneck, and probably isn't worth writing more code (which is intrinsically more error prone, difficult to maintain, and decipher) to "solve".

  3. As std::weak_ptr::lock and std::weak_ptr::expired are const, under the interpretation of const for C++11, they must be thread safe. Therefore, given some std::weak_ptr, it must be safe to call any combination of lock() and expired() concurrently. Under the hood, the std::weak_ptr stores a pointer to the control block, which it looks through to examine/increment/etc. atomic counters to determine if the object has expired, or to see if it can acquire the lock. If you wanted to implement your optimization internal to std::weak_ptr, you'd have to somehow inspect the control block's state, and then atomically remove the pointer to the control block if the pointer has expired. This would cause overhead (even if this could be done simply with atomics, it would still have overhead) on every access to the std::weak_ptr, all for the sake of a small optimization.

  • I included the code of my current situation because that's what got me thinking about this and to try to make the question as clear as possible. I agree with your answer to my first question (and I think it will remain valid even when I implement multi-selection and have potentially thousands of those "wasted" blocks) but I think the other 2 questions are a lot more interesting, and that's where I disagree (see further comments). – slajoie Aug 13 '14 at 02:13
  • Your answer to my second question doesn't really add anything to what was already in the question. My point was that if I'm writing template code that has no idea about the problem domain or where the bottlenecks may be when the template is instantiated, shouldn't I be erring on the side of prudence even if that means writing a bit more difficult code? The case of `make_shared` in particular seems fairly strong to me. – slajoie Aug 13 '14 at 02:19
  • As for the third answer: yes, `lock` and `expired` have to be thread-safe. However a completely empty `weak_ptr` is faster to lock (or check for expiration or destroy) than one that still points to a control block so I don't see where you get extra overhead (in fact it would be faster overall). The first call to `lock` after the shared object is destroyed would pay the `~weak_ptr` cost up front but after that every call would be less expensive (including destruction where you'd "reclaim" all the cost). This is an additional optimization on top of getting the block of memory freed earlier. – slajoie Aug 13 '14 at 02:34
  • @slajoie: In writing a template that does what you're asking about, calling `reset` wouldn't be erring on the side of prudence. You'd be writing code with an extra branch (slower), which is no longer `const` (i.e. callable from fewer contexts), **and** which gives up thread safety, for what? The possibility that in some, very specific and unlikely scenario you don't run into a memory use bottleneck? The usefulness of constness and thread safety will be far more important to a generic implementation than eager freeing in the vast majority of cases. – Robert Allan Hennigan Leahy Aug 13 '14 at 05:00
  • @slajoie: You seem to be missing the fact that the overhead for implementing the mechanism you propose in `lock` and `expired` wouldn't just be paid when you call `lock` or `expired` and the pointer has actually expired. It would be paid every time you interact with the `std::weak_ptr`, whether the underlying pointer has expired or not. – Robert Allan Hennigan Leahy Aug 13 '14 at 05:01
  • I did not intend to give up thread safety or constness. My problem is that I thought the idea could be implemented inside `weak_ptr` without extra synchronization because it already has some in there that it must use to do its current job, so why not do one more thing once per `weak_ptr` lifetime? But now I see what you meant all along: the existing thread-safe code only allows writing to the control block but it does *not* cover the pointer to it. Thanks for explaining (twice)! I posted my (modified) answer anyway because it has some extras people might find useful. – slajoie Aug 13 '14 at 08:13
1
  1. The first version of GetSelection is better for the vast majority of cases. This version can be const and doesn't need extra synchronization code to be thread-safe.

  2. In generic library code where the exact usage pattern cannot be predicted in advance, the first version is still preferred. However in a situation where synchronization code is already in place protecting access to the weak_ptr, it cannot hurt to slip in a call to reset to free memory and make later uses of the pointer faster. This very small optimization is not in itself worthy of putting in that synchronization code.

  3. Given the first two answers, this last question is moot. However here are two strong rationales for not having weak_ptr::lock automatically reset the pointer when it is found to be expired:

    • With that behavior it would be impossible to implement weak_ptr::owner_before, and thus use a weak_ptr as the key type in an associative container.

    • Also, even ordinary usage of weak_ptr::lock on a live object could not be implemented without extra synchronization code. This would cause a performance penalty far greater than the minor gain that could be expected from freeing memory more eagerly.

Alternative solution:
If the wasted memory is deemed to be an actual problem that needs to be solved (perhaps the shared objects are truly big and/or the target platform has very limited memory), another option is to create the shared objects with shared_ptr<T>(new T) instead of make_shared<T>. This will free the memory allocated for T even earlier (when the last shared_ptr pointing to it is destroyed) while the small control block lives separately.

slajoie
  • 440
  • 8
  • 20