12

I'm working on a mult-threaded program, but have a UI component that makes extensive use of std::shared_ptr to manage elements. I can guarantee that only one thread will ever use these shared_ptrs.

Is there a way to define a shared_ptr that doesn't incur the overhead of thread safe reference counting?

It could be based on boost::shared_ptr or std::shared_ptr.

EDIT: Thanks for answers mentioning intrusive_ptr. I neglected to mention that I also need weak_ptr functionality so that rules it out.

UPDATE: The answer for me is to use local_shared_ptr from Boost. See comment from 'he rambled'

mpipe3
  • 1,013
  • 1
  • 9
  • 20
  • 3
    is the overhead really that bad? – Anycorn Jul 06 '11 at 08:50
  • 2
    Agreed. The overhead is so minimal, that it shouldn't be the bottleneck. If you're having a bottleneck somewhere, it's not in shared_ptr. –  Jul 06 '11 at 08:51
  • 1
    @inestical: for this case you're probably right, but don't forget that shared_ptr allocates it's refcount on the heap. Doing that a million times might cause serious overhead. For example http://stackoverflow.com/questions/3628081/shared-ptr-horrible-speed – stijn Jul 06 '11 at 09:07
  • @stijn I would presume the allocation of the actual object stored by the shared_ptr would still take longer than allocation of the reference count. But yes, if you do it a million times it becomes an issue. – edA-qa mort-ora-y Jul 06 '11 at 09:12
  • 3
    That's what `make_shared` is for. – Puppy Jul 06 '11 at 09:43
  • 1
    The allocation of the reference count on the heap is necessary to make `shared_ptr` _work_; it has nothing to do with being thread-safe. The idea is that each `shared_ptr` has a pointer to this refcount. Without that, then the pointer objects can't share a reference, and it stops being a `shared_ptr`. The thread-safety in `shared_ptr` comes from the use of atomic types and atomic tests/increment/decrement operations. These are quite fast. – Nicol Bolas Jul 06 '11 at 10:16
  • 1
    @Nicol: "These are quite fast" - on platforms that support such operations natively. On some platforms (such as older ARMs), they can be prohibitively expensive. – Mike Seymour Jul 06 '11 at 10:52
  • There are times where the code could be copying tens of thousands of shared_ptrs so the overhead of the interlockedincrement (or equivalent) could make a difference. – mpipe3 Jul 07 '11 at 16:52
  • 3
    For the record, now boost has `local_shared_ptr` which does not have thread safety guarantee and also works with `weak_ptr`. – he rambled May 14 '18 at 19:56
  • 'he rambled' thanks that's the answer! If you enter that as an answer I'll accept it. – mpipe3 May 16 '18 at 08:09

6 Answers6

5

Andrei Alexandrescu talked about implementing your own single threaded shared pointer class (with some additional optimizations) at the CppCon 2014

See the video here

And the slides here

I really think the standard or boost should supply a template parameter for using atomic ref counting in their shared ptrs though...

onqtam
  • 4,356
  • 2
  • 28
  • 50
3

you could use intrusive_ptr, as it allows you to provide your own reference counting. If that reference counting is a simple increment/decrement of a variable you probably won't get any better performance than that.

stijn
  • 34,664
  • 13
  • 111
  • 163
1

Now adding this as the accepted answer. Boost local_shared_ptr is a single thread reference counted smart pointer that uses non-atomic operations for speed:

https://www.boost.org/doc/libs/1_65_0/libs/smart_ptr/doc/html/smart_ptr.html#local_shared_ptr

mpipe3
  • 1,013
  • 1
  • 9
  • 20
  • local_shared_ptr looks fantastic, except I don't see how this works with weak_ptr? It seems that creating a week_ptr around a local_shared_ptr and then calling lock() will return an actual shared_ptr, not a local_shared_ptr... and therefore presumably negating any benefit of the local_shared_ptr if most uses are weak_ptrs. What am I missing here? – Duncan May 28 '19 at 14:06
  • Good point! In my code the weak pointer is only used occasionally and the shared_ptr from lock() is only used within the scope of one function. As you point out if your code usages weak_ptr extensively and then retains the result of lock() then the benefits are lost. Hopefully, they'll add a local_weak_ptr. – mpipe3 May 29 '19 at 07:29
1

I have code where the overhead of copying shared_ptr has become an issue, and have used alternative techniques at that point. Let me first qualify that other comments are correct that the overhead of a shared_ptr is very low. I profiled this to actually find one of my trouble points. On my AMD64 Phenom calling a function that copies the shared_ptr takes about 12ns versus calling the same function with a normal pointer at around 1ns.

With those numbers in mind it is hard to imagine you'll get any kind of "safe" variant between a raw pointer and the shared_ptr. So what I do in this cases I either pass the actual pointer or a const & to the shared_ptr. Usually I put a mutex lock on the whole section of code so that I'm guaranteed to maintain the shared_ptr for the entire duration. You could hand-roll a single-threaded reference count, but what would be the point if you know it isn't being shared?

But consider the timings very carefully. Unless you are copying the shared_ptr thousands, or even tens of thousands, of times per second you won't notice the overhead of the shared_ptr.

In the GUI code on the same project I always use a shared_ptr, only the server code avoids it in a few key areas. There are just so many other things in the GUI that slow it down: avoiding shared_ptr would make no appreciable difference.

edA-qa mort-ora-y
  • 30,295
  • 39
  • 137
  • 267
  • Yes, I do have a situation where thousands of shared_ptrs could get copied hence in the interest in removing thread safety – mpipe3 Sep 14 '12 at 10:30
  • "You could hand-roll a single-threaded reference count, but what would be the point if you know it isn't being shared?" Presumably it *is* being shared, just not by multiple threads incrementing/decrementing concurrently. Right, there wouldn't be any point in this question otherwise. – Don Hatch Nov 07 '22 at 23:18
1

I suggest going with the Boost intrusive smart pointer.

There is also an implementation from Scott Meyer (here: http://www.aristeia.com/BookErrata/M29Source.html) as published in 'More Effective C++'

However, in case it helps, I yanked a simple refcounting pointer (with some support for polymorphic assignments and custom deletors). This one is decided unthread-aware.

Note: I misremembered that. The polymorphic asignments were in a variation for antoher project. I have that too but it doesn't support the custom deletor :) Let me know if anyone's interested; Of course it comes with separate unit tests for the feature

It comes with Unit tests (e.g. to check for the famous remove linked list node ordering bug). So you know what you get :)

/*
 * counted_ptr - simple reference counted pointer.
 *
 * The is a non-intrusive implementation that allocates an additional
 * int and pointer for every counted object.
 */
#ifndef COUNTED_PTR_H
#define COUNTED_PTR_H

#include <stdlib.h>

extern "C" bool mtx_unit_test_countedptr();

namespace MtxChess {

/* For ANSI-challenged compilers, you may want to #define
 * NO_MEMBER_TEMPLATES or explicit */

template <class X>
    struct FreeMallocPolicy
    {
        static void do_free(X* p) { if (p) ::free(p); p = 0; }
    };

template <class X>
    struct ScalarDeletePolicy
    {
        static void do_free(X* p) { if (p) delete p; p = 0; }
    };

template <class X>
    struct ArrayDeletePolicy
    {
        static void do_free(X* p) { if (p) delete[] p; p = 0; }
    };

template <class X,class _P=ScalarDeletePolicy<X> > class counted_ptr
{
public:
    typedef X element_type;

    explicit counted_ptr(X* p = 0) // allocate a new counter
        : itsCounter(0) {if (p) itsCounter = new counter(p);}
    ~counted_ptr()
        {release();}
    counted_ptr(const counted_ptr& r) throw()
        {acquire(r.itsCounter);}
    operator bool() const { return 0!=get(); }
    void clear() { (*this) = counted_ptr<X>(0); }
    counted_ptr& operator=(const counted_ptr& r)
    {
        if (this != &r) {
            auto_release keep(itsCounter);
            acquire(r.itsCounter);
        }
        return *this;
    }
    bool operator<(const counted_ptr& r) const
    {
        return get()<r.get();
    }
    bool operator==(const counted_ptr& r) const
    {
        return get()==r.get();
    }
    bool operator!=(const counted_ptr& r) const
    {
        return get()!=r.get();
    }

#ifndef NO_MEMBER_TEMPLATES
//  template <class Y> friend class counted_ptr<Y>;
    template <class Y> counted_ptr(const counted_ptr<Y>& r) throw()
        {acquire(r.itsCounter);}
    template <class Y> counted_ptr& operator=(const counted_ptr<Y>& r)
    {
        if (this != &r) {
            auto_release keep(itsCounter);
            acquire(r.itsCounter);
        }
        return *this;
    }
    template <class Y> bool operator<(const counted_ptr<Y>& r) const
    {
        return get()<r.get();
    }
    template <class Y> bool operator==(const counted_ptr<Y>& r) const
    {
        return get()==r.get();
    }
    template <class Y> bool operator!=(const counted_ptr<Y>& r) const
    {
        return get()!=r.get();
    }
#endif // NO_MEMBER_TEMPLATES

    X& operator*()  const throw()   {return *itsCounter->ptr;}
    X* operator->() const throw()   {return itsCounter->ptr;}
    X* get()        const throw()   {return itsCounter ? itsCounter->ptr : 0;}
    bool unique()   const throw()
        {return (itsCounter ? itsCounter->count == 1 : true);}

private:
    struct counter {
        counter(X* p = 0, unsigned c = 1) : ptr(p), count(c) {}
        X*          ptr;
        unsigned    count;
    }* itsCounter;

    void acquire(counter* c) throw()
    {   
        // increment the count
        itsCounter = c;
        if (c) ++c->count;
    }

    void release()
    {   
        dorelease(itsCounter);
    }

    struct auto_release
    {
        auto_release(counter* c) : _c(c) {}
       ~auto_release() { dorelease(_c); }
        counter* _c;
    };

    void static dorelease(counter* itsCounter)
    {
        // decrement the count, delete if it is 0
        if (itsCounter) {
            if (--itsCounter->count == 0) {
                _P::do_free(itsCounter->ptr);
                delete itsCounter;
            }
            itsCounter = 0;
        }
    }
};

} // EON

#endif // COUNTED_PTR_H

Unit tests (compiles as standalone)

/*
 * counted_ptr (cpp) - simple reference counted pointer.
 *
 * The is a non-intrusive implementation that allocates an additional
 * int and pointer for every counted object.
 */

#include "counted_ptr.hpp"
#include "internal.hpp"
#include <map>
#include <string>

namespace MtxChess { 

    namespace /*anon*/
    {
        // sensed events
        typedef std::map<std::string, int> Events;
        static Events constructions, destructions;

        struct Trackable
        {
            Trackable(const std::string& id) : _id(id) {    constructions[_id]++; }
            ~Trackable()                               {    destructions[_id]++; }
            const std::string _id;
        };

        typedef counted_ptr<Trackable> target_t;

        bool testBehaviour()
        {   
            static const counted_ptr<Trackable> Nil = target_t(0);
            bool ok = true;

            constructions.clear();
            destructions.clear();

            MTXASSERT_EQ(ok, 0ul,  constructions.size());
            MTXASSERT_EQ(ok, 0ul,  destructions.size());

            target_t a = target_t(new Trackable("aap"));

            MTXASSERT_EQ(ok, 1ul,  constructions.size());
            MTXASSERT_EQ(ok, 1,    constructions["aap"]);
            MTXASSERT_EQ(ok, 0ul,  destructions.size());

            MTXASSERT_EQ(ok, 0,    constructions["noot"]);
            MTXASSERT_EQ(ok, 2ul,  constructions.size());
            MTXASSERT_EQ(ok, 0ul,  destructions.size());

            target_t hold;
            {
                target_t b = target_t(new Trackable("noot")),
                         c = target_t(new Trackable("mies")),
                         nil = Nil,
                         a2 = a;

                MTXASSERT(ok, a2==a);
                MTXASSERT(ok, nil!=a);

                MTXASSERT_EQ(ok, 3ul,  constructions.size());
                MTXASSERT_EQ(ok, 1,    constructions["aap"]);
                MTXASSERT_EQ(ok, 1,    constructions["noot"]);
                MTXASSERT_EQ(ok, 1,    constructions["mies"]);
                MTXASSERT_EQ(ok, 0,    constructions["broer"]);
                MTXASSERT_EQ(ok, 4ul,  constructions.size());

                MTXASSERT_EQ(ok, 0ul,  destructions.size());

                hold = b;
            }

            MTXASSERT_EQ(ok, 1ul,  destructions.size());
            MTXASSERT_EQ(ok, 0,    destructions["aap"]);
            MTXASSERT_EQ(ok, 0,    destructions["noot"]);
            MTXASSERT_EQ(ok, 1,    destructions["mies"]);
            MTXASSERT_EQ(ok, 3ul,  destructions.size());

            hold = Nil;
            MTXASSERT_EQ(ok, 3ul,  destructions.size());
            MTXASSERT_EQ(ok, 0,    destructions["aap"]);
            MTXASSERT_EQ(ok, 1,    destructions["noot"]);
            MTXASSERT_EQ(ok, 1,    destructions["mies"]);
            MTXASSERT_EQ(ok, 4ul,  constructions.size());

            // ok, enuf for now
            return ok;
        }

        struct Linked : Trackable
        {
            Linked(const std::string&t):Trackable(t){}
            counted_ptr<Linked> next;
        };

        bool testLinked()
        {   
            bool ok = true;

            constructions.clear();
            destructions.clear();
            MTXASSERT_EQ(ok, 0ul,  constructions.size());
            MTXASSERT_EQ(ok, 0ul,  destructions.size());

            counted_ptr<Linked> node(new Linked("parent"));
            MTXASSERT(ok, node.get()); 
            node->next = counted_ptr<Linked>(new Linked("child"));

            MTXASSERT_EQ(ok, 2ul,  constructions.size());
            MTXASSERT_EQ(ok, 0ul,  destructions.size());

            node = node->next;
            MTXASSERT(ok, node.get()); 

            MTXASSERT_EQ(ok, 2ul,  constructions.size());
            MTXASSERT_EQ(ok, 1ul,  destructions.size());

            node = node->next;
            MTXASSERT(ok,!node.get()); 

            MTXASSERT_EQ(ok, 2ul,  constructions.size());
            MTXASSERT_EQ(ok, 2ul,  destructions.size());

            return ok;
        }

    }

} // EON

int main()
{   
    using namespace MtxChess;

    bool ok = true;
    ok = testBehaviour() && ok;
    ok = testLinked() && ok;

    return ok?0:1;
}
sehe
  • 374,641
  • 47
  • 450
  • 633
-1

Boost provides a macro you can define that will not use thread-safe reference counting.

Puppy
  • 144,682
  • 38
  • 256
  • 465
  • 2
    Isn't that a build time thing? i.e. build the boost library with non thread safe ref counting. I need thread safe ref counting elsewhere in the same code base. – mpipe3 Jul 07 '11 at 16:54