
Short introduction: I am working on multithreaded code and I have to share dynamically allocated objects between two threads. To make my code cleaner (and less error-prone) I want each thread to explicitly "delete" the objects it is done with, and that's why I want to use shared_ptr.

First question:

I want to know whether the implementation of the -> operator in shared_ptr has some extra overhead at run time (e.g. larger than unique_ptr). The objects I am talking about are usually long-lived instances copied only once after creation (when I distribute them between threads); after that I only access these objects' methods and fields.

I am aware that shared_ptr only protects the reference counting.
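For illustration, the usage pattern is roughly this (a minimal sketch; the Config type and the lambdas are just placeholders):

```cpp
#include <memory>
#include <thread>

struct Config { int value = 0; };   // placeholder for a long-lived shared object

int main()
{
    auto cfg = std::make_shared<Config>();   // created once

    // Each thread gets its own copy of the shared_ptr when the work is
    // distributed; afterwards only methods/fields of *cfg are accessed.
    std::thread t1([cfg] { (void)cfg->value; });
    std::thread t2([cfg] { (void)cfg->value; });

    t1.join();
    t2.join();
}   // whichever copy is destroyed last deletes the Config
```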

Second question:

How well is shared_ptr optimized in libstdc++? Does it always use a mutex, or does it take advantage of atomic operations (I am focusing on x86 and ARM platforms)?

Goofy
  • In a good implementation of `shared_ptr`, there should be zero overhead when dereferencing the pointer via `->`. I am not familiar with libstdc++, so I cannot answer your second question. You have the headers, though, so you can easily find out by taking a look at how it's implemented. – James McNellis Jun 05 '12 at 17:49
  • If the code is multithreaded, GCC's shared pointer uses an `std::atomic` or something like that for the reference counter; whether that's a true hardware (lockfree) atomic depends on the compiler version -- I believe this was improved in GCC 4.7.0. – Kerrek SB Jun 05 '12 at 17:52
  • Copy/assignment/going out of scope has extra overhead because of the threadsafe increment of the refcount. `operator->` looks exactly the same as the one of good old `auto_ptr`, i.e. can be expected to be zero overhead. – Damon Jun 05 '12 at 17:55
  • This question in its current form is too broad to be answered. There are many implementations of `shared_ptr`, and there are many versions of GCC and libstdc++. Which one are you talking about? – Nicol Bolas Jun 05 '12 at 18:31

2 Answers

14

First question: using operator->

All the implementations I have seen keep a local copy of the T* right in the shared_ptr<T> object, so the pointer is immediately at hand; operator-> thus has a cost comparable to using a stack-local T*: no overhead at all.
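Roughly, the layout looks like this (a simplified sketch of a typical implementation, not the actual libstdc++ source):

```cpp
// Simplified sketch of the usual two-pointer layout of a shared_ptr;
// not the actual libstdc++ code.
template <typename T>
class toy_shared_ptr {
    T*    ptr_  = nullptr;   // cached raw pointer, used for every access
    void* ctrl_ = nullptr;   // control block holding the reference counts
                             // (only touched on copy/assign/destroy)
public:
    // operator-> simply returns the cached raw pointer, so dereferencing
    // costs exactly the same as using a plain T*.
    T* operator->() const { return ptr_; }
    T& operator*()  const { return *ptr_; }
};
```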

Second question: mutex/atomics

I expect libstdc++ to use atomics on the x86 platform, whether through the standard facilities or g++-specific intrinsics (in older versions). I believe the Boost implementation already did so.

I cannot, however, comment on ARM.
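For illustration, thread-safe reference counting generally boils down to something like the following (a generic sketch of the technique, not libstdc++'s actual code; the memory orderings shown are the commonly used ones):

```cpp
#include <atomic>

// Generic sketch of thread-safe reference counting as used by shared_ptr
// implementations. Only the count itself needs to be atomic; the pointee
// is not protected in any way (as noted in the question).
struct control_block {
    std::atomic<long> use_count{1};

    void add_ref() {
        // A new owner only needs the increment itself to be atomic.
        use_count.fetch_add(1, std::memory_order_relaxed);
    }

    bool release() {
        // The last owner must see all writes made by the other owners
        // before it destroys the object, hence acquire/release.
        return use_count.fetch_sub(1, std::memory_order_acq_rel) == 1;
        // returns true when the caller was the last owner
    }
};
```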

Note: since C++11 introduces move semantics, many copies are naturally avoided in typical usage of shared_ptr.

Note: read about the correct usage of shared_ptr here; you can pass references to shared_ptr (const or not) to avoid most of the copies/destructions in general, so their performance is not too important.
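To make both notes concrete, a small sketch (the Widget type and the function names are made up for illustration):

```cpp
#include <memory>
#include <utility>

struct Widget { int value = 0; };

// Taking the shared_ptr by const reference: no reference-count traffic at all.
int read_value(const std::shared_ptr<Widget>& w) {
    return w->value;                 // operator-> is as cheap as a raw pointer
}

// Taking it by value only when the callee really keeps a copy.
std::shared_ptr<Widget> keep(std::shared_ptr<Widget> w) {
    return w;                        // moved out, no extra increment
}

int main() {
    auto w = std::make_shared<Widget>();
    read_value(w);                   // no copy, no atomic increment
    auto kept = keep(std::move(w));  // ownership transferred by moving:
                                     // the count never changes
}
```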

Matthieu M.
  • In the answer you attached it is said to use `make_shared`. I wonder how to use this template in a constructor's initialization list? For example, class `Foo` has a field `shared_ptr<int> num`, so should the constructor look like this: `Foo::Foo(void) : num (move (make_shared (new int (30)))) { ... }`? – Goofy Jun 06 '12 at 08:35
  • @Goofy: no, with `make_shared` you do not perform the `new` explicitly, but on the other hand you need to pass the type of the created object explicitly; also, the call to `move` is unnecessary on a temporary. Therefore it yields: `Foo::Foo(): num(std::make_shared<int>(30)) {}` – Matthieu M. Jun 06 '12 at 08:41
  • Ok, great :) I am still getting used to rvalues in C++ ;) – Goofy Jun 06 '12 at 08:46
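Putting that exchange together, a minimal compilable version of the class could look like this (only `Foo` and `num` come from the comments above; the rest is assumed for illustration):

```cpp
#include <memory>

class Foo {
    std::shared_ptr<int> num;

public:
    // make_shared allocates the int and its control block in one step;
    // no explicit new is needed, and no std::move either, because the
    // temporary returned by make_shared binds directly to the member.
    Foo() : num(std::make_shared<int>(30)) {}

    int value() const { return *num; }   // hypothetical accessor, for illustration only
};
```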
13

GCC's shared_ptr will use no locking or atomics in single-threaded code. In multi-threaded code it will use atomic operations if an atomic compare-and-swap instruction is supported by the CPU; otherwise the reference counts are protected by a mutex. On i486 and later it uses atomics; i386 doesn't support cmpxchg, so a mutex-based implementation is used there. I believe ARM uses atomics for the ARMv7 architecture and later.

(The same applies to both std::shared_ptr and std::tr1::shared_ptr.)
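As a rough illustration of the guarantee this gives: two threads can freely make and drop their own copies of a shared_ptr to the same object, and the reference count stays consistent (a minimal sketch; the Payload type and the iteration count are made up):

```cpp
#include <cassert>
#include <memory>
#include <thread>

struct Payload { int data = 42; };   // stand-in for the shared object

int main()
{
    auto p = std::make_shared<Payload>();

    // Each thread repeatedly makes and drops its own copy of the pointer.
    // The reference-count updates are atomic (or mutex-protected on CPUs
    // without a suitable CAS instruction), so this is safe. The pointee
    // itself would still need its own synchronization if it were modified.
    auto hammer = [&p] {
        for (int i = 0; i < 100000; ++i) {
            std::shared_ptr<Payload> local = p;   // atomic increment
            assert(local->data == 42);
        }                                         // atomic decrement
    };

    std::thread t1(hammer), t2(hammer);
    t1.join();
    t2.join();

    assert(p.use_count() == 1);   // all temporary copies are gone
}
```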

Jonathan Wakely
  • How does GCC know whether the code is/will be multi-threaded or not? – Drew Noakes Nov 17 '13 at 11:34
  • @DrewNoakes - you have to tell it with a #define. – Martin James Nov 17 '13 at 12:02
  • Do you have a reference for this? I've done a little searching and can't find one. – Drew Noakes Nov 17 '13 at 12:32
  • @Jonathan Wakely is one of the implementers of libstdc++. Though, I would also find it nice if you could elaborate a bit more, like what is the define, and what is the default? – Stephan Dollberg Nov 17 '13 at 12:33
  • [This question](http://stackoverflow.com/q/15129263/24874) references `BOOST_SP_DISABLE_THREADS`, but I'm wondering if there's a version for `std::shared_ptr` as well. – Drew Noakes Nov 17 '13 at 12:34
  • There is no `#define`. Libstdc++ uses functions which dispatch (at run-time) to an atomic implementation or a non-atomic implementation based on whether the program is linked to `libpthread` or not. If it is linked to `libpthread` then it's assumed that multiple threads exist and the atomic impl is used. If not linked to `libpthread` then the program is single-threaded and so the non-atomic impl is used. – Jonathan Wakely Nov 18 '13 at 11:11
  • There is some brief documentation at http://gcc.gnu.org/onlinedocs/libstdc++/manual/memory.html#std.util.memory.shared_ptr in the "selecting lock policy" section – Jonathan Wakely Nov 18 '13 at 11:15