2

I read the following note in Chapter 7 of the book C++ Concurrency in Action (Page 194). My question is, is there any allocator in the standard library, for which what we do in Hazard Pointers is well defined?

Using hazard pointers like this relies on the fact that it's safe to use the value of a pointer after the object it references has been deleted. This is technically undefined behavior if you are using the default implementation of new and delete, so either you need to ensure that your implementation permits it, or you need to use a custom allocator that permits such usage.

Here is a relevant post to the undefined/unspecified behavior I am talking about here. The piece of code the above note is referring to is the following:

std::shared_ptr<T> pop() {
  std::atomic<void*>& hp=get_hazard_pointer_for_current_thread();
  node* old_head=head.load();
  node* temp;
  do {
    temp=old_head;         // Here we are reading (not dereferencing) a pointer that might have been deleted already
    hp.store(old_head);    // And here
    old_head=head.load();
  } while(old_head!=temp); // And here
  // ...
}

In fact, I would say the same problem exists when Reference Counting is used instead of Hazard Pointers (the book does not make such a claim). The following is how one pushes an element into a stack using reference counting (Page 208). The whole purpose of the while-loop in this code is to make sure value of new_node.ptr->next is valid, and that involves reading a pointer that might have been already deleted.

void push(T const& data) {
  counted_node_ptr new_node;
  new_node.ptr=new node(data);
  new_node.external_count=1;
  new_node.ptr->next=head.load(std::memory_order_relaxed)
  while(!head.compare_exchange_weak(
                new_node.ptr->next,
                new_node,
                std::memory_order_release,
                std::memory_order_relaxed)); }
Koosha
  • 1,492
  • 7
  • 19

1 Answers1

3

This is the so called "memory reclamation problem" - an object must not be deleted as long as some thread may still hold a reference to it and thus might still access it. If a thread wants to "release" some object, it usually does not call delete directly but some retire function that stores the pointer and defers reclamation until later when it is safe (i.e., in case of hazard pointers until no thread holds a HP to this object). Since scanning the hazard pointers of all threads is rather expensive this is only done once the number of retired objects reaches some threshold. This way it is absolutely possible to implement hazard pointers with well defined behavior.

The code of your pop operation is missing some important details - you are about to replace head, but what do you do with the old value? How do you reclaim it? Do you simply call delete?

Arch D. Robison proposed N3712 : Policy-Based Design for Safe Destructionin Concurrent Containers as a generic interface for memory reclamation schemes for the C++ standard. The general idea is that you have the concept of a guard_ptr that holds a safe reference to the object (similar to a share_ptr). As long as any thread holds such a guard_ptr to an object, this object cannot be deleted, and you may only access the object after acquiring such a guard_ptr. If you want to reclaim an object you simply call guard_ptr::retire. Then the object will eventually be deleted by the underlying implementation once it is safe.

Reference counting is a whole different story, because it is not possible to load the pointer and increase the ref counter in a single atomic operation. That's why with lock-free reference counting it is never possible to return a retired object to the memory manager, because the ref counter has to be available indefinitely. Instead, the nodes are usually kept in a free-list and reused.

Regarding your push operation that uses reference counting - this is perfectly safe. The while loop keeps trying to update head with new_node, where new_node.ptr->next is the expected value. If the CAS fails, the current value is stored in new_node.ptr->next, but this is fine because since the CAS has failed new_node is not visible, so no other thread can access it. Where do you think we are reading a pointer that might have been already deleted.

If you are interested in this topic I can refer you to my thesis Effective memory reclamation for lock-free data structures in C++. It not only provides a general discussion about a large number of reclamation schemes, but also covers in great detail my own implementation of various reclamation schemes based on an adapted version of the previously mentioned interface proposed by Arch D. Robison.

Based on my work for this thesis I started an open source library that provides implementations of various reclamation schemes (including hazard pointers and lock-free reference counting) and concurrent data structures: xenium

Update
Regarding your statement that "reading the value of (and not dereferencing) an already deallocated pointer is in general undefined" - I don't have that book, so I am not sure what the author refers to. And I don't know why the authro in the referenced SO answer states that '"Using" includes "copying the value of"'. The C++17 standard says the following about invalid pointer values:

Indirection through an invalid pointer value and passing an invalid pointer value to a deallocation function have undefined behavior. Any other use of an invalid pointer value has implementation-defined behavior.

And as a footnote:

Some implementations might define that copying an invalid pointer value causes a system-generated runtime fault.

The reason for the possibility of a potential runtime fault is that some architectures in the past used dedicated address registers for pointer loads and stores and they could fault if, for example, a segment number in a pointer was not currently mapped. However, I am not aware of any current architecture or compiler on which reading an invalid pointer value would cause such a fault.

The pointer_safety concept was added in C++11 as part of the garbage collector interface (see the N2670: Minimal Support for Garbage Collection and Reachability-Based Leak Detection proposal). The book "The C++ Programming Language" (4th edition) by Bjarne Stroustrup says the following about the pointer_safety values:

  • relaxed: Safely-derived and not safely-derived pointers are treated equivalent. Collect every object that does not have a safely derived or traceable pointer to it.
  • preferred: Like relaxed, but a garbage collector may be running as a leak detector and/or a detector of dereferences of "bad pointers".
  • strict: Safely-derived and not safely-derived pointers may be treated differently; that is, a garbage collector may be running and will ignore pointers that are not safely derived.

So the whole pointer_safety is only relevant when you use a garbage collector.

mpoeter
  • 2,574
  • 1
  • 5
  • 12
  • The question does not ask what Safe Memory Reclamation (SMR) is about. It asks about a **technical difficulty** in a typical C++ implementation of Hazard Pointers which is a method that is used for SMR, so is Reference Counting. – Koosha May 08 '20 at 11:35
  • I know, that's why I tried to explain how this can be done without running into undefined behavior, but apparently I did a rather poor job. Let me try to rephrase my answer... – mpoeter May 08 '20 at 12:28
  • I have updated my answer to provide more details. Let me know if it still doesn't answer your question, and if so, perhaps try to explain what you are missing/looking for. – mpoeter May 08 '20 at 13:03
  • Thanks for the response. Unfortunately, it is far from the answer I am looking for. My question is not about what SMR is, nor it is about how HP or RC is implemented. And it is not about a code that I wrote or found somewhere, and now I am trying to understand/verify it. The question is also not about how do I know if it is safe to dereference a pointer or delete one. – Koosha May 08 '20 at 13:20
  • The question says according to the given quote **reading the value of (and not dereferencing)** an already deallocated pointer is in general undefined. However, reading a pointer value (and not necessarily dereferencing it) is required in HP, and from what I see, it is also required in RC. So is there any allocator class shipped with the compiler (eg. clang or gcc), in which it is OK to read the value of those pointers (without dereferencing them)? – Koosha May 08 '20 at 13:20
  • Reading a pointer that _points_ to an already deleted object is perfectly fine and definitely not undefined behavior. Only _dereferencing_ such a pointer is UB. The problem with RC is that you have to do exactly that - the ref counter is embedded in the object, so you have to dereference the pointer to increase the ref count. That's why LFRC is a bit special (also in my library). But with regards to the other reclamation schemes you can use any allocator you want - you just have to take care that the objects are not prematurely deleted. – mpoeter May 08 '20 at 13:31
  • As for "the ref counter is embedded in the object", it most definitely does not have to be like that, and indeed it is not in the book I cited in this post. As for "Reading a pointer that points to an already deleted object is perfectly fine", it is also not true. At least according the what I quoted up there, both from the book and from another post in stack-overflow. You can also find this at "6.7.4.3 Safely-derived pointers [basic.stc.dynamic.safety]" from the C++ Standard (ISO/IEC 2017 - Page 73 - Paragraph 4). – Koosha May 08 '20 at 13:38
  • You are missing the point - it is perfectly safe to read a pointer to an already deleted object, it is just that _you cannot do anything with it_. So you have to _recognize_ this scenario and perform a retry to load the updated pointer until you have a _safe_ reference (i.e., to an object that has not yet been deleted) - that is what I meant by "acquiring a guard_ptr". – mpoeter May 08 '20 at 13:46
  • "it is perfectly safe to read a pointer to an already deleted object", why do you think that is the case? I cited three different sources saying otherwise, and one of those is the **C++ Standard**. – Koosha May 08 '20 at 13:49
  • I guess the answer is `std::get_pointer_safety()`. If it returns `std::pointer_safety::strict`, then reading the value of an already deleted pointer is undefined. Otherwise, it is well-defined. I don't have a good understanding of `std::pointer_safety`, and I am not sure if my guess is correct. – Koosha May 08 '20 at 13:56
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/213426/discussion-between-mpoeter-and-koosha). – mpoeter May 08 '20 at 15:22
  • @Koosha I have updated my answer once more - I hope it now contains all the relevant information you were looking for. – mpoeter May 10 '20 at 11:47
  • Thanks for the answer. It makes sense. By the way, the 2nd edition of the book is available online. Manning published it itself at https://www.manning.com/books/c-plus-plus-concurrency-in-action-second-edition (see Section 7.2.3 for my quote). – Koosha May 10 '20 at 22:18
  • Thanks for sharing your thesis, it is most informative, +1. – Maxim Egorushkin Jun 03 '20 at 23:24
  • @MaximEgorushkin Thanks! Let me know if you have any questions about it. :) – mpoeter Jun 04 '20 at 13:06