3

(Assume 64-bit x86-64 architecture and Intel 3rd/4th generation CPU)

Here is a lock-free implementation for a stack from Concurrency in Action book, page 202:

template<typename T>
class lock_free_stack
{
private:
    struct node;

    struct counted_node_ptr
    {
        int external_count;
        node* ptr;
    };

    struct node
    {
        std::shared_ptr<T> data;
        std::atomic<int> internal_count;
        counted_node_ptr next;

        node(T const& data_):data(std::make_shared<T>(data_)),internal_count(0){}
    };

    std::atomic<counted_node_ptr> head;

public:
    ~lock_free_stack()
    {
        while(pop());
    }

    void push(T const& data)
    {
        counted_node_ptr new_node;
        new_node.ptr=new node(data);
        new_node.external_count=1;
        new_node.ptr->next=head.load();
        while(!head.compare_exchange_weak(new_node.ptr->next,new_node));
    }
};

It says below the code:

On those platforms that support a double-word-compare-and-swap operation, this structure will be small enough for std::atomic to be lock-free.

I believe x86-64 does have support for the double CAS (I cannot remember the name of the instruction off the top of my head).

If I were to check the assembly (and I couldn't see the double CAS instruction) what inline assembly function would I need to write to ensure double-CAS is used?

UPDATE - I think I have found what I was looking for here:

http://blog.lse.epita.fr/articles/42-implementing-generic-double-word-compare-and-swap-.html

template<typename T>
struct DPointer <T,sizeof (uint64_t)> {
public:
  union {
    uint64_t ui[2];
    struct {
      T* ptr;
      size_t count;
    } __attribute__ (( __aligned__( 16 ) ));
  };

  DPointer() : ptr(NULL), count(0) {}
  DPointer(T* p) : ptr(p), count(0) {}
  DPointer(T* p, size_t c) : ptr(p), count(c) {}

  bool cas(DPointer<T,8> const& nval, DPointer<T,8> const& cmp)
  {
    bool result;
    __asm__ __volatile__ (
        "lock cmpxchg16b %1\n\t"
        "setz %0\n"
        : "=q" ( result )
         ,"+m" ( ui )
        : "a" ( cmp.ptr ), "d" ( cmp.count )
         ,"b" ( nval.ptr ), "c" ( nval.count )
        : "cc"
    );
    return result;
  }

  // We need == to work properly
  bool operator==(DPointer<T,8> const&x)
  {
    return x.ptr == ptr && x.count == count;
  }
};
user997112
  • 29,025
  • 43
  • 182
  • 361
  • 1
    You don't need inline-asm. Modern gcc (with `-mcx16`) will use `LOCK CMPXCHG16B` when compiling a `compare_exchange_weak` on a 16B object like `std::atomic`. **See [this answer](http://stackoverflow.com/questions/38984153/implement-aba-counter-with-c11-cas/38991835#38991835) where I included code to do it with (mostly) portable C++11.** – Peter Cordes Oct 26 '16 at 02:44
  • I can close this question as a duplicate of that one, if you think that's appropriate (let me know). – Peter Cordes Oct 26 '16 at 02:46

2 Answers2

2

The oldest versions of the x86_64 do not support this instruction (CMPXCHG16B), which is required for Windows 8.1/64-bit and newer. Afaik this is most of the Athlon64 range (socket 751, 939 and some of the X2's, maybe the first generation (8xx) of Pentium D too)

How to force a compiler to use a certain instruction varies, usually one must use a not wholly portable intrinsic.

Marco van de Voort
  • 25,628
  • 5
  • 56
  • 89
  • "Assume 64-bit x86-64 architecture and Intel 3rd/4th generation CPU" :) – user997112 May 31 '14 at 16:01
  • @user997112 You can certainly assume it, but you need to tell the compiler to assume it too. How you do so depends on the compiler you are using, which you haven't told us. – Alan Stokes May 31 '14 at 19:04
  • 1
    True. I responded to the later line " x86-64 does have support for the double CAS". IOW I meant to state that it is not a generic architecture feature, and thus not part of the base instruction and featureset of compilers. Though in general I don't think you can rely on the compiler reliably doing such things without additional specifiers, so that point is a bit moot. – Marco van de Voort May 31 '14 at 22:25
1

You can assert

std::atomic<T>::is_lock_free()
Petter
  • 11
  • 1