3

This is a question not about how LongAdder works, it's about an intriguing implementation detail that I can't figure out.

Here is the code from Striped64 (I've cut out some parts and left the relevant parts for the question):

    final void longAccumulate(long x, LongBinaryOperator fn,
                          boolean wasUncontended) {
    int h;
    if ((h = getProbe()) == 0) {
        ThreadLocalRandom.current(); // force initialization
        h = getProbe();
        wasUncontended = true;
    }
    boolean collide = false;  // True if last slot nonempty
    for (;;) {
        Cell[] as; Cell a; int n; long v;
        if ((as = cells) != null && (n = as.length) > 0) {
            if ((a = as[(n - 1) & h]) == null) {
                //logic to insert the Cell in the array
            }
            // CAS already known to fail
            else if (!wasUncontended)   {
                wasUncontended = true;      // Continue after rehash
            }
            else if (a.cas(v = a.value, ((fn == null) ? v + x : fn.applyAsLong(v, x)))){
                break;
            }

A lot of things from code are clear to me, except for the :

        // CAS already known to fail
        else if (!wasUncontended)   {
            wasUncontended = true;      // Continue after rehash
        }

Where does this certainty that the following CAS will fail? This is really confusing for me at least, because this check only makes sense for a single case : when some Thread enters the longAccumulate method for the n-th time (n > 1) and the busy spin is at it's first cycle.

It's like this code is saying : if you (some Thread) have been here before and you have some contention on a particular Cell slot, don't try to CAS your value to the already existing one, but instead rehash the probe.

I honestly hope I will make some sense for someone.

Eugene
  • 117,005
  • 15
  • 201
  • 306

1 Answers1

4

It's not that it will fail, it's more that it has failed. The call to this method is done by the LongAdder add method.

public void add(long x) {
    Cell[] as; long b, v; int m; Cell a;
    if ((as = cells) != null || !casBase(b = base, b + x)) {
        boolean uncontended = true;
        if (as == null || (m = as.length - 1) < 0 ||
            (a = as[getProbe() & m]) == null ||
            !(uncontended = a.cas(v = a.value, v + x)))
            longAccumulate(x, null, uncontended);
    }
}
  1. The first set of conditionals is related to existence of the long Cells. If the necessary cell doesn't exist, then it will try to accumulate uncontended (as there was no attempt to add) by atomically adding the necessary cell and then adding.
  2. If the cell does exist, try to add (v + x). If the add failed then there was some form of contention, in that case try to do the accumulating optimistically/atomically (spin until successful)

So why does it have

wasUncontended = true;      // Continue after rehash

My best guess is that with heavy contention, it will try to give the running thread time to catch up and will force a retry of the existing cells.

John Vint
  • 39,695
  • 7
  • 78
  • 108
  • 3
    1) I know where this comes from, but thx for adding it either way. 2) Damn it! I thought that *CAS already known to fail* meant that the next CAS operation will fail, not the previous one. I like your guess a lot, they probably had tests that prove this as being an optimal addition to the code. Your input is very much appreciated. – Eugene Dec 13 '16 at 14:00
  • yes, I am 99% sure about your theory now (after looking at the code again). It is counter intuitive to me, but may be it makes instance sense for someone maintaining that code. Again, thank you and I will accept this. – Eugene Dec 13 '16 at 14:15
  • Yup that's the most sensible explanation. Try a tight contended loop of incrementing an integer with CAS sometime. Under heavy contention you can easily have 90+% unsuccessful CAS. – Voo Dec 13 '16 at 17:45
  • @Voo that might be true about 90% (not going to argue), but I wonder *how* the decision was made to *where* actually to retry some of the slots (after a rehash) instead of CAS; because as the code is this is only tried once. As I said, I liked the idea and the answer, but a follow-up from *someone* who actually wrote or reviewed the code would be great. – Eugene Dec 13 '16 at 21:07
  • @Voo and my pseudo-rant here comes from the fact that measuring this and *knowing* where it is accurate to insert such a thing would be a nightmare on different cpus, number of cores, etc. Doug Lea is one hell of a smart guy - that's the only plausible explanation I can think of. :) – Eugene Dec 13 '16 at 21:10