3

I've been reading about non-blocking approaches for some time. Here is a piece of code for so called lock-free counter.

public class CasCounter {
private SimulatedCAS value;

public int getValue() {
    return value.get();
}

public int increment() {
    int v;
    do {
        v = value.get();
    }
    while (v != value.compareAndSwap(v, v + 1));
    return v + 1;
}

}

I was just wondering about this loop:

do {
    v = value.get();
}
while (v != value.compareAndSwap(v, v + 1));

People say:

So it tries again, and again, until all other threads trying to change the value have done so. This is lock free as no lock is used, but not blocking free as it may have to try again (which is rare) more than once (very rare).

My question is:

How can they be so sure about that? As for me I can't see any reason why this loop can't be infinite, unless JVM has some special mechanisms to solve this.

Max
  • 601
  • 8
  • 21
  • 1
    And what would be the advantages of this approach (if it would work)? – Alex Calugarescu Jan 28 '12 at 09:00
  • 1
    Does compareAndSwap return the old value or the new one? – Tudor Jan 28 '12 at 09:21
  • 1
    @Tudor, I am not sure. In fact I've never tried this myself. However, I read that in different languages could be different implementations: ones return boolean, others - the value should check yourself. – Max Jan 28 '12 at 09:42
  • 1
    @Alex I believe the whole approach works just fine. The advantages of this is that few more loops much cheaper than traditional lock approach where thread rescheduling appears. – Max Jan 28 '12 at 09:45
  • 1
    @Tudor don't count my previous comment. I forgot that it used pseudo CAS class SimulatedCAS. In real life that would be 'AtomicInteger' or 'AtomicLong' which have a method 'incrementAndGet()' – Max Jan 28 '12 at 10:01
  • 1
    @bestsss - would you care to elaborate on that comment? Are you talking about the compareAnd*Set* method on the various Atomic objects? Because the OP isn't talking about that, but a generic CAS operation. – kdgregory Jan 28 '12 at 14:07
  • 1
    @Max I strongly believe this approach does not work at all and the atomicity of a code block can not be guaranteed. Not even the v+1 operation is atomic so supposing that 2 threads enter the call at the same time and do your swap, they will exit your loop and increment your value by only 1. The atomicity of an operation (block of code) is guaranteed by the operating system by using special processor interrupts. At the higher level you have the p() and v() directives which are used to handle those interrupts. Further reading is required on subject to fully understand (no more characters left) – Alex Calugarescu Jan 28 '12 at 15:14
  • 1
    PS: The JVM does not do any "magic" to correct bad synchronization. Maybe this entertaining reading will help understanding threads better : http://en.wikipedia.org/wiki/Producer-consumer_problem – Alex Calugarescu Jan 28 '12 at 15:20
  • 2
    @Alex Calugarescu: v+1 does not need to be atomic because v is a local variable. The code posted is correct. And I don't see how a reading about producer-consumer will help in this case. – Tudor Jan 28 '12 at 16:25
  • 1
    @Tudor : The v+1 was just an example to make a point. The reading about the producer-consumer is a great way to better understand threading and why the code posted here doesn't work. When talking about thread-safety and atomicity, "it could be" and "probably" does not apply really well to robust software. – Alex Calugarescu Jan 28 '12 at 16:50
  • 2
    But the code is correct and safe. I don't see where you are objecting. – Tudor Jan 28 '12 at 16:54
  • 1
    @kdgregory, you have to implement CAS somehow, you have option to either CAS primitive or synchronized(monitor) [that's in effect CAS+more stuff], if drift away from the primitive and go for monitor/synchronized it's already a much higher contention that defeats the purpose. You can talk about STM (software tx memory) but it's usually not impl. like that either, CAS on read is usually expensive. So in the end: while the CAS is usually an exchange in the CPU (i.e. no boolean), in java you end up w/ another branch (either == or !=) just to return the former value. It's inefficient and helps not – bestsss Jan 28 '12 at 17:53
  • 1
    @bestsss - huh? does that rambling comment have any bearing at all on either what the OP said or my question to you? – kdgregory Jan 28 '12 at 22:21
  • 1
    @kdgregory, that's an exact answer to your question. There shall be a way to implement CAS in java. The Original question is marked both java&cas - there is no generic CAS and I thought that was quite clear. While x86/x64 indeed has an exchange CPU operation, it's not universal. LL/SC is more like java (and boolean) – bestsss Jan 28 '12 at 23:55

2 Answers2

3

Edit: I think I have a satisfactory answer now. The bit that confused me was the 'v != compareAndSwap'. In the actual code, CAS returns true if the value is equal to the compared expression. Thus, even if the first thread is interrupted between get and CAS, the second thread will succeed the swap and exit the method, so the first thread will be able to do the CAS.

Of course, it is possible that if two threads call this method an infinite number of times, one of them will not get the chance to run the CAS at all, especially if it has a lower priority, but this is one of the risks of unfair locking (the probability is very low however). As I've said, a queue mechanism would be able to solve this problem.

Sorry for the initial wrong assumptions.

Tudor
  • 61,523
  • 12
  • 102
  • 142
  • saying _should implement an arrival queue_ you mean a developer should care about this or it's already implemented in behind the stage? – Max Jan 28 '12 at 10:05
  • It's not implemented. Repeatedly checking a non-blocking operation is by definition unfair. You can implement it yourself, but I hardly see the benefit, since the code you posted would not be used in practice. – Tudor Jan 28 '12 at 10:13
  • Can't find anything specific in AtomicInteger class implementation though. – Max Jan 28 '12 at 10:13
  • Atomic operations such as these are usually implemented by using an atomicity bit in the messages going from the CPU to the memory module where the value is located. Several messages can be sent in a sequence with the atomicity bit set and in the meantime the memory module will not accept transactions from other threads. – Tudor Jan 28 '12 at 10:16
  • 1
    hm why the code I posted wouldn't be used in practice? It's pretty mush the same as implementation of decrementAndGet() method of AtomicInteger for instance. My initial question still in the place.:) – Max Jan 28 '12 at 10:26
  • Are you sure this is the implementation of degrementAndGet? I would expect that by definition the entire method would be atomic, but, as I've shown, in your code it's possible that another thread can interrupt the operation while another thread is trying to do the compareAndSwap. – Tudor Jan 28 '12 at 10:36
  • my sample is pseudo code which shows the whole logic of CAS using. So method compareAndSwap() of the class SimulatedCAS is atomic, since it's synchronised. Take look into AtomicInteger code [link](http://www.docjar.com/html/api/java/util/concurrent/atomic/AtomicInteger.java.html) – Max Jan 28 '12 at 10:43
  • @Max I've made an edit. Should be clear now. Ignore my initial post. – Tudor Jan 28 '12 at 16:51
  • thx for your answer. Sorry for misleading question. I guess starvation means that all threads can't do anything usefull. I was worried about at least 1 thread doing nothing(which still a problem). However, as you said, the reason, why the probability of that is very low, is that body of loop is very small. Hence, even if there was a context switch, during next quant there will be enough time to execute one loop without interruption. Btw, do you know if there are some instructions underneath, saying that this couple of instructions(one iteration of our loop) should be done uninterrupted? – Max Jan 29 '12 at 07:43
  • @Max, actually, as I've said in my edit, starvation means that only one thread will actually do nothing (for a very limited amount of time), since the non-starved threads will just run the loop only once and exit. Only the starved thread might execute the loop several times. The only way to ensure atomicity of an entire block of code is to use locks (like the synchronized block in java), but that approach is blocking and will only hurt performance. Even with this small starvation problem, the non-blocking code is much faster than using a lock. – Tudor Jan 29 '12 at 09:07
3

The loop can be infinite (since it can generate starvation for your thread), but the likelihood for that happening is very small. In order for you to get starvation you need some other thread succeeding in changing the value that you want to update between your read and your store and for that to happen repeatedly.

It would be possible to write code to trigger starvation but for real programs it would be unlikely to happen.

The compare and swap is usually used when you don't think you will have write conflicts very often. Say there is a 50% chance of "miss" when you update, then there is a 25% chance that you will miss in two loops and less than 0.1% chance that no update would succeed in 10 loops. For real world examples, a 50% miss rate is very high (basically not doing anything than updating), and as the miss rate is reduces, to say 1% then the risk of not succeeding in two tries is only 0.01% and in 3 tries 0.0001%.

The usage is similar to the following problem

Set a variable a to 0 and have two threads updating it with a = a+1 a million times each concurrently. At the end a could have any answer between 1000000 (every other update was lost due to overwrite) and 2000000 (no update was overwritten).

The closer to 2000000 you get the more likely the CAS usage is to work since that mean that quite often the CAS would see the expected value and be able to set with the new value.

Roger Lindsjö
  • 11,330
  • 1
  • 42
  • 53
  • good one. I was thinking that it's all about real statistics and likelihood itself. And I guess the key point here in your words: **The compare and swap is usually used when you don't think you will have write conflicts very often**. And I took the situation when 2 threads did this operation all the time. – Max Jan 29 '12 at 11:31