9

I have a thread that spins until an int changed by another thread is a certain value.

int cur = this.m_cur;
while (cur > this.Max)
{
    // spin until cur is <= max
    cur = this.m_cur; 
}

Does this.m_cur need to be declared volatile for this to work? Is it possible that this will spin forever due to compiler optimization?

svick
  • 236,525
  • 50
  • 385
  • 514
noctonura
  • 12,763
  • 10
  • 52
  • 85
  • 9
    Make the int a property and signal the thread, (autoResetEvent, maybe), in the setter method. Problem bypassed, CPU use reduced, volatile doubt removed. – Martin James Apr 25 '12 at 00:27
  • This is typically a bad idea except in a few rare cases; do you happen to know how many microseconds maximum you expect to spin for? – Eric Lippert Apr 25 '12 at 06:24
  • CPU-looping while reading 'cur' that is written by another thread will fail to detect cur out-of-limit if the polling thread is not running when the setter thread writes an out-of-limit value. If it has been preempted on an overloaded box, it will have to wait for an average half-a-quantum before detecting the out-of-range cur. If cur becomes back in range again while the poller is not running, it will not detect the out-of-range condition at all. – Martin James Apr 25 '12 at 08:46
  • @EricLippert It will only spin between these 2 lines of code in another thread so it should be super fast. The polling thread that spins is also a BelowNormal priority. if(Interlocked.Add(ref this.m_cur, x) > this.Max) this.m_cur = this.Max; – noctonura Apr 25 '12 at 18:35

3 Answers3

12

Yes, that's a hard requirement. The just-in-time compiler is allowed to store the value of m_cur in a processor register without refreshing it from memory. The x86 jitter in fact does, the x64 jitter doesn't (at least the last time I looked at it).

The volatile keyword is required to suppress this optimization.

Volatile means something entirely different on Itanium cores, a processor with a weak memory model. Unfortunately that's what made it into the MSDN library and C# Language Specification. What it is going to to mean on an ARM core remains to be seen.

Hans Passant
  • 922,412
  • 146
  • 1,693
  • 2,536
  • 4
    I love how you answered his question, rather than just saying 'do it another way' like so many like to do. Depending on his needs a signal may be far more efficient, but his question wasn't "...or what is better?" Also, I'm sure many of use learned something new here. Thank you. – payo Apr 25 '12 at 01:17
  • +1: By the way, are we sure properties prevent the JIT compiler from doing the lifting optimization. The property would be simple so it would likely get inlined first. I suppose it is possible to do a 2 phase optimization: inline-lift. I haven't seen anything in the specs that would preclude that, but maybe I didn't look that hard. – Brian Gideon Apr 25 '12 at 02:00
  • 1
    @Brian - I discovered it by being puzzled about properties being used in cases where volatile would be required but without any synchronization. BackgroundWorker.CancellationPending is a good example, a bool. This isn't described anywhere I know of, the semantics of *volatile* are awfully poorly documented. They always have been, before .NET too. – Hans Passant Apr 25 '12 at 02:12
  • @payo - which is why I did not provide an answer, merely a suggestion in a comment. Moving the limit-check to the signaling thread means that no inter-thread comms at all is required until the condition is met. – Martin James Apr 25 '12 at 08:16
  • @HansPassant: Okay, I took my usual "lifting optimization" snippet [here](http://stackoverflow.com/a/6164855/158779) and created a `Stop` property that reads/writes to the `stop` variable. Sure enough I can still demonstrate a problem by omitting `volatile` on the underlying field. So it appears that the 2 phase optimization inline-lift does indeed happen. Interesting finding to say the least. I used .NET 3.5 on a single CPU x86 machine. I will try other arrangements and let you know what I observe. – Brian Gideon Apr 25 '12 at 15:08
  • @BrianGideon - I can get it to go wrong too. C# .NET 4 on i7, 64-bit. – Martin James Apr 26 '12 at 09:23
  • @MartinJames: Yeah, mostly the same results here as well. Adding and using property getter and setters for an underlying field does **not** necessarily suppress the lifting optimization. **Marking the underlying field as `volatile` is still compulsory.** – Brian Gideon May 02 '12 at 14:15
  • 1
    @BrianGideon - confirmed. I found the reason this doesn't go wrong for BackgroundWorker, it inherits MarshalByRefObject so the CancellationPending property doesn't get inlined. Thanks for sticking to your point :) – Hans Passant May 02 '12 at 15:21
  • @HansPassant: Wow, nice catch! I just spent an hour trying to figure out why BackgroundWorker was working correctly. I decompiled it, copied the code into a new probject, and tested. I could not get it to go wrong. After I read your comment I removed the base class `Component` and boom...it started to go wrong. I never would have figured that out! I had no idea `MarshalByRefObject` was the culprit! I will added it as a notable mention to my [memory barrier generator list](http://stackoverflow.com/a/6932271/158779). – Brian Gideon May 02 '12 at 15:50
  • @HansPassant: Man...something still isn't adding up. Removing `Component` from my copied `BackgroundWorker` code definitely breaks it. But, I can still demonstrate a problem by subclassing either `Component` or `MarshalByRefObject` on a simpler class. The JIT compiler must have some fairly complex rules on how these optimizations are performed. This is also making me think that there is in fact a really subtle bug in `BackgroundWorker` that is just waiting to manifest itself in a different framework version. Microsoft really should have added `volatile` to that underlying field. – Brian Gideon May 02 '12 at 16:41
  • @Brian - The jitter is smart about MBRO, if it *knows* that the object isn't marshaled then it doesn't suppress optimizations. Exact rules aren't that clear but you definitely need to keep object construction out of the method that uses properties. – Hans Passant May 02 '12 at 16:57
  • @HansPassant: Okay, yeah, you're right. Not sure what I was doing earlier, but I can confirm that `MarshalByRefObject` *does* suppress optimizations on even the simplest classes. It actually seemed to suppress the lifting optimization of a raw field in addition to suppressing property inlining. Interesting. – Brian Gideon May 02 '12 at 18:50
4

The blog below has some fascinating detail on the memory model in c#. In short, it seems safer to use the volatile keyword.

http://igoro.com/archive/volatile-keyword-in-c-memory-model-explained/

From the blog below

class Test
{
    private bool _loop = true;

    public static void Main()
    {
        Test test1 = new Test();

        // Set _loop to false on another thread
        new Thread(() => { test1._loop = false;}).Start();

        // Poll the _loop field until it is set to false
        while (test1._loop == true) ;

        // The loop above will never terminate!
    }
}

There are two possible ways to get the while loop to terminate: Use a lock to protect all accesses (reads and writes) to the _loop field Mark the _loop field as volatile There are two reasons why a read of a non-volatile field may observe a stale value: compiler optimizations and processor optimizations.

David Z.
  • 5,621
  • 2
  • 20
  • 13
  • 1
    It seems even safer to not use polling loops at all. – Martin James Apr 25 '12 at 00:40
  • True for most cases. Though I'd imagine there are reasons to spin lock too, such as avoid the context switch of yielding a thread. – David Z. Apr 25 '12 at 00:49
  • The trouble of avoding context-switches is that it's not always easily possible. Also, the OP code is not a clsssic boolean-flag spinlock. If the box is overloaded because there are more ready threads than cores, the polling thread may get preempted and not run for a while. During this time, it cannot detect 'cur' going out-of-limit. If cur comes back within limit before the poller runs, the out-of-limit condition has been missed entirely. – Martin James Apr 25 '12 at 08:56
0

It depends on how m_cur is being modified. If it's using a normal assignment statement such as m_cur--;, then it does need to be volatile. However, if it's being modified using one of the Interlocked operations, then it doesn't because Interlocked's methods automatically insert a memory barrier to ensure that all threads get the memo.

In general, using Interlocked to modify atomic valued that are shared across threads is the preferable option. Not only does it take care of the memory barrier for you, but it also tends to be a bit faster than other synchronization options.

That said, like others have said polling loops are enormously wasteful. It would be better to pause the thread that needs to wait, and let whoever is modifying m_cur take charge of waking it up when the time comes. Both Monitor.Wait() and Monitor.Pulse() and AutoResetEvent might be well-suited to the task, depending on your specific needs.

Sean U
  • 6,730
  • 1
  • 24
  • 43
  • The polling loop solution doesn't work reliably anyway, at least, not on an overloaded box. The setter thread can write an out-of-range 'cur' while the poller is not running. If 'cur' comes back in range again before the poller runs, the out-of-range condition is not detected at all. – Martin James Apr 25 '12 at 09:08