44

Lets just say you have a simple operation that runs on a background thread. You want to provide a way to cancel this operation so you create a boolean flag that you set to true from the click event handler of a cancel button.

private bool _cancelled;

private void CancelButton_Click(Object sender ClickEventArgs e)
{
    _cancelled = true;
}

Now you're setting the cancel flag from the GUI thread, but you're reading it from the background thread. Do you need to lock before accessing the bool?

Would you need to do this (and obviously lock in the button click event handler too):

while(operationNotComplete)
{
    // Do complex operation

    lock(_lockObject)
    {
        if(_cancelled)
        {
            break;
        }
    }
}

Or is it acceptable to do this (with no lock):

while(!_cancelled & operationNotComplete)
{
    // Do complex operation
}

Or what about marking the _cancelled variable as volatile. Is that necessary?

[I know there is the BackgroundWorker class with it's inbuilt CancelAsync() method, but I'm interested in the semantics and use of locking and threaded variable access here, not the specific implementation, the code is just an example.]

There seems to be two theories.

1) Because it is a simple inbuilt type (and access to inbuilt types is atomic in .net) and because we are only writing to it in one place and only reading on the background thread there is no need to lock or mark as volatile.
2) You should mark it as volatile because if you don't the compiler may optimise out the read in the while loop because it thinks nothing it capable of modifying the value.

Which is the correct technique? (And why?)

[Edit: There seem to be two clearly defined and opposing schools of thought on this. I am looking for a definitive answer on this so please if possible post your reasons and cite your sources along with your answer.]

Simon P Stevens
  • 27,303
  • 5
  • 81
  • 107
  • 6
    Yes, you *do* need either `volatile` or `lock` (to act as a memory barrier): see http://stackoverflow.com/questions/458173/can-a-c-thread-really-cache-a-value-and-ignore-changes-to-that-value-on-other-th/458193#458193 – Marc Gravell Aug 03 '09 at 13:02
  • 1
    Does this mean that Simon is going to be left red faced in the mass debate he has been having with EFraim (http://stackoverflow.com/questions/1221839/c-break-out-of-loop-on-button-press/1221854#1221854) – Lloyd Powell Aug 03 '09 at 13:03
  • 1
    @Marc: Nice example, thanks. @ThePower: I'm happy to put my hands up when I'm not sure, so hopefully I'll escape only mildly pink faced =:) – Simon P Stevens Aug 03 '09 at 13:10
  • 2
    Joe Duffy's "Concurrent Programming on Windows" is getting added to my booklist! – Mitch Wheat Aug 03 '09 at 13:11
  • @Mitch: It's a great book, lots of content, though he does occasionally ramble a bit. :) – Greg D Aug 03 '09 at 13:18

5 Answers5

40

Firstly, threading is tricky ;-p

Yes, despite all the rumours to the contrary, it is required to either use lock or volatile (but not both) when accessing a bool from multiple threads.

For simple types and access such as an exit flag (bool), then volatile is sufficient - this ensures that threads don't cache the value in their registers (meaning: one of the threads never sees updates).

For larger values (where atomicity is an issue), or where you want to synchronize a sequence of operations (a typical example being "if not exists and add" dictionary access), a lock is more versatile. This acts as a memory-barrier, so still gives you the thread safety, but provides other features such as pulse/wait. Note that you shouldn't use a lock on a value-type or a string; nor Type or this; the best option is to have your own locking object as a field (readonly object syncLock = new object();) and lock on this.

For an example of how badly it breaks (i.e. looping forever) if you don't synchronize - see here.

To span multiple programs, an OS primitive like a Mutex or *ResetEvent may also be useful, but this is overkill for a single exe.

Community
  • 1
  • 1
Marc Gravell
  • 1,026,079
  • 266
  • 2,566
  • 2,900
  • I think this settles the issue. A reproducible example there of how it can go wrong if you don't use lock or volatile. – Simon P Stevens Aug 03 '09 at 13:47
  • @MarcGravell♦ are there any chances that 2 threads write to the volatile variable at the exact same time ? – onmyway133 Jul 09 '13 at 04:11
  • 1
    @entropy yes, there is nothing to stop that whatsoever. One of the values will win; it is not defined which. Note that the compiler only lets you use `volatile` with types that can be written atomically, so you won't end up with a torn value – Marc Gravell Jul 09 '13 at 06:41
6

_cancelled must be volatile. (if you don't choose to lock)

If one thread changes the value of _cancelled, other threads might not see the updated result.

Also, I think the read/write operations of _cancelled are atomic:

Section 12.6.6 of the CLI spec states: "A conforming CLI shall guarantee that read and write access to properly aligned memory locations no larger than the native word size is atomic when all the write accesses to a location are the same size."

bruno conde
  • 47,767
  • 15
  • 98
  • 117
  • @Mitch: but in this case this means it can screw things up. – EFraim Aug 03 '09 at 13:05
  • Just as a point of correction, locking will have no effect upon whether a read or write to a variable is cached. Locking is NOT a replacement for a volatile variable. – Adam Robinson Aug 03 '09 at 13:08
  • atomicity has little to do with it... if the other thread is still chewing on its own registers then it makes zero difference whether you write the value in 1 chunk or in 26. – Marc Gravell Aug 03 '09 at 13:09
  • 2
    @Adam - the lock acts, I believe, as a memory barrier - serving the same purpose. – Marc Gravell Aug 03 '09 at 13:10
  • @Marc: Monitor.Enter (what lock() does, as you know but others may not) only acquires an exclusive lock, at least according to the documentation. I'm not sure how it could serve as a memory barrier, at least certainly on anything other than what is being locked. The logic required to do that would have to be forward-looking (to see WHAT needs to be protected, since "flushing" all cache and writes would seem to be inefficient) and at a casual glance doesn't really seem possible to my admittedly small mind ;) Is there information somewhere that confirms this? – Adam Robinson Aug 03 '09 at 13:16
  • @Adam & Marc: Any examples on this topic. Is a lock and volatile necessary? – Simon P Stevens Aug 03 '09 at 13:16
  • @Simon: Marc may have more information on this, but from my limited knowledge on the subject, a lock (which is only to be applied to a reference type) is just a means by which you can easily ensure that only one thread is accessing a particular reference at a time. A volatile field is used to ensure that reads and writes to that field (assigning the value directly) go against the actual memory location and not a thread-specific cache. – Adam Robinson Aug 03 '09 at 13:19
  • 1
    Using lock() yields the same guarantees as making a field volatile. – Daniel Brückner Aug 03 '09 at 13:20
  • @Marc, Daniel: Thanks for educating me on this. After doing some reading, I realize now that the semantics of the lock (in that it performs a volatile read on the locked object at the start, then a volatile write at the end) mean that the value of the referenced variable MUST be reread at least once after the lock is initiated. I still think it's probably a good idea to mark a variable that's used in this manner as volatile so that it's painfully clear (and some subtlety of the jitter doesn't bite you, as I'm sure is bound to happen someday). Thanks! – Adam Robinson Aug 03 '09 at 13:28
5

Locking is not required because you have a single writer scenario and a boolean field is a simple structure with no risk of corrupting the state (while it was possible to get a boolean value that is neither false nor true). But you have to mark the field as volatile to prevent the compiler from doing some optimizations. Without the volatile modifier the compiler could cache the value in a register during the execution of your loop on your worker thread and in turn the loop would never recognize the changed value. This MSDN article (How to: Create and Terminate Threads (C# Programming Guide)) addresses this issue. While there is need for locking, a lock will have the same effect as marking the field volatile.

ΩmegaMan
  • 29,542
  • 12
  • 100
  • 122
Daniel Brückner
  • 59,031
  • 16
  • 99
  • 143
  • That's right. Since you either read or set the value of this flag, all you need is a `volatile` attribute. – jpbochi Aug 03 '09 at 13:13
  • But in theory you could have a quite weired scenario if the update of the boolean field is not atomic and a partial update results in a value that is neither true nor false, but this would cause no problems in the given usage scenario. – Daniel Brückner Aug 03 '09 at 13:18
  • @Daniel - a bool update is guaranteed to be atomic by the spec. – Marc Gravell Aug 03 '09 at 13:28
  • 1
    ECMA334v4 §12.5 Atomicity of variable references Reads and writes of the following data types shall be atomic: bool, char, byte, sbyte, short, ushort, uint, int, float, and reference types. In addition, reads and writes of enum types with an underlying type in the previous list shall also be atomic. – Marc Gravell Aug 03 '09 at 13:29
  • Thanks for that. Just had a look at RFC 2119 and noticed that I tended to interpret "shall" like "should" because it similar to the German word for "should". – Daniel Brückner Aug 03 '09 at 13:42
2

For thread synchronization, it's recommended that you use one of the EventWaitHandle classes, such as ManualResetEvent. While it's marginally simpler to employ a simple boolean flag as you do here (and yes, you'd want to mark it as volatile), IMO it's better to get into the practice of using the threading tools. For your purposes, you'd do something like this...

private System.Threading.ManualResetEvent threadStop;

void StartThread()
{
    // do your setup

    // instantiate it unset
    threadStop = new System.Threading.ManualResetEvent(false); 

    // start the thread
}

In your thread..

while(!threadStop.WaitOne(0) && !operationComplete)
{
    // work
}

Then in the GUI to cancel...

threadStop.Set();
yfeldblum
  • 65,165
  • 12
  • 129
  • 169
Adam Robinson
  • 182,639
  • 35
  • 285
  • 343
  • An OS primitive is usually overkill for a single app, IMO. Most things can be done with just a `Monitor`. – Marc Gravell Aug 03 '09 at 13:26
  • @Marc: While a *ResetEvent may not be explicitly required here, I don't think I'd be willing to say that it's overkill for a single application. There are many instances where multiple threads within one application may need to use the functionality of a EventWaitHandle like ManualResetEvent. – Adam Robinson Aug 03 '09 at 13:47
1

Look up Interlocked.Exchange(). It does a very fast copy into a local variable which can be used for comparison. It is faster than lock().

hughdbrown
  • 47,733
  • 20
  • 85
  • 108
  • +1, not always the best but it is one of the three (lock/interlocked/volatile) possibilities. – H H Aug 03 '09 at 13:21
  • Interlocked.Exchange() would not be my first choice for a variable used to control a loop. I'd use an Event/WaitHandle. If you need to get a single integer/boolean variable atomically, it is a good choice. – hughdbrown Aug 03 '09 at 13:33