There are different ways of providing thread-safety, and which is used is not specified, but it can happen without needing to lock on reads. Short answer: reads don't need locks because of the way it's structured. Long answer:
The promise made in saying "all methods are thread-safe" is that if those methods are being called simultaneously, they will not corrupt the object or return an incorrect result. Let's consider the object `List<char>`.

Say we have an empty `List<char>` called `list` and three threads call the following simultaneously:

    char c = list[0];
    list.Add('a');
    list.Add('b');

There are four possible correct results:
- `c` contains `'a'`. `list` contains `{'a', 'b'}`. `list.Count` would now return 2.
- `c` contains `'b'`. `list` contains `{'b', 'a'}`. `list.Count` would now return 2.
- An `ArgumentOutOfRangeException` is raised on the first thread. `list` contains `{'a', 'b'}`. `list.Count` would now return 2.
- An `ArgumentOutOfRangeException` is raised on the first thread. `list` contains `{'b', 'a'}`. `list.Count` would now return 2.
All of these are correct because, a list being a list, one inserted item has to come before the other, and whichever ends up first has to be the value returned by `list[0]`. We also have to have thread 1's read treated as happening either after one of these insertions or before both, so we either have that first element put into `c`, or we have an `ArgumentOutOfRangeException` thrown (not a flaw in the thread-safety; looking at the first item of an empty list is out of range, regardless of threading concerns).

Since however `List<char>` is not thread-safe, the following can happen:
- One of the above possibilities (we got lucky).
- `list` contains `{'a'}`. `list.Count` returns 1. `c` contains `'b'`.
- `list` contains `{'b'}`. `list.Count` returns 1. `c` contains `'a'`.
- `list` contains `{'a'}`. `list.Count` returns 2. `c` contains `'b'`.
- `list` contains `{'b'}`. `list.Count` returns 2. `c` contains `'a'`.
- One or both of the insert calls, and/or the reading call, throws an `IndexOutOfRangeException`. (Note: not the `ArgumentOutOfRangeException` that is acceptable behaviour for the first thread to raise.)
- An assertion failure.
- Any of the above (including the correct behaviour), and then another operation on `list` performed half an hour later raises some exception that `List<char>` is not documented as raising for that method.

In other words, we completely messed up the whole thing to the point where `list` is in a state that the programmer(s) didn't consider it could possibly be in, and all bets are off.
Okay, so how do we build thread-safe collections?
A simple way is to lock on every method that either changes state or depends upon state (which tends to be all of them). For static methods we lock on a static field; for instance methods we lock on an instance field (we could lock on a static field, but then operations on separate objects would block each other). We make sure that if a public method calls another public method, it only locks once (perhaps by having each public method call into a private worker method that doesn't lock). It's possible some analysis would show that we can split all operations into two or more groups that are independent of each other, and have separate locks for each group, but that's unlikely.
The plus is simplicity, the minus is it locks a lot, perhaps blocking lots of mutually-safe operations from each other.
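As a rough sketch of that simple approach (class and member names here are hypothetical, not any real .NET type), a list wrapper that takes a single lock around every operation might look like this:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch: a list made thread-safe by taking one lock
// around every operation that reads or writes state.
public class LockedList<T>
{
    private readonly List<T> _inner = new List<T>();
    private readonly object _sync = new object();

    public void Add(T item)
    {
        lock (_sync) { _inner.Add(item); }
    }

    public T this[int index]
    {
        get { lock (_sync) { return _inner[index]; } }
    }

    public int Count
    {
        get { lock (_sync) { return _inner.Count; } }
    }

    // Works directly on the inner list under one lock acquisition,
    // rather than repeatedly calling the public Add.
    public void AddRange(IEnumerable<T> items)
    {
        lock (_sync)
        {
            foreach (T item in items) _inner.Add(item);
        }
    }
}
```

With this wrapper, the three-thread scenario above can only produce one of the four correct results, at the cost of serialising every operation, including reads that could have safely overlapped.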
Another is a shared-exclusive lock (as .NET provides through the misnamed `ReaderWriterLock` and `ReaderWriterLockSlim` - misnamed because there are other scenarios apart from single-writer multiple-reader, including some that are actually the exact opposite). It tends to be true that read-only operations (as seen from the inside - a read-only operation that uses memoisation isn't really a read-only operation, because it might update the internal cache) can safely run simultaneously, but write operations must be the sole operation happening, and a shared-exclusive lock lets us ensure that.
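A minimal sketch of the shared-exclusive pattern, using the real `ReaderWriterLockSlim` API (the wrapper class itself is hypothetical):

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Sketch: any number of readers may hold the lock together,
// but a writer must hold it alone.
public class SharedExclusiveCounterStore
{
    private readonly Dictionary<string, int> _counts = new Dictionary<string, int>();
    private readonly ReaderWriterLockSlim _lock = new ReaderWriterLockSlim();

    public int Get(string key)
    {
        _lock.EnterReadLock();      // shared: other readers run concurrently
        try
        {
            int value;
            return _counts.TryGetValue(key, out value) ? value : 0;
        }
        finally { _lock.ExitReadLock(); }
    }

    public void Increment(string key)
    {
        _lock.EnterWriteLock();     // exclusive: blocks readers and writers
        try
        {
            int value;
            _counts.TryGetValue(key, out value);
            _counts[key] = value + 1;
        }
        finally { _lock.ExitWriteLock(); }
    }
}
```

Reads no longer block each other, but note that `ReaderWriterLockSlim` still has per-operation overhead, so it only wins when reads are frequent and writes are rare.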
Another is through striped locking. This is how the current implementation of `ConcurrentDictionary` works. Several locks are created, and each write operation is assigned a lock. While other writes can happen at the same time, those that would hurt each other will be attempting to get the same lock. Some operations may require obtaining all locks.
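The stripe-selection idea can be sketched like this (a simplified, hypothetical illustration; `ConcurrentDictionary`'s real implementation is considerably more involved):

```csharp
using System;

// Hypothetical sketch of striped locking: writes take only the lock
// for the stripe their key hashes to, so writes on different stripes
// can proceed in parallel.
public class StripedLockSet
{
    private readonly object[] _locks;

    public StripedLockSet(int stripes)
    {
        _locks = new object[stripes];
        for (int i = 0; i < stripes; i++) _locks[i] = new object();
    }

    // Pick the lock that guards this key's stripe.
    public object LockFor(object key)
    {
        int hash = key.GetHashCode() & 0x7FFFFFFF; // force non-negative
        return _locks[hash % _locks.Length];
    }

    // Some operations (e.g. a resize, or a fully consistent count)
    // need every lock; acquire them in a fixed order to avoid deadlock.
    public void RunUnderAllLocks(Action action)
    {
        AcquireFrom(0, action);
    }

    private void AcquireFrom(int index, Action action)
    {
        if (index == _locks.Length) { action(); return; }
        lock (_locks[index]) { AcquireFrom(index + 1, action); }
    }
}
```

Two keys that hash to different stripes can be written simultaneously; two keys on the same stripe contend, which is the trade-off striping makes against a single global lock.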
Another is through lock-free synchronisation (sometimes called "low-lock" because there are features of the primitives used that are similar to locks in some ways but not in others). https://stackoverflow.com/a/3871198/400547 shows a simple lock-free queue, and `ConcurrentQueue` uses different lock-free techniques again (in .NET; last time I looked, Mono's was close to that example).
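To give the flavour of the technique, here is a sketch of a lock-free stack (a Treiber stack - simpler than the queue in the linked answer, but built on the same `Interlocked.CompareExchange` retry loop; the class name is my own):

```csharp
using System;
using System.Threading;

// Sketch of lock-free synchronisation: push and pop retry with
// Interlocked.CompareExchange instead of taking a lock.
public class LockFreeStack<T>
{
    private class Node
    {
        public readonly T Value;
        public Node Next;
        public Node(T value) { Value = value; }
    }

    private Node _head;

    public void Push(T value)
    {
        Node node = new Node(value);
        while (true)
        {
            Node head = _head;
            node.Next = head;
            // Publish the new node only if no other thread moved _head meanwhile.
            if (Interlocked.CompareExchange(ref _head, node, head) == head) return;
        }
    }

    public bool TryPop(out T value)
    {
        while (true)
        {
            Node head = _head;
            if (head == null) { value = default(T); return false; }
            if (Interlocked.CompareExchange(ref _head, head.Next, head) == head)
            {
                value = head.Value;
                return true;
            }
        }
    }
}
```

No thread ever blocks: a thread that loses the race simply re-reads `_head` and tries again. (The garbage collector conveniently protects this particular structure from the ABA problem, since a node's address can't be reused while another thread still holds a reference to it.)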
While lock-free approaches may automatically sound like the best way to go, that isn't always the case. My own lock-free dictionary at https://github.com/hackcraft/Ariadne/blob/master/Collections/ThreadSafeDictionary.cs does better than `ConcurrentDictionary`'s striped locking in some cases, but not as well in many others. Why? Well, for a start `ConcurrentDictionary` has almost certainly had more hours put into tuning it than I've been able to give mine yet! Seriously though, making sure a lock-free approach is safe requires logic that costs cycles and memory just the same as any other code. Sometimes it outweighs the cost of a lock, and sometimes it doesn't. (Edit: currently, I now beat `ConcurrentDictionary` in a wide range of cases, but then the people behind `ConcurrentDictionary` are good, so maybe the balance will return to their favour in more cases with their next update.)
Finally though, there's the simple fact that some structures are thread-safe for some operations in and of themselves. `int[]`, for example, is thread-safe in so far as if several threads are reading and writing to and from `myIntArray[23]`, then while there are a variety of possible results, none of them corrupt the structure, and all the reads will see either the initial state or one of the values that one of the threads was writing. You might though want to use memory barriers to make sure that the reader threads weren't still seeing the initial state long after the last writer was finished, due to a stale CPU cache.
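A sketch of that, using the `Volatile` class for the barriers (available in .NET 4.5 and later; `Thread.MemoryBarrier` is the older route, and the names here are my own):

```csharp
using System;
using System.Threading;

// Element reads and writes on an int[] are atomic: no thread ever
// sees a torn value. Volatile.Write/Read add the memory barriers that
// stop a reader from indefinitely seeing a stale cached value.
public static class Flags
{
    private static readonly int[] _slots = new int[32];

    public static void Publish(int index, int value)
    {
        Volatile.Write(ref _slots[index], value); // release: visible to subsequent reads
    }

    public static int Read(int index)
    {
        return Volatile.Read(ref _slots[index]);  // acquire: observes the latest write
    }
}
```

Note that this only makes individual element accesses safe; a compound operation like read-modify-write on the same slot still needs `Interlocked` or a lock.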
Now, what is true of `int[]` is also true of `object[]` (but note, it is not true of an array of a value type larger than what the CPU it's running on will handle atomically). Since `int[]` and `object[]` are used internally in some dictionary-based classes - `Hashtable` and, of particular interest here, `Cache` being examples - it is possible to construct them in such a way that the structures are naturally thread-safe for a single writer thread and multiple reader threads (not for multiple writers, because every now and then a writer has to resize the internal store, and that's harder to make thread-safe). This is the case for `Hashtable` (IIRC it's also the case for `Dictionary<TKey, TValue>` for some choices of `TKey` and `TValue`, but the behaviour is not guaranteed, and it is most certainly not true for some other choices of `TKey` and `TValue`).
Once you've got a structure that is naturally thread-safe for multiple readers/single writer, making it thread-safe for multiple readers/multiple writers is a matter of locking on the writes.

The end result is that you've got all the benefits of thread-safety, but the costs don't affect readers at all.
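That final pattern can be sketched in miniature (a hypothetical illustration, not how `Hashtable` or `Cache` are actually built; here the "naturally reader-safe" structure is a snapshot array that readers only ever see fully built):

```csharp
using System;
using System.Threading;

// Sketch: a structure that is naturally safe for many readers and one
// writer (readers only dereference a fully-constructed array), made
// safe for many writers by locking the writes alone.
public class SnapshotList<T>
{
    private T[] _items = new T[0];
    private readonly object _writeLock = new object();

    // Readers take no lock at all: they read one reference and work on
    // the immutable snapshot it points to.
    public int Count
    {
        get { return Volatile.Read(ref _items).Length; }
    }

    public T Get(int index)
    {
        return Volatile.Read(ref _items)[index];
    }

    // Writers serialise against each other with an ordinary lock, then
    // publish the new snapshot in a single reference write.
    public void Add(T item)
    {
        lock (_writeLock)
        {
            T[] old = _items;
            T[] next = new T[old.Length + 1];
            Array.Copy(old, next, old.Length);
            next[old.Length] = item;
            Volatile.Write(ref _items, next);
        }
    }
}
```

Readers pay nothing beyond an ordinary field read; only writers contend on the lock, which is exactly the cost profile described above.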