@Samuel That's exactly the point: You can NOT correct a problem like this just by modifying the existing class. Nevermind that the class extends List<T>
, which in keeping with MS ways mostly avoids virtual methods (only where MS intends us to override do they make it virtual, which mostly makes sense but can make life more difficult in situations where you kinda need a hack). Even if it embedded the List<T>
and all access to it went through code you can change, you could not simply modify this code adding locks to solve the problem.
Why? Well, iterators for one thing. A collection cannot merely not be modified between each access to the collection. It cannot be modified during the entire enumeration.
In reality, synchronization is a complex matter and depends on what the list really is and what sorts of things need to be true at all times in a system.
To illustrate, here's a hack abusing IDisposable. This embeds the list and redeclares whatever functionality of the list that is used somewhere, so that access to the list can be synchronized. In addition, it does not implement IEnumerable. Instead, the only way to get access to enumerate over the list is via a method that returns a disposable type. This type enters the monitor when created and exits it again when disposed. This ensures the list is not accessible during iteration. However, this still isn't really enough, as my example use will illustrate.
First the hacked list:
public class MyCollection
{
object syncRoot = new object();
List list = new List();
public void Add(T item) { lock (syncRoot) list.Add(item); }
public int Count
{
get { lock (syncRoot) return list.Count; }
}
public IteratorWrapper GetIteratorWrapper()
{
return new IteratorWrapper(this);
}
public class IteratorWrapper : IDisposable, IEnumerable<T>
{
bool disposed;
MyCollection<T> c;
public IteratorWrapper(MyCollection<T> c) { this.c = c; Monitor.Enter(c.syncRoot); }
public void Dispose() { if (!disposed) Monitor.Exit(c.syncRoot); disposed = true; }
public IEnumerator<T> GetEnumerator()
{
return c.list.GetEnumerator();
}
System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator()
{
return GetEnumerator();
}
}
}
Then a console app using it:
class Program
{
static MyCollection strings = new MyCollection();
static void Main(string[] args)
{
new Thread(adder).Start();
Thread.Sleep(15);
dump();
Thread.Sleep(125);
dump();
Console.WriteLine("Press any key.");
Console.ReadKey(true);
}
static void dump()
{
Console.WriteLine(string.Format("Count={0}", strings.Count).PadLeft(40, '-'));
using (var enumerable = strings.GetIteratorWrapper())
{
foreach (var s in enumerable)
Console.WriteLine(s);
}
Console.WriteLine("".PadLeft(40, '-'));
}
static void adder()
{
for (int i = 0; i < 100; i++)
{
strings.Add(Guid.NewGuid().ToString("N"));
Thread.Sleep(7);
}
}
}
Note the "dump" method: It access Count, which is meaninglessly locking the list in an attempt to make it "thread safe", and then iterates through the items. But there is a race condition between the Count getter (which locks, gets the count, then releases) and the using statement. So it may not dump the number of items it things it does.
Here, it may not matter. But what if the code instead did something like:
var a = new string[strings.Count];
for (int i=0; i < strings.Count; i++) { ... }
Or even more likely to mess things up:
var n = strings.Count;
var a = new string[n];
for (int i=0; i < n; i++) { ... }
The former will blow up if items are concurrently added to the list. The latter is not going down well if items are removed from the list. And in both cases the semantics of the code may not be unaffected by changes to the list, even if the changes don't cause the code to crash. In the first case, perhaps items are removed from the list, causing the array not to be filled. Subsequently something far removed in code crashes because of null values in the array.
The lesson to take from this is: Shared state can be very tricky. You need an upfront plan and you need to have a strategy for how you're going to ensure the meaning is right.
In each of these cases, correct operation would be achieved only by making sure the lock spans all the related operations. There could be lots of other statements in between and the lock could span many method calls and/or involve many different objects. There is no magic bullet that allows these problems to be fixed merely by modifying the collection itself, because the correct synchronization depends on what the collection means. Something like a cache of readonly objects can probably be synchronized only with respect to add/remove/lookup
, but if the collection itself is supposed to represent some meaningful/important concept, this will never be sufficient.