1

Consider the following code block

public class Data
{
    public bool Init { get; set; }
    public string Value {get; set; }
}

public class Example
{

    private Object myObject = new Object();
    private List<Data> myList = new List<Data>
    {
       new Data { Init = true, Value = "abc" },
       new Data { Init = false, Value = "def" },
    };

    public IEnumerable<string> Get()
    {
        lock(this.myObject)
        {
            return this.myList.Where(i => i.Init == true).Select(i => i.Value);
        }
    }

    public void Set(string value)
    {
        lock (this.myObject)
        {
            this.myList.Add(new Data { Init = false, Value = value });
        }
    }
}

If multiple threads are calling Get() - will the method be thread-safe?

In addition - will invoking .ToList() at the linq query will make it thread-safe?

return this.myList.Where(i => i.Init == true).Select(i => i.Value).ToList()

NirMH
  • 4,769
  • 3
  • 44
  • 69
  • 1
    What exactly are you intending on locking? Also remember that Linq uses Deferred Execution. – gunr2171 Jan 09 '23 at 13:09
  • 3
    No. This is not safe. 1) The set doesn't acquire the lock, 2) The actual iteration of the list occurs outside of the lock – canton7 Jan 09 '23 at 13:10
  • 1
    Maybe you should look into built-in thread-safe solutions: https://stackoverflow.com/q/15400133/861716 – Gert Arnold Jan 09 '23 at 13:13
  • (although there's no better solution for list + lock if what you need it a locked list) – canton7 Jan 09 '23 at 13:19
  • Note that while inserting a `.ToList()` might make this thread safe with regards to the list, there is no telling if the complete program will be thread safe. You might still have read-modify-write issues since the list could be modified between get and set calls. – JonasH Jan 09 '23 at 13:31
  • 1
    Is this thread safe or not depends on the usage scenario and what guarantees you need. What exactly would be done from multiple threads? – Guru Stron Jan 09 '23 at 13:41

2 Answers2

3

Note that you do not lock here:

public void Set(string value)
{
    this.myList.Add(new Data { Init = false, Value = value });
}

So it's not thread-safe in any case.

Assuming you just forgot to do that - it's still not safe because Get returns "lazy" IEnumerable. It holds a reference to myList and it will enumerate it only when returned IEnumerable itself will be enumerated. So you are leaking reference to myList you are trying to protect with lock, outside of lock statement to arbitrary code.

You can test it like this:

var example = new Example();
var original = example.Get();
example.Clear(); // new method which clears underlying myList
foreach (var x in original)
    Console.WriteLine(x);

Here we call Get, then we clear myList and then we enumerate what we got from Get. Naively one may assume original will contain original 2 items we had, but it will not contain anything, because it's evaluated only when we enumerate original, and at that point in time - list has already been cleared and is empty.

If you use

public IList<string> Get()
{
    lock(this.myObject)
    {
        return this.myList.Where(i => i.Init == true).Select(i => i.Value).ToList();
    }
}

Then it will be "safe". Now you return not "lazy" IEnumerable but a new instance of List<> with copies of values you have in myList. Note that it's a good idea to change return type to IList here, otherwise caller might pay additional overhead (like calling ToArray or ToList which makes a copy) while it's not necessary in this case.

Evk
  • 98,527
  • 8
  • 141
  • 191
1

You have to be aware about the difference between the potential to enumerate a sequence (= the IEnumerable) and the enumerated sequence itself (the List, Array, etc after you enumerated the sequence.

So you have a class Example, which internally holds a List<Data> in member MyList. Every Data has at least a string property 'Value`.

Class Example has Methods to extract Value, and to add new elements to the MyList.

I'm not sure if it is wise to call them Set and Get, these names are quite confusing. Maybe you've simplified your example (which by the way made it more difficult to talk about it).

You have an object of class Example and two threads, that both have access to this object. You worry, that while one thread is enumerating the elements of the sequence, that the other thread is adding elements of the sequence.

Your Get method returns the "potential to enumerate". The sequence is not enumerated yet after you return from Get, and after the Lock is disposed.

This means, that when you start enumerating the sequence, the Data is not locked anymore. If you've ever returned an IEnumerable from data that you fetched from a database, you probably have seen the same problem: the connection to the database is disposed before you start enumerating.

Solution 1: return enumerated data: inefficient

You already mention one solution: enumerate the data in a List before you return. This way, property MyList will not be accesses after you return from Get, so the lock is not needed anymore:

public IEnumerable<string> GetInitializedValues()
{
    lock(this.MyList)
    {
        return this.MyList
            .Where(data => data.Init == true)
            .Select(data => data.Value)
            .ToList();
    }
}

In words: Lock MyList, which is a sequence of Data. Keep only those Data in this sequence that have a true value for property Init. From every remaining Data, take the value of property Value and put them in a List. Dispose the lock and return the List.

This is not efficient if the caller doesn't need the complete list.

// Create an object of class Example which has a zillion Data in MyList:
Example example = CreateFilledClassExample();

// I only want the first element of MyList:
string onlyOneString = example.GetInitializedValues().FirstOrDefault();

GetInitializedValues creates a list of zillion elements, and returns it. The caller only takes the first initialized value and throws the rest of the list away. What a waste of processing power.

Solution 2: use yield return: only enumerate what must be enumerated

The keyword yield means something like: return the next element of the sequence. Keep everything alive, until the caller disposes the IEnumerator

public IEnumerable<string> GetInitializedValues()
{
    lock(this.MyList)
    {
        IEnumerable<string> initializedValues = this.MyList
            .Where(data => data.Init == true)
            .Select(data => data.Value);

        foreach (string initializedValue in initializedValues)
        {
            yield return initializedValue;
        }
    }
}

Because the yield is inside the lock, the lock remains active, until you dispose the enumerator:

List<string> someInitializedValues = GetInitializedValues()
   .Take(3)
   .ToList();

This one is save, and only enumerates the first three elements.

Deep inside it will do something like this:

List<string> someInitializedValues = new List<string>();
IEnumerable<string> enumerableInitializedValues = GetInitializedValues();
// MyList is not locked yet!

// Create the enumerator. This is Disposable, so use using statement:
using (IEnumerator<string> initializedValuesEnumerator = enumerableInitializedValues.GetEnumerator())
{
    // start enumerating the first 3 elements (remember: we used Take(3)
    while (initializedValuesEnumerator.MoveNext() && someInitializedValues.Count < 3)
    {
        // there is a next element, fetch it and add the fetched value to the list
        string fetchedInitializedValue = initializedValuesEnumerator.Current;
        someInitializedValues.Add(fetchedInitializedValue);
    }

    // the enumerator is not disposed yet, MyList is still locked.
}
// the enumerator is disposed. MyList is not locked anymore
Harald Coppoolse
  • 28,834
  • 7
  • 67
  • 116
  • Brilliant use of the `yield return` inside the `lock`! – Enigmativity Jan 09 '23 at 21:58
  • @Enigmativity you are kidding, right? *“I note that it is also a "worst practice" to do a yield return inside a lock, for the same reason. It is legal to do so, but I wish we had made it illegal. We're not going to make the same mistake for "await".”* [Eric Lippert Sep 30, 2011](https://stackoverflow.com/questions/7612602/why-cant-i-use-the-await-operator-within-the-body-of-a-lock-statement/7612714#7612714) – Theodor Zoulias Jan 10 '23 at 04:42
  • 1
    @TheodorZoulias - Interesting. I haven't seen that before. Time to try figure out how to break that! – Enigmativity Jan 10 '23 at 06:20
  • 1
    @Enigmativity just do `GetInitializedValues().GetEnumerator().MoveNext();` and voilà: to lock is leaked. The next attempt to enumerate the `GetInitializedValues()` from another thread will deadlock. – Theodor Zoulias Jan 10 '23 at 06:38