54

I am making a first attempt at playing with the new Tasks, but something is happening that I don't understand.

First, the code, which is pretty straight-forward. I pass in a list of paths to some image files, and attempt to add a task to process each of them:

public Boolean AddPictures(IList<string> paths)
{
    Boolean result = (paths.Count > 0);
    List<Task> tasks = new List<Task>(paths.Count);

    foreach (string path in paths)
    {
        var task = Task.Factory.StartNew(() =>
            {
                Boolean taskResult = ProcessPicture(path);
                return taskResult;
            });
        task.ContinueWith(t => result &= t.Result);
        tasks.Add(task);
    }

    Task.WaitAll(tasks.ToArray());

    return result;
}

I've found that if I just let this run with, say, a list of 3 paths in a unit test, all three tasks use the last path in the provided list. If I step through (and slow down the processing of the loop), each path from the loop is used.

Can somebody please explain what is happening, and why? Possible workarounds?

abatishchev
  • 98,240
  • 88
  • 296
  • 433
Wonko the Sane
  • 10,623
  • 8
  • 67
  • 92
  • 3
    May I suggest using ReSharper. This particular error and other potential bugs are highlighten for you – Rune FS Jan 13 '11 at 19:53

2 Answers2

100

You're closing over the loop variable. Don't do that. Take a copy instead:

foreach (string path in paths)
{
    string pathCopy = path;
    var task = Task.Factory.StartNew(() =>
        {
            Boolean taskResult = ProcessPicture(pathCopy);
            return taskResult;
        });
    // See note at end of post
    task.ContinueWith(t => result &= t.Result);
    tasks.Add(task);
}

Your current code is capturing path - not the value of it when you create the task, but the variable itself. That variable changes value each time you go through the loop - so it can easily change by the time your delegate is called.

By taking a copy of the variable, you're introducing a new variable each time you go through the loop - when you capture that variable, it won't be changed in the next iteration of the loop.

Eric Lippert has a pair of blog posts which go into this in a lot more detail: part 1; part 2.

Don't feel bad - this catches almost everyone out :(


Note about this line:

task.ContinueWith(t => result &= t.Result);

As pointed out in comments, this isn't thread-safe. Multiple threads could execute it at the same time, potentially stamping on each other's results. I haven't added locking or anything similar as it would distract from the main issue that the question is interested, namely variable capture. However, it's worth being aware of.

formicini
  • 298
  • 4
  • 16
Jon Skeet
  • 1,421,763
  • 867
  • 9,128
  • 9,194
  • 1
    But of course. Forest for the trees and all that. :) – Wonko the Sane Jan 13 '11 at 19:32
  • 1
    this closure problem and in-proper use of Random() must be in the top 5 frequency-wise at SO – BrokenGlass Jan 13 '11 at 19:35
  • Please note that this "bug" (which was originally *by design*) is supposed to be fixed in C#5.0 – Louis Kottmann Apr 25 '12 at 14:32
  • 2
    This behaviour has changed, not only in C#5.0 (as noted in the update on Eric Lipperts blog article), but also in VS2012 if you're targeting 4.0. – Snixtor Mar 01 '13 at 00:42
  • 5
    @Snixtor: That's still the C# 5 compiler. It's important to distinguish between the *language* version you're using, and the *framework* version you're targeting. – Jon Skeet Mar 01 '13 at 06:42
  • What happens if, for example, the iterating list is a List ? – Francesco Bonizzi Feb 18 '16 at 15:47
  • 1
    @FrancescoB.: The same thing - in C# 4, the variable would refer to "the last item referred to". – Jon Skeet Feb 18 '16 at 16:02
  • 2
    This code has a race condition in that you're aggregating the results into the `result` variable from multiple threads without proper synchronization. – Servy Nov 14 '18 at 15:49
  • @Servy: Thanks - I've added a note about this. Let me know if you think it's appropriate for the context. – Jon Skeet Nov 14 '18 at 23:20
  • Yeah, the note is fine. I only bothered to mention it because someone posted a question today in which they were just copying the code here. I'm aware that the issue was in the original question, and that it's not related to what was being asked about. – Servy Nov 14 '18 at 23:26
  • That `copy` note saved my life! it's so simple (_and maybe stupid at first_) but I could not figure that out by myself, even if I thought for 100 years! – Mehdi Dehghani Apr 14 '19 at 10:37
  • 1
    @MehdiDehghani: Note that this was back in C# 4 days. C# 5 fixed the way that `foreach` variables are handled in closures, so *most* of the time you don't need to think about this copying. (You do for a `for` variable though.) – Jon Skeet Apr 14 '19 at 16:44
  • I did this cache thing before reading any answers and it worked, but still was afraid of how I did solve it myself. Thanks for the ans ;) – Deepak Jan 15 '21 at 18:45
12

The lambda that you're passing to StartNew is referencing the path variable, which changes on each iteration (i.e. your lambda is making use of the reference of path, rather than just its value). You can create a local copy of it so that you aren't pointing to a version that will change:

foreach (string path in paths)
{
    var lambdaPath = path;
    var task = Task.Factory.StartNew(() =>
        {
            Boolean taskResult = ProcessPicture(lambdaPath);
            return taskResult;
        });
    task.ContinueWith(t => result &= t.Result);
    tasks.Add(task);
}
bdukes
  • 152,002
  • 23
  • 148
  • 175