1

I have a ConcurrentBag collection, and for every item in this, i do a task for do some work.

I know that Parallel.ForEach with ConcurrentBag is thread safe, but i never done parallelism in this way.

My code looks like this

public void Work(IList<MyModel> data)
{
   var tasks = new ConcurrentBag<Task>();
   var safeData = new ConcurrentBag<MyModel>(data);
   foreach (var item in safeData )
   {
      var task = Task.Factory.StartNew(() => SomeTask(item));
      tasks.Add(task);
   }
   Task.WaitAll(tasks.ToArray());
}

So, this code is thread safe?

Thanks

EDIT: Maybe this question could be: "item" will be thread safe or could change his value between iterations?

EDIT 2: I refer a if this code could fall in this problem.

Edit 3: In this question, "Wonko the sane" is having a problem when he works directly with a loop variable. I thought that this problem could be corrected using ConcurrentBag, but in some answers, tell me that i don't need ConcurrentBag at all. So:

Is a good idea use directly a loop variable?

When is recomended copy the value to another variable?

Is safe to work directly with a loop variable from ConcurrentBag?

Community
  • 1
  • 1
Vladimir Venegas
  • 3,894
  • 5
  • 25
  • 45
  • Which part of this single threaded code you expect to have problems with? (Ignoring part that runs `SomeTask(item)` on separate threads which presumably thread safe). – Alexei Levenkov Jun 25 '16 at 03:33
  • Are you asking about closure? Can you provide an explanation of your second wording of question? – Kote Jun 25 '16 at 04:15
  • 1
    The tasks and safeData collections are used by only *one* thread. So always thread-safe and no point in using a ConcurrentBag. The threads party on the data elements, there is no way to tell from the snippet whether those objects are used in a thread-safe way. Tends to work out well since each thread munches on a distinct element but trouble starts when the elements have an object reference to the same object and a worker thread modifies it. – Hans Passant Jun 25 '16 at 10:10

2 Answers2

2

You don't need ConcurrentBag for this code at all.... And you don't even need ConcurrentBag for Parallel.ForEach. Argument to Parallel.ForEach does not need to be thread safe.

Instead you could simply write,

public void Work(IList<MyModel> data)
{
   Parallel.ForEach(data, item=>{
       DoSomeTask(item);
   });
}

alternatively...

public void Work(IList<MyModel> data)
{
   Task.WaitAll(data.Select(item=>{
       DoSomeTask(item);
   }));
}

You need ConcurrentBag only if you are modifying it within the code that executes inside DoSomeTask.

Each item is an isolated instance for each task, unless you have duplicate items in the collection. Even if you have duplicate items in the input list, ConcurrentBag will not solve anything.

Akash Kava
  • 39,066
  • 20
  • 121
  • 167
1

In addition to existing answer:

The problem isn't about multithreading or concurrency. It's about closures.

  1. It's ok to use directly loop variable if you understand what's going on.
  2. In case of foreach it depends on your compiler version. If code will be compiled with older versions it will lead to problem, described in that very question. If you use for loop, you should always copy loop variable. For more details you can see links in Jon's answer.
  3. As already mentioned using ConcurrentBag will not affect the result.
    It's safe to close over loop variable if you have older version of compiler than 4.0.30319.1 (in case of MS compiler). More info about compiler versions.

Also as mentioned in comments to that question, you can use static analysis tools which will highlight such cases.

Community
  • 1
  • 1
Kote
  • 2,116
  • 1
  • 19
  • 20