72

I'm calling a slow webservice in parallel. Things were great until I realized I need to get some information back from the service. But I don't see where to get the values back. I can't write to the database, HttpContext.Current appears to be null inside of a method called using Parallel.ForEach

Below is a sample program (in your mind, please imagine a slow web service instead of a string concatenation)

using System;
using System.Threading.Tasks;

class Program
{
    static void Main(string[] args)
    {
        WordMaker m = new WordMaker();
        m.MakeIt();
    }
    public class WordMaker
    {
        public void MakeIt()
        {
            string[] words = { "ack", "ook" };
            ParallelLoopResult result = Parallel.ForEach(words, word => AddB(word));
            Console.WriteLine("Where did my results go?");
            Console.ReadKey();
        }
        public string AddB(string word)
        {
            return "b" + word;
        }
    }

}
MatthewMartin
  • 32,326
  • 33
  • 105
  • 164

6 Answers6

80

You've discarded it in here.

ParallelLoopResult result = Parallel.ForEach(words, word => AddB(word));

You probably want something like,

ParallelLoopResult result = Parallel.ForEach(words, word =>
{
    string result = AddB(word);
    // do something with result
});

If you want some sort of collection at the end of this, consider using one of the collections under System.Collections.Concurrent, like ConcurrentBag

ConcurrentBag<string> resultCollection = new ConcurrentBag<string>();
ParallelLoopResult result = Parallel.ForEach(words, word =>
{
    resultCollection.Add(AddB(word));
});

// Do something with the result
Theodor Zoulias
  • 34,835
  • 7
  • 69
  • 104
M Afifi
  • 4,645
  • 2
  • 28
  • 48
  • 47
    I think the ParallelLoopResult does nothing useful here. +1 though – usr Sep 27 '12 at 07:41
  • 3
    .AsParallel() in LINQ would be a lot better – Jorge Aguirre Jan 29 '17 at 00:46
  • 3
    @eranotzap That's the issue that is solved by using a ConcurrentBag instead of a List. – Russell Horwood Mar 17 '17 at 16:03
  • @NineTails so it is an issue. only adding not adding and removing. Is only adding to a list a concurrency issue. maybe it's done in an atomic way. meaning that the index in the backing array might be incremented atomically – eran otzap Mar 17 '17 at 18:34
  • @PatrickfromNDependteam your latest edit ([revision 6](https://stackoverflow.com/revisions/12610915/6)) deviates a lot from the original answer, to a point that invalidates the voting system of the site. I would suggest to post your contribution as a separate answer instead. – Theodor Zoulias Jan 26 '22 at 06:23
  • @TheodorZoulias most users will take on the validated solution with an expensive ConcurrentBag<> without realizing that a cheap array is enough to solve their problem. Better update the validated solution with the most efficient solution, assuming the collection size is known upfront which is often the case in practice. – Patrick from NDepend team Jan 26 '22 at 06:47
  • @PatrickfromNDependteam right or wrong, 73 users have upvoted M Afifi's answer because of the content that was posted 10+ years ago. They have not voted it for your contribution, and it's not obvious that your idea of using an array without synchronization is safe and correct. Why don't you post your idea as a separate answer, so that it can be evaluated/voted independently for its content? Related meta post: [Edit to "add more useful answer"](https://meta.stackoverflow.com/questions/291906/edit-to-add-more-useful-answer). – Theodor Zoulias Jan 26 '22 at 07:00
  • @TheodorZoulias ok I posted my own answer but it means that in the future many users won't use the optimal solution for their problems. – Patrick from NDepend team Jan 26 '22 at 07:07
  • @PatrickfromNDependteam IMHO an even better solution for this problem would be to ditch the `Parallel` class, and use PLINQ instead: `var results = words.AsParallel().AsOrdered().Select(w => AddB(w)).ToArray();` Do you think that I should post my idea as a separate answer, or insert it at the bottom of the top voted answer? – Theodor Zoulias Jan 26 '22 at 07:17
  • @TheodorZoulias to validate this solution one has to read documentation of these extension methods and maybe even decompile them to verify no lock are used to make sure it is as optimal as a solution with a raw array and no lock. Btw it has been proposed below with a "I am guessing no lock is involved" note. KISS principle. – Patrick from NDepend team Jan 26 '22 at 07:36
41

Your may consider using AsParallel extension method of IEnumerable, it will take care of the concurrency for you and collect the results.

words.AsParallel().Select(AddB).ToArray()

Synchronisation (e.g. locks or concurrent collections that use locks) are usually bottleneck of concurrent algorithms. The best is to avoid synchronisation as much as possible. I am guessing that AsParallel uses something smarter like putting all the items produced on single thread into a local non-concurrent collection and then combining these at the end.

Steves
  • 2,798
  • 21
  • 21
14

Do not use ConcurrentBag to collect results as it is slower. Use local lock instead.

var resultCollection = new List<string>();
object localLockObject = new object();

Parallel.ForEach<string, List<string>>(
      words,
      () => { return new List<string>(); },
      (word, state, localList) =>
      {
         localList.Add(AddB(word));
         return localList;
      },
      (finalResult) => { lock (localLockObject) resultCollection.AddRange(finalResult); }
); 

// Do something with resultCollection here
Minh Nguyen
  • 2,106
  • 1
  • 28
  • 34
  • 1
    Do you have any stats to show that ConcurrentBag is slower than using our own object lock? I just want to know how slow it is, since it makes my code look cleaner than using object lock. – dinesh ygv Oct 23 '15 at 05:55
  • @dineshygv IMHO Difference is negligible http://stackoverflow.com/questions/2950955/concurrentbagof-mytype-vs-listof-mytype/34016915#34016915 – Matas Vaitkevicius Dec 01 '15 at 10:27
  • Or don't use any locking at all ;-) – Steves Jun 05 '17 at 14:40
6

This seems safe, fast, and simple:

    public string[] MakeIt() {
        string[] words = { "ack", "ook" };
        string[] results = new string[words.Length];
        ParallelLoopResult result =
            Parallel.For(0, words.Length, i => results[i] = AddB(words[i]));
        return results;
    }
Overlord Zurg
  • 3,430
  • 2
  • 22
  • 27
5

In the particular cases where the size of the collection is know upfront - which is often the case in practice - an array can be use instead of an expensive concurrent collection. There is no risk of collision since each loop accesses its own slot in the ouputs array. As a bonus outputs are stored with the same order as inputs:

const int NB_WORDS = 1000;
var inputs = new string[NB_WORDS];
for(var i= 0; i < NB_WORDS; i++) { inputs[i] = i.ToString(); }

var outputs = new string[NB_WORDS];

Parallel.For(0, NB_WORDS, index => {
   string word = inputs[index];
   string result = word + word; // Operation on word
   outputs[index] = result; // No need of a concurrent collection to store the result!
});

Debug.Assert(outputs.All(result => !string.IsNullOrEmpty(result)));
Patrick from NDepend team
  • 13,237
  • 6
  • 61
  • 92
3

How about something like this:

public class WordContainer
{
    public WordContainer(string word)
    {
        Word = word;
    }

    public string Word { get; private set; }
    public string Result { get; set; }
}

public class WordMaker
{
    public void MakeIt()
    {
        string[] words = { "ack", "ook" };
        List<WordContainer> containers = words.Select(w => new WordContainer(w)).ToList();

        Parallel.ForEach(containers, AddB);

        //containers.ForEach(c => Console.WriteLine(c.Result));
        foreach (var container in containers)
        {
            Console.WriteLine(container.Result);
        }

        Console.ReadKey();
    }

    public void AddB(WordContainer container)
    {
        container.Result = "b" + container.Word;
    }
}

I believe the locking or concurrent objects isn't necessary unless you need the results to interact with one another (like you were computing a sum or combining all the words). In this case ForEach neatly breaks your original list up and hands each thread its own object that it can manipulate all it wants without worrying about interfering with the other threads.

MichaC
  • 2,834
  • 2
  • 20
  • 20
  • Yes, that would work for Console apps, but event for Console apps you might want to aggregate them first in a collection, or else you get interleaved results to the Console window. – MatthewMartin Apr 03 '15 at 02:00
  • The Console.WriteLine commands are running synchronously on the main thread and it will print the results in the order they were defined in the original List after Parallel.ForEach finishes processing all the list items and returns. If I were calling WriteLine from within the Parallel.ForEach then yes the results would be interleaved. – MichaC Apr 04 '15 at 00:53