
Consider the following code:

ConcurrentBag<string> results = new ConcurrentBag<string>();
Parallel.ForEach(stringArray, myString =>
{
    if(myString.Contains("hello")){
        results.Add(myString);
    }
});
Console.WriteLine(String.Concat(results));

where stringArray is an array of hundreds of thousands of strings, 20 of which contain the word "hello" (they are evenly distributed). No matter how many times I execute the code, the output is always the same. I expected the outputs to differ somewhat from one another, since the order of execution of each ForEach iteration is not guaranteed, so why do I always get the same result? Is this normal? I'm wondering whether the execution is really split across multiple threads.

For context, I am on Windows 10 with an 8-core, 16-thread CPU. I tried increasing MaxDegreeOfParallelism, still without any difference.

Guru Stron
Alessandro
  • put them in an async function and add a random timer in it to add some spice ;) – ilkerkaran May 18 '23 at 14:33
  • I'm confused. The `results` are read after the `Parallel.ForEach` is complete. Why wouldn't the results be consistent? I'd be more concerned if they were not consistent... If you were reading `results` *during* the `ForEach`, yes, I'd expect some variation between runs. – Heretic Monkey May 18 '23 at 14:37
  • Just FYI on [`ConcurrentBag` usage](https://stackoverflow.com/a/64823123/2501279) – Guru Stron May 18 '23 at 14:42
  • @HereticMonkey, the results are read from the collection after the `ForEach`, but they're added to that collection during the `ForEach`. Presumably the OP is expecting them to be added in slightly different order on different occasions – jmcilhinney May 18 '23 at 14:42
  • @HereticMonkey it seems that OP is concerned about order of the results. – Guru Stron May 18 '23 at 14:43
  • I expect that, because the work being done is so simple, that there just isn't the opportunity to create variation. There would have to be some randomness to the overall operation for things to be different and maybe there just can't be when what's being done is so basic. – jmcilhinney May 18 '23 at 14:45
  • If you have 20 instances 'evenly distributed' within hundreds of thousands of strings, it's not surprising that the results are consistent as each instance is perhaps 10,000 lines from the next. – stuartd May 18 '23 at 14:46
  • Just because it's not guaranteed to be in order, doesn't mean it's guaranteed to *not* be in order. – Charlieface May 18 '23 at 14:46

1 Answer


is an array of hundreds of thousands of strings and 20 of them contain the word "hello"

You have a lot of strings and only a small number of them match the criteria, so unless the matching ones are placed near each other, it is highly likely that you will get the strings in the same order.

Imagine the data being processed on several threads. Your handler is quite fast and its execution time should not vary much, and the input is still handed to the threads in an order that depends on the original array order. To see variance in the result order, several matching values would have to sit close enough together, and an "earlier" one would sometimes have to take (much) longer to process than a later one.
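To check whether the loop is really split across threads, which the question also asks about, one option is to record the managed thread ID inside the loop body. The following is only a minimal sketch: `ThreadCheck`, `ThreadsUsed`, the generated test data and the `SpinWait` workload are hypothetical names and values chosen for illustration, not part of the original code.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ThreadCheck
{
    // Returns the distinct managed thread IDs that executed the loop body.
    public static int[] ThreadsUsed(string[] items)
    {
        var ids = new ConcurrentDictionary<int, byte>();
        Parallel.ForEach(items, item =>
        {
            // Record the thread running this iteration (duplicates are ignored).
            ids.TryAdd(Environment.CurrentManagedThreadId, 0);
            Thread.SpinWait(200); // tiny simulated workload (assumption)
        });
        return ids.Keys.ToArray();
    }

    public static void Main()
    {
        // Hypothetical input: 200,000 strings, every 10,000th contains "hello".
        string[] data = Enumerable.Range(0, 200_000)
            .Select(i => i % 10_000 == 0 ? "hello " + i : "item " + i)
            .ToArray();

        Console.WriteLine("Distinct threads used: " + ThreadsUsed(data).Length);
    }
}
```

On a multi-core machine this will typically report more than one thread, but the exact count depends on the scheduler and the thread pool, so it can differ between runs. Seeing several threads while the output order stays the same is consistent with the explanation above.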

For example, the following forces the first matching string to finish late:

ConcurrentBag<string> results = new ConcurrentBag<string>(); // or, better, use a List with locking on addition
int mark = 0;
Parallel.ForEach(stringArray, myString =>
{
    if (myString.Contains("hello"))
    {
        // Interlocked.Increment returns the incremented value, so the first match sees 1
        if (Interlocked.Increment(ref mark) == 1)
        {
            Thread.Sleep(50_000); // delay the first matching string by 50 seconds
        }
        results.Add(myString);
    }
});

This should, in theory, produce a different order (50 seconds is an arbitrary number, which you may need to adjust based on the actual "distance" between the first and second matching strings and on the handler's processing time).
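A scaled-down, self-contained variant of the same experiment may be easier to try. It is only a sketch under assumptions: `OrderDemo`, `CollectMatches`, the six-element array and the 500 ms delay are made up for illustration. It uses a `List` guarded by a lock instead of `ConcurrentBag`, so the list reflects the order in which items were actually added.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class OrderDemo
{
    // Collects the strings containing "hello", delaying the first match found.
    public static List<string> CollectMatches(string[] items, int firstMatchDelayMs)
    {
        var results = new List<string>();
        var gate = new object();
        int mark = 0;

        Parallel.ForEach(items, item =>
        {
            if (!item.Contains("hello"))
            {
                return; // not a match, nothing to record
            }

            // Interlocked.Increment returns the incremented value,
            // so only the first match to arrive sees 1 and sleeps.
            if (Interlocked.Increment(ref mark) == 1)
            {
                Thread.Sleep(firstMatchDelayMs);
            }

            // List<T> is not thread-safe, so additions are serialized.
            lock (gate)
            {
                results.Add(item);
            }
        });

        return results;
    }

    public static void Main()
    {
        // Hypothetical input with three matches placed close together.
        string[] data = { "hello A", "x", "y", "hello B", "z", "hello C" };
        Console.WriteLine(string.Join(", ", CollectMatches(data, 500)));
    }
}
```

If more than one thread takes part, the delayed match usually ends up after the others, but which match gets delayed and the final order both depend on scheduling, so no particular output is guaranteed.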

P.S.

To quote @Charlieface from the comments:

Just because it's not guaranteed to be in order, doesn't mean it's guaranteed to not be in order.

Guru Stron