I'm building a simple LINQ to Objects query that I'd like to parallelize, but I'm wondering whether the order of the statements matters.

e.g.

IList<RepeaterItem> items;

var result = items
        .Select(item => item.FindControl("somecontrol"))
        .Where(ctrl => SomeCheck(ctrl))
        .AsParallel();

vs.

var result = items
        .AsParallel()
        .Select(item => item.FindControl("somecontrol"))
        .Where(ctrl => SomeCheck(ctrl));

Would there be any difference?

Steffen

2 Answers

Absolutely. In the first case, the projection and filtering will be done in series, and only then will anything be parallelized.

In the second case, both the projection and filtering will happen in parallel.

Unless you have a particular reason to use the first version (e.g. the projection has thread affinity, or some other oddness) you should use the second.
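If part of the pipeline really does have to stay on the calling thread, one option is to split the query around AsParallel(). The sketch below is only an illustration under stated assumptions: ThreadBoundProjection and SlowCheck are hypothetical stand-ins, not anything from the question, and the ToList() call is my addition so that the projection is fully evaluated on the calling thread before PLINQ starts consuming the data.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;

public static class PlacementSketch
{
    // Hypothetical stand-in for a projection that must run on the calling thread.
    static int ThreadBoundProjection(int input)
    {
        return input * 2;
    }

    // Hypothetical stand-in for an expensive check that is safe to run concurrently.
    static bool SlowCheck(int value)
    {
        Thread.Sleep(10);
        return value % 3 == 0;
    }

    public static List<int> Run(IEnumerable<int> source)
    {
        return source
            .Select(ThreadBoundProjection) // LINQ to Objects: sequential
            .ToList()                      // force the projection to finish here, on the caller
            .AsParallel()                  // operators after this point are PLINQ operators
            .Where(SlowCheck)              // may run on several worker threads
            .ToList();                     // result order is not guaranteed
    }

    public static void Main()
    {
        List<int> result = Run(Enumerable.Range(0, 20));
        result.Sort(); // sort only to get a stable printout
        Console.WriteLine(string.Join(" ", result)); // 0 6 12 18 24 30 36
    }
}
```

Only the operators that follow AsParallel() are eligible for parallel execution, so here the filter is parallelized while the projection is not.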

EDIT: Here's some test code. Flawed as many benchmarks are, but the results are reasonably conclusive:

using System;
using System.Diagnostics;
using System.Linq;
using System.Threading;

class Test
{
    static void Main()
    {
        var query = Enumerable.Range(0, 1000)
                              .Select(SlowProjection)
                              .Where(x => x > 10)
                              .AsParallel();
        Stopwatch sw = Stopwatch.StartNew();
        int count = query.Count();
        sw.Stop();
        Console.WriteLine("Count: {0} in {1}ms", count,
                          sw.ElapsedMilliseconds);

        query = Enumerable.Range(0, 1000)
                          .AsParallel()
                          .Select(SlowProjection)
                          .Where(x => x > 10);
        sw = Stopwatch.StartNew();
        count = query.Count();
        sw.Stop();
        Console.WriteLine("Count: {0} in {1}ms", count,
                          sw.ElapsedMilliseconds);
    }

    static int SlowProjection(int input)
    {
        Thread.Sleep(100);
        return input;
    }
}

Results:

Count: 989 in 100183ms
Count: 989 in 13626ms

Now there's a lot of heuristic stuff going on in PFX, but it's pretty obvious that the first result hasn't been parallelized at all, whereas the second has.

Jon Skeet

It does matter, and not just for performance. The results of the first and second queries below are not identical: the plain parallel query returns the same elements, but in a different order. You can have parallel processing and keep the original order by using AsParallel().AsOrdered(), as the third query shows.

var SlowProjection = new Func<int, int>((input) => { Thread.Sleep(100); return input; });

var Measure = new Action<string, Func<List<int>>>((title, measure) =>
{
    Stopwatch sw = Stopwatch.StartNew();
    var result = measure();
    sw.Stop();
    Console.Write("{0} Time: {1}, Result: ", title, sw.ElapsedMilliseconds);
    foreach (var entry in result) Console.Write(entry + " ");
    Console.WriteLine(); // end the line so each measurement prints on its own row
});

Measure("Sequential", () => Enumerable.Range(0, 30)
    .Select(SlowProjection).Where(x => x > 10).ToList());
Measure("Parallel", () => Enumerable.Range(0, 30).AsParallel()
    .Select(SlowProjection).Where(x => x > 10).ToList());
Measure("Ordered", () => Enumerable.Range(0, 30).AsParallel().AsOrdered()
    .Select(SlowProjection).Where(x => x > 10).ToList());

Result:

Sequential Time: 6699, Result: 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29
Parallel Time: 1462, Result: 12 16 22 25 29 14 17 21 24 11 15 18 23 26 13 19 20 27 28
Ordered Time: 1357, Result: 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29

I was surprised that the ordered query was not slower than the unordered one, but the result was consistent across 10+ test runs. I investigated a bit and it turned out to be a "bug" in .NET 4.0. In 4.5, AsParallel() is not slower than AsParallel().AsOrdered().

Reference is here:

http://msdn.microsoft.com/en-us/library/dd460677(v=vs.110).aspx
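For anyone who wants to check the ordering behaviour without reading timings, here is a small sketch (my own illustration, not taken from the linked page) that compares each parallel variant against the sequential result. The class and method names are made up for the example. It relies only on AsOrdered() preserving source order and on the unordered query returning the same set of elements.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class OrderCheckSketch
{
    // Baseline: plain LINQ to Objects, always in source order.
    public static List<int> Sequential(int[] source)
    {
        return source.Where(x => x > 10).ToList();
    }

    // PLINQ without ordering: same elements, order not guaranteed.
    public static List<int> Unordered(int[] source)
    {
        return source.AsParallel().Where(x => x > 10).ToList();
    }

    // PLINQ with AsOrdered(): same elements, source order preserved.
    public static List<int> Ordered(int[] source)
    {
        return source.AsParallel().AsOrdered().Where(x => x > 10).ToList();
    }

    public static void Main()
    {
        int[] source = Enumerable.Range(0, 30).ToArray();
        List<int> expected = Sequential(source);

        // The ordered query can be compared element by element.
        Console.WriteLine(Ordered(source).SequenceEqual(expected)); // True

        // The unordered query only matches after sorting.
        Console.WriteLine(Unordered(source).OrderBy(x => x).SequenceEqual(expected)); // True
    }
}
```

If the consumer of the query does not care about order, the unordered form avoids asking PLINQ for a guarantee it does not need.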

Kobor42
  • Can you speak to why .AsOrdered() makes it faster, or is that just a fluke from running it once? I would think maintaining the order would increase execution time, not decrease it. – Phillip Copley Feb 26 '15 at 16:47
  • I was surprised too but the result was consistent across multiple (10+) test runs. I don't know why and I didn't investigate - it wasn't related to the question. :-) – Kobor42 Feb 26 '15 at 17:15