145

I have a class, like this:

public class MyClass
{
    public int Value { get; set; }
    public bool IsValid { get; set; }
}

In actual fact it's much larger, but this recreates the problem (weirdness).

I want to get the sum of the Value, where the instance is valid. So far, I've found two solutions to this.

The first one is this:

int result = myCollection.Where(mc => mc.IsValid).Select(mc => mc.Value).Sum();

The second one, however, is this:

int result = myCollection.Select(mc => mc.IsValid ? mc.Value : 0).Sum();

I want to get the most efficient method. At first, I thought that the second one would be more efficient. Then the theoretical part of me started going "Well, one is O(n + m + m), the other one is O(n + n). The first one should perform better with more invalids, while the second one should perform better with fewer", and I figured they would break even at 50% valid items. EDIT: And then @Marcin pointed out that the Where and the Select are combined, so it should actually be O(m + n). However, if you look below, it seems like this is not related.
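To make @Marcin's point concrete: even written as two separate lazy iterators, the chain only walks the source collection once, because Select pulls items from Where, which pulls them from the source. Here's a minimal sketch of the idea (not the real Enumerable source, which additionally fuses the Where and Select into a single iterator object):

// Minimal sketch only. (Assumes using System; using System.Collections.Generic;)
static IEnumerable<T> SketchWhere<T>(IEnumerable<T> source, Func<T, bool> predicate)
{
    foreach (var item in source)       // the one and only pass over the source
        if (predicate(item))
            yield return item;         // invalid items never reach the next stage
}

static IEnumerable<TResult> SketchSelect<T, TResult>(IEnumerable<T> source, Func<T, TResult> selector)
{
    foreach (var item in source)       // pulls lazily from SketchWhere, not from the source again
        yield return selector(item);
}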


So I put it to the test.

(It's 100+ lines, so I thought it was better to post it as a Gist.)
The results were... interesting.

With 0% tie tolerance:

The scales tip in favour of Where + Select, by about 30 points.

How much do you want to be the disambiguation percentage?
0
Starting benchmarking.
Ties: 0
Where + Select: 65
Select: 36

With 2% tie tolerance:

It's much the same, except that a few runs were now within 2% of each other; I'd call that my minimum margin of error. Where + Select now has just a ~20-point lead.

How much do you want to be the disambiguation percentage?
2
Starting benchmarking.
Ties: 6
Where + Select: 58
Select: 37

With 5% tie tolerance:

This is what I'd call my maximum margin of error. It makes things a bit better for Select, but not much.

How much do you want to be the disambiguation percentage?
5
Starting benchmarking.
Ties: 17
Where + Select: 53
Select: 31

With 10% tie tolerance:

This is well outside my margin of error, but I'm still interested in the result, because Where + Select keeps the twenty-point lead it's had for a while now.

How much do you want to be the disambiguation percentage?
10
Starting benchmarking.
Ties: 36
Where + Select: 44
Select: 21

With 25% tie tolerance:

This is way, way outside my margin of error, but I'm still interested in the result, because Where + Select still (nearly) keeps its 20-point lead. It seems to win decisively in a distinct few cases, and that's what's giving it the lead.

How much do you want to be the disambiguation percentage?
25
Starting benchmarking.
Ties: 85
Where + Select: 16
Select: 0


Now, I'm guessing that the 20-point lead comes from the middle, where both are bound to get roughly the same performance. I could try logging it, but that would be a whole load of information to take in. A graph would be better, I guess.

So that's what I did.

(Graph: Select vs Select + Where.)

It shows that the Select line stays steady (expected) and that the Select + Where line climbs (expected). However, what puzzles me is why it doesn't cross the Select line at 50% or earlier: in fact I was expecting earlier than 50%, since an extra enumerator has to be created for the Select + Where. This shows the 20-point lead, but it doesn't explain why it's there. That, I guess, is the main point of my question.

Why does it behave like this? Should I trust it? If not, should I use the other one or this one?


As @KingKong mentioned in the comments, you can also use Sum's overload that takes a lambda. So my two options are now changed to this:

First:

int result = myCollection.Where(mc => mc.IsValid).Sum(mc => mc.Value);

Second:

int result = myCollection.Sum(mc => mc.IsValid ? mc.Value : 0);

I'll keep this one a bit shorter:

How much do you want to be the disambiguation percentage?
0
Starting benchmarking.
Ties: 0
Where: 60
Sum: 41
How much do you want to be the disambiguation percentage?
2
Starting benchmarking.
Ties: 8
Where: 55
Sum: 38
How much do you want to be the disambiguation percentage?
5
Starting benchmarking.
Ties: 21
Where: 49
Sum: 31
How much do you want to be the disambiguation percentage?
10
Starting benchmarking.
Ties: 39
Where: 41
Sum: 21
How much do you want to be the disambiguation percentage?
25
Starting benchmarking.
Ties: 85
Where: 16
Sum: 0

The twenty-point lead is still there, meaning it doesn't have to do with the Where and Select combination pointed out by @Marcin in the comments.

Thanks for reading through my wall of text! Also, if you're interested, here's the modified version that logs the CSV that Excel takes in.

It'sNotALie.
  • You might also want to test the case `int result = myCollection.Where(mc => mc.IsValid).Sum(mc=>mc.Value)`? – King King Aug 20 '13 at 09:46
  • @KingKing Oh, true. That could also apply to the Select, I guess. I'll add those scores in. – It'sNotALie. Aug 20 '13 at 09:46
  • 1
    I'd say it depends on how expensive the sum and access to `mc.Value` are. – Medinoc Aug 20 '13 at 09:48
  • 14
    @It'sNotALie. `Where`+`Select` does not cause two separated iterations over input collection. LINQ to Objects optimize it into one iteration. Read More on my [blog post](http://mjuraszek.blogspot.com/2013/08/did-you-know-about-that-linq-feature-i.html) – MarcinJuraszek Aug 20 '13 at 09:48
  • @MarcinJuraszek Wow, that's really interesting. I did **not** know that. – It'sNotALie. Aug 20 '13 at 09:50
  • @Marcin: He never said that. He was talking about the extra *enumerator*. And he is correct. There are three enumerators in use: The one from the original collection, the one from `Where` that uses the enumerator from the original collection and finally the one from `Select` which uses the enumerator from `Where`. – Daniel Hilgarth Aug 20 '13 at 09:50
  • @Medinoc They're the same in this example, though. – It'sNotALie. Aug 20 '13 at 09:52
  • 1
    @DanielHilgarth He said, that `Where`+`Select` is *O(n+n+m)*, but it's really *O(n+m)*, because `Where`+`Select` are combined and executed together, with just one iteration over source collection. – MarcinJuraszek Aug 20 '13 at 09:52
  • @Marcin: Ah, that's what you were referring to. You are correct, there he implicitly said that it is iterated multiple times. I must have overlooked that. – Daniel Hilgarth Aug 20 '13 at 09:53
  • @MarcinJuraszek It still doesn't have to do with it, though. See my update using the Sum overloads taking a lambda. – It'sNotALie. Aug 20 '13 at 09:57
  • @It'sNotALie It's not a question of whether the expensiveness of `mc.Value` changes. It's a question of whether it's expensive enough to offset the cost of adding the `Where` (which probably adds a layer of indirection). – Medinoc Aug 20 '13 at 09:58
  • @Medinoc Oh, fair enough. – It'sNotALie. Aug 20 '13 at 09:59
  • @MarcinJuraszek Does this optimization happen because of the deferred execution of LINQ, or am I mixing concepts? – margabit Aug 20 '13 at 09:59
  • @margabit It's not really connected. You can implement deferred execution without that kind of optimization. – MarcinJuraszek Aug 20 '13 at 10:05
  • Look this : http://www.codeproject.com/Articles/38174/How-to-improve-your-LINQ-query-performance-by-5-X – Ramesh Rajendran Aug 20 '13 at 10:06
  • @RameshRajendran That has to do with LINQ to SQL, instead of LINQ to Objects. – It'sNotALie. Aug 20 '13 at 10:07
  • http://codebetter.com/stevehebert/2008/02/06/linq-to-objects-relating-data-structure-organization-to-where-clause-optimization/ – Ramesh Rajendran Aug 20 '13 at 10:12
  • 1
    @RameshRajendran That is not a fair comparison, because a) we're not talking about dictionaries, and b) [LINQ's performance is really competent, if you use it right.](http://stackoverflow.com/questions/4044400/linq-performance-faq) – It'sNotALie. Aug 20 '13 at 10:16
  • 1
    @margabit It is connected though, as it can't really be done without deferred execution. – It'sNotALie. Aug 20 '13 at 10:38
  • 4
    Interesting. Let me just point out that a for loop over an array would be 10x faster than the best LINQ solution. So if you go hunting for perf, don't use LINQ in the first place. – usr Aug 20 '13 at 11:00
  • 2
    Sometimes people ask after doing real research; this is one such question. I'm not a C# user, I came here from the hot-question list. – Grijesh Chauhan Aug 20 '13 at 11:35
  • @usr `for` in this case is only 3-5x faster. Significant, yes, but I'm asking more about the relative performance of the two LINQ options than about the absolute fastest. I was already clear that `for` was the fastest :) – It'sNotALie. Aug 20 '13 at 12:33
  • @It'sNotALie. Did you see my answer? I did some tests and come up with different (more *natural*) results. – MarcinJuraszek Aug 20 '13 at 12:37
  • 1
    I'm not sure why this question has so many upvotes - besides being rather basic, more importantly it's irrelevant. If this is really a hotspot in your application, not using LINQ at all will certainly be faster *(delegates and enumerators are expensive, relative to not having them at least)*. And if this is not a hotspot, then who cares which is faster? – BlueRaja - Danny Pflughoeft Aug 20 '13 at 14:22
  • @BlueRaja-DannyPflughoeft It's not rather basic; if it is how can you write **so much stuff about it**? – It'sNotALie. Aug 20 '13 at 17:40
  • You could write an entire book on elementary-school arithmetic (and it takes four years to teach); that doesn't mean it's not considered 'basic.' The fact that LINQ uses deferred execution is one of the most elementary facts about it. – BlueRaja - Danny Pflughoeft Aug 20 '13 at 18:19
  • @BlueRaja-DannyPflughoeft Sure, but **this doesn't have anything to do with deferred execution**. – It'sNotALie. Aug 20 '13 at 18:20
  • Sure it does; but now I realize why that might not be obvious (sorry!). Since a lot of people (including the top answer) seem to be confused about this, I'll write out an answer. – BlueRaja - Danny Pflughoeft Aug 20 '13 at 18:38
  • @BlueRaja-DannyPflughoeft Please do! – It'sNotALie. Aug 20 '13 at 18:59
  • Which platform are you running this on? I can't get the source code to match up with your behavior. – John Tseng Aug 20 '13 at 21:24
  • @JohnTseng Windows 8, .NET 4.5.1, Release w/ no debugger. What do you get? I'm interested. – It'sNotALie. Aug 20 '13 at 21:27
  • I only looked at the source for 4.0 on typedescriptor. I'm downloading the source for 4.5 now to see if there's a difference. – John Tseng Aug 20 '13 at 21:54
  • @JohnTseng No need. [Decompiled 4.5.1 source for Enumerable.](http://pastebin.com/fbBi1dvY). – It'sNotALie. Aug 20 '13 at 21:59
  • Thanks. The source looks the same. I've updated my answer to explain the differences between the results. – John Tseng Aug 20 '13 at 22:11
  • This is one of those dumb riddles/paradoxes. The answer is hidden as a false assumption in the question. Why do you expect the performance to meet at exactly 50% valid/invalid? – Aleksandr Dubinsky Aug 20 '13 at 23:05
  • @Aleksandr Because n+m+m is equal to n+n at 50%? – It'sNotALie. Aug 20 '13 at 23:40
  • @It'sNotALie. I don't think it's *n+m+m* and *n+n* anymore – MarcinJuraszek Aug 21 '13 at 02:01
  • Try looking at the code it generates. A real branch is not the same as a conditional move. The former will suffer a branch-prediction penalty when the branch is random (50%/50%). The time for the latter will be constant. – WiSaGaN Aug 21 '13 at 03:04
  • 2
    @WiSaGaN That's a good point. However, if this is due to branch vs conditional move, we would expect to see the most dramatic difference at 50%/50%. Here, we see the most dramatic differences at the ends, where branching is most predictable. If the Where is a branch, and the ternary is a conditional move, then we would expect the Where times to come back down when all the elements are valid, but it never comes back down. – John Tseng Aug 21 '13 at 04:53
  • @It'sNotALie. The formalism O() explicitly excludes coefficients. The actual formula (if we assume O(n+m+m) is an accurate model) is T = a*n + b*m + c*m + d. There is no reason to think that the line will cross when m = 0.5*n. – Aleksandr Dubinsky Aug 21 '13 at 08:54
  • @MarcinJuraszek You have a point. But on paper it was that, and that is why I was expecting it. – It'sNotALie. Aug 21 '13 at 08:55
  • @AleksandrDubinsky My reasoning assumed that it met at 50% because my calculations showed that to be the expected. However, those calculations have been proven wrong, and that's what causes it to be off. However, that's where I assumed the "false" assumption. It is false, because my skills at making O(whatever) or T = whatever are crap. – It'sNotALie. Aug 21 '13 at 08:57
  • @It'sNotALie. You did an impressive job of stimulating interest. If you just put the performance graph first and asked why it doesn't cross at 50, you'd get a few crass responses and noone would care. You managed to weave a web of suspense. Are the "% tie tolerance" statistics something standard, or your invention? – Aleksandr Dubinsky Aug 22 '13 at 10:00
  • @AleksandrDubinsky It's not standard, but I thought it would be helpful in figuring it out. I tried to write a question, people liked it a lot, it got upvoted. No extra things, I just wrote a question that I thought was good. – It'sNotALie. Aug 22 '13 at 10:03
  • @It'sNotALie. But actually it was these unusual, misleading statistics, the sensationalism of the title, and the well-written narrative of your text that turned a rather dull and obvious question into something that appeared in the newsletter. I'm not dissing you. Just observing the interesting thing that happened. – Aleksandr Dubinsky Aug 22 '13 at 10:49
  • @AleksandrDubinsky I would disagree that it's obvious... – It'sNotALie. Aug 22 '13 at 10:50
  • @It'sNotALie. If you rephrased the question as, "function A is _sometimes_ faster that function B, but other times function B is faster. Why isn't function A faster _exactly_ 50.00% of the time?," its obviousness would become obvious to you. It's not about theory. Performance/optimization just doesn't work this way, and everyone knows this. A) It's hard to analytically predict the performance of a function. B) Even if you try really hard, you'll never be exact. – Aleksandr Dubinsky Aug 22 '13 at 11:27
  • @AleksandrDubinsky It's nowhere near 50% for something that **should be** 50%. – It'sNotALie. Aug 22 '13 at 11:31
  • @It'sNotALie. Optimization is _effectively_ voodoo science. Observe that none of the answers actually tried answering why, at 50%, S+W is slightly faster and not slightly slower. Doing so would require analyzing the IL assembly, the x86 assembly, tracking CPU counters, running other experiments, and still not being sure. Heck, people tried to _merely_ replicate the experiment and got wildly different results. Optimization is voodoo science. It's really tough and unpredictable. Please accept this as the answer to your question. – Aleksandr Dubinsky Aug 22 '13 at 11:50

8 Answers

130

Select iterates once over the entire set and, for each item, performs a conditional branch (checking for validity) and a + operation.

Where+Select creates an iterator that skips invalid elements (doesn't yield them), performing a + only on the valid items.

So, the cost for a Select is:

t(s) = n * ( cost(check valid) + cost(+) )

And for Where+Select:

t(ws) = n * ( cost(check valid) + p(valid) * (cost(yield) + cost(+)) )

Where:

  • p(valid) is the probability that an item in the list is valid.
  • cost(check valid) is the cost of the branch that checks for validity
  • cost(yield) is the cost of constructing the new state of the where iterator, which is more complex than the simple iterator that the Select version uses.

As you can see, for a given n, the Select version's cost is constant, whereas the Where+Select version's cost is a linear function of p(valid). The actual values of the costs determine the intersection point of the two lines, and since cost(yield) can be different from cost(+), they don't necessarily intersect at p(valid)=0.5.
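To see why, with made-up numbers: suppose cost(check valid) = 1, cost(+) = 1 and cost(yield) = 2 in some arbitrary time unit. Then t(s) = 2n, while t(ws) = n * (1 + 3 * p(valid)), and the two lines cross at p(valid) = 1/3 rather than at 0.5.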

Alex
  • 34
    +1 for being the only answer (so far) that actually addresses the question, doesn't _guess_ the answer and doesn't just generate "me too!" statistics. – Binary Worrier Aug 20 '13 at 13:21
  • 4
    Technically LINQ methods creates expressions trees that are run over the whole collection once instead of "sets". – Spoike Aug 20 '13 at 13:36
  • What's `cost(append)`? Really good answer though, looks at it from a different angle rather than just statistics. – It'sNotALie. Aug 20 '13 at 14:16
  • Re: "The intersection point is shifted right because of the extra cost of the `append` operation": I don't think this sentence makes sense. On the OP's graph, "the intersection point is shifted right" means "where+select performs better for a greater range of values" whereas "the extra cost of the `append` operation" is a reason that where+select performs *worse*. I guess you meant to write, "There's an intersection point at <100% because of the extra cost of the `append` operation." – ruakh Aug 20 '13 at 15:26
  • 5
    `Where` does not create anything; it just returns one element at a time from the `source` sequence, if it satisfies the predicate. – MarcinJuraszek Aug 20 '13 at 16:30
  • 13
    @Spoike - [Expressions trees](http://msdn.microsoft.com/en-us/library/bb397951.aspx) are not relevant here, because this is [linq-to-objects](http://msdn.microsoft.com/en-us/library/bb397919.aspx), not linq-to-something-else (Entity, for example). That's the difference between [`IEnumerable.Select(IEnumerable, Func)`](http://msdn.microsoft.com/en-us/library/bb548891.aspx) and [`IQueryable.Select(IQueryable, Expression)`](http://msdn.microsoft.com/en-us/library/bb534638.aspx). You are right that LINQ does "nothing" until you iterate over the collection, which is probably what you meant. – Kobi Aug 20 '13 at 20:51
  • Well put. Emphasis on the last sentence. OP's fault is simply assuming they'd cross exactly at the center. – Aleksandr Dubinsky Aug 20 '13 at 23:14
33

Here's an in-depth explanation of what's causing the timing-differences.


The Sum() function for IEnumerable<int> looks like this:

public static int Sum(this IEnumerable<int> source)
{
    int sum = 0;
    foreach(int item in source)
    {
        sum += item;
    }
    return sum;
}

In C#, foreach is just syntactic sugar for .Net's version of an iterator, IEnumerator<T> (not to be confused with IEnumerable<T>). So the above code is actually translated to this:

public static int Sum(this IEnumerable<int> source)
{
    int sum = 0;

    IEnumerator<int> iterator = source.GetEnumerator();
    while(iterator.MoveNext())
    {
        int item = iterator.Current;
        sum += item;
    }
    return sum;
}

Remember, the two lines of code you're comparing are the following

int result1 = myCollection.Where(mc => mc.IsValid).Sum(mc => mc.Value);
int result2 = myCollection.Sum(mc => mc.IsValid ? mc.Value : 0);

Now here's the kicker:

LINQ uses deferred execution. Thus, while it may appear that result1 iterates over the collection twice, it actually only iterates over it once. The Where() condition is actually applied during the Sum(), inside of the call to MoveNext() (This is possible thanks to the magic of yield return).

This means that, for result1, the code inside of the while loop,

{
    int item = iterator.Current;
    sum += item;
}

is only executed once for each item with mc.IsValid == true. By comparison, result2 will execute that code for every item in the collection. That is why result1 is generally faster.

(Though, note that calling the Where() condition within MoveNext() still has some small overhead, so if most/all of the items have mc.IsValid == true, result2 will actually be faster!)
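Putting that together, result1 conceptually boils down to something like this (a sketch of the net effect, not the compiler's actual output):

int sum = 0;
foreach (var mc in myCollection)      // single pass; Where's predicate runs inside MoveNext
{
    if (!mc.IsValid) continue;        // invalid items are skipped before any addition
    sum += mc.Value;                  // the += only runs for valid items
}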


Hopefully now it's clear why result2 is usually slower. Now I'd like to explain why I stated in the comments that these LINQ performance comparisons don't matter.

Creating a LINQ expression is cheap. Calling delegate functions is cheap. Allocating and looping over an iterator is cheap. But it's even cheaper to not do these things. Thus, if you find that a LINQ statement is the bottleneck in your program, in my experience rewriting it without LINQ will always make it faster than any of the various LINQ methods.

So, your LINQ workflow should look like this:

  1. Use LINQ everywhere.
  2. Profile.
  3. If the profiler says LINQ is the cause of a bottleneck, rewrite that piece of code without LINQ.

Fortunately, LINQ bottlenecks are rare. Heck, bottlenecks are rare. I've written hundreds of LINQ statements in the last few years, and have ended up replacing <1%. And most of those were due to LINQ2EF's poor SQL optimization, rather than being LINQ's fault.

So, like always, write clear and sensible code first, and wait until after you've profiled to worry about micro-optimizations.

BlueRaja - Danny Pflughoeft
16

Funny thing. Do you know how Sum(this IEnumerable<TSource> source, Func<TSource, int> selector) is defined? It uses the Select method!

public static int Sum<TSource>(this IEnumerable<TSource> source, Func<TSource, int> selector)
{
    return source.Select(selector).Sum();
}

So actually, it should all work nearly the same. I did some quick research of my own, and here are the results:

Where -- mod: 1 result: 0, time: 371 ms
WhereSelect -- mod: 1  result: 0, time: 356 ms
Select -- mod: 1  result 0, time: 366 ms
Sum -- mod: 1  result: 0, time: 363 ms
-------------
Where -- mod: 2 result: 4999999, time: 469 ms
WhereSelect -- mod: 2  result: 4999999, time: 429 ms
Select -- mod: 2  result 4999999, time: 362 ms
Sum -- mod: 2  result: 4999999, time: 358 ms
-------------
Where -- mod: 3 result: 9999999, time: 441 ms
WhereSelect -- mod: 3  result: 9999999, time: 452 ms
Select -- mod: 3  result 9999999, time: 371 ms
Sum -- mod: 3  result: 9999999, time: 380 ms
-------------
Where -- mod: 4 result: 7500000, time: 571 ms
WhereSelect -- mod: 4  result: 7500000, time: 501 ms
Select -- mod: 4  result 7500000, time: 406 ms
Sum -- mod: 4  result: 7500000, time: 397 ms
-------------
Where -- mod: 5 result: 7999999, time: 490 ms
WhereSelect -- mod: 5  result: 7999999, time: 477 ms
Select -- mod: 5  result 7999999, time: 397 ms
Sum -- mod: 5  result: 7999999, time: 394 ms
-------------
Where -- mod: 6 result: 9999999, time: 488 ms
WhereSelect -- mod: 6  result: 9999999, time: 480 ms
Select -- mod: 6  result 9999999, time: 391 ms
Sum -- mod: 6  result: 9999999, time: 387 ms
-------------
Where -- mod: 7 result: 8571428, time: 489 ms
WhereSelect -- mod: 7  result: 8571428, time: 486 ms
Select -- mod: 7  result 8571428, time: 384 ms
Sum -- mod: 7  result: 8571428, time: 381 ms
-------------
Where -- mod: 8 result: 8749999, time: 494 ms
WhereSelect -- mod: 8  result: 8749999, time: 488 ms
Select -- mod: 8  result 8749999, time: 386 ms
Sum -- mod: 8  result: 8749999, time: 373 ms
-------------
Where -- mod: 9 result: 9999999, time: 497 ms
WhereSelect -- mod: 9  result: 9999999, time: 494 ms
Select -- mod: 9  result 9999999, time: 386 ms
Sum -- mod: 9  result: 9999999, time: 371 ms

For following implementations:

result = source.Where(x => x.IsValid).Sum(x => x.Value);
result = source.Select(x => x.IsValid ? x.Value : 0).Sum();
result = source.Sum(x => x.IsValid ? x.Value : 0);
result = source.Where(x => x.IsValid).Select(x => x.Value).Sum();

mod means that one out of every mod items is invalid: for mod == 1 every item is invalid, for mod == 2 odd items are invalid, etc. The collection contains 10,000,000 items.
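For reference, the test data can be generated along these lines (illustrative only; the exact generator isn't shown above, and the Value assignment here is arbitrary):

// Hypothetical reconstruction of the test data -- the original generator wasn't posted.
// One out of every `mod` items is marked invalid, as described above.
int mod = 4;                                     // varied from 1 to 9 in the tables
List<MyClass> source = Enumerable.Range(0, 10000000)
    .Select(i => new MyClass { Value = i % 3, IsValid = i % mod != 0 })
    .ToList();

int result = source.Where(x => x.IsValid).Sum(x => x.Value);   // one of the four variants tested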

(Chart: timings for each implementation across mod values, 10,000,000 items.)

And results for collection with 100000000 items:

(Chart: the same comparison for 100,000,000 items.)

As you can see, the Select and Sum results are quite consistent across all mod values. However, the Where and Where+Select results are not.

MarcinJuraszek
  • 1
    It's very interesting that in your results, all methods starts at the same place and diverge, whereas the It'sNotALie's results cross in the middle. – John Tseng Aug 20 '13 at 20:41
6

My guess is that the version with Where filters out the items that would contribute 0, so they never reach Sum (i.e. the addition is not executed for them). This is of course a guess, since I cannot explain how executing an additional lambda and calling extra methods outperforms a simple addition of a 0.

A friend of mine suggested that adding the 0s in the sum may cause a severe performance penalty because of overflow checks. It would be interesting to see how this would perform in an unchecked context.
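One way to actually test this would be to compare hand-rolled loops, since just wrapping the LINQ call in an unchecked block wouldn't change anything (Enumerable.Sum does its addition inside its own checked context). A sketch:

// Sketch of a micro-test for the overflow-check theory. Enumerable.Sum uses checked
// arithmetic internally regardless of the caller's checked/unchecked context, so the
// comparison has to be done with hand-written loops.
int sumChecked = 0;
foreach (var mc in myCollection)
    checked { sumChecked += mc.IsValid ? mc.Value : 0; }

int sumUnchecked = 0;
foreach (var mc in myCollection)
    unchecked { sumUnchecked += mc.IsValid ? mc.Value : 0; }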

Stilgar
5

Running the following sample, it becomes clear to me that the only time Where+Select can outperform Select is when it discards a good amount (roughly half, in my informal tests) of the potential items in the list. In the small example below, I get roughly the same numbers out of both samples when the Where skips about 4 million items out of 10 million. I ran in release, and reordering the execution of Where+Select vs Select gave the same results.

static void Main(string[] args)
{
    int total = 10000000;
    Random r = new Random();
    var list = Enumerable.Range(0, total).Select(i => r.Next(0, 5)).ToList();
    for (int i = 0; i < 4000000; i++)
        list[i] = 10;

    var sw = new Stopwatch();
    sw.Start();

    int sum = 0;

    sum = list.Where(i => i < 10).Select(i => i).Sum();

    sw.Stop();
    Console.WriteLine(sw.ElapsedMilliseconds);

    sw.Reset();
    sw.Start();
    sum = list.Select(i => i).Sum();

    sw.Stop();

    Console.WriteLine(sw.ElapsedMilliseconds);
}
DavidN
  • Mightn't that be because you don't discard the under-tens in the `Select`? – It'sNotALie. Aug 20 '13 at 10:41
  • 3
    Running in debug is useless. – MarcinJuraszek Aug 20 '13 at 10:43
  • 1
    @MarcinJuraszek Obviously. Really meant to say that I ran in release :) – DavidN Aug 20 '13 at 10:46
  • @It'sNotALie That's the point. It seems to me that the only way Where+Select can outperform Select is when Where is filtering out a large amount of the items being summed. – DavidN Aug 20 '13 at 10:47
  • 2
    That's basically what my question's stating. They tie at about 60%, like this sample does. The question is why, which isn't answered here. – It'sNotALie. Aug 20 '13 at 10:57
  • @DavidN Did you run in release in the IDE or by running the executable? – DGibbs Aug 20 '13 at 11:06
  • @DGibbs Ctrl+F5 runs the executable without attaching the debugger. – DavidN Aug 20 '13 at 11:18
  • @It'sNotALie it's obvious that sum is indeed the culprit, once enough elements have been filtered out. Does it not make sense to you that at some point, discarding enough elements with a simple comparison will be cheaper than selecting and summing them up? – DavidN Aug 20 '13 at 11:23
  • I understand your point now. +1, but I'm hoping I can find a fuller explanation. – It'sNotALie. Aug 20 '13 at 12:18
  • @It'sNotALie I'm not sure what's counter-intuitive here. Where+Select IS more expensive than Select, as you would logically expect. Having Where discard enough elements so that the cost of the extra delegate invocation and comparison logic is offset by the sheer amount of leftover items to sum is just a basic tradeoff – DavidN Aug 20 '13 at 13:22
4

If you need speed, a straightforward loop is probably your best bet, and for tends to be better than foreach (assuming your collection is random-access, of course).

Here are the timings I got with 10% of elements being invalid:

Where + Select + Sum:   257
Select + Sum:           253
foreach:                111
for:                    61

And with 90% invalid elements:

Where + Select + Sum:   177
Select + Sum:           247
foreach:                105
for:                    58

And here is my benchmark code...

public class MyClass {
    public int Value { get; set; }
    public bool IsValid { get; set; }
}

class Program {

    static void Main(string[] args) {

        const int count = 10000000;
        const int percentageInvalid = 90;

        var rnd = new Random();
        var myCollection = new List<MyClass>(count);
        for (int i = 0; i < count; ++i) {
            myCollection.Add(
                new MyClass {
                    Value = rnd.Next(0, 50),
                    IsValid = rnd.Next(0, 100) > percentageInvalid
                }
            );
        }

        var sw = new Stopwatch();
        sw.Restart();
        int result1 = myCollection.Where(mc => mc.IsValid).Select(mc => mc.Value).Sum();
        sw.Stop();
        Console.WriteLine("Where + Select + Sum:\t{0}", sw.ElapsedMilliseconds);

        sw.Restart();
        int result2 = myCollection.Select(mc => mc.IsValid ? mc.Value : 0).Sum();
        sw.Stop();
        Console.WriteLine("Select + Sum:\t\t{0}", sw.ElapsedMilliseconds);
        Debug.Assert(result1 == result2);

        sw.Restart();
        int result3 = 0;
        foreach (var mc in myCollection) {
            if (mc.IsValid)
                result3 += mc.Value;
        }
        sw.Stop();
        Console.WriteLine("foreach:\t\t{0}", sw.ElapsedMilliseconds);
        Debug.Assert(result1 == result3);

        sw.Restart();
        int result4 = 0;
        for (int i = 0; i < myCollection.Count; ++i) {
            var mc = myCollection[i];
            if (mc.IsValid)
                result4 += mc.Value;
        }
        sw.Stop();
        Console.WriteLine("for:\t\t\t{0}", sw.ElapsedMilliseconds);
        Debug.Assert(result1 == result4);

    }

}

BTW, I concur with Stilgar's guess: the relative speeds of your two cases vary depending on the percentage of invalid items, simply because the amount of work Sum needs to do varies in the "Where" case.

Branko Dimitrijevic
1

Rather than try to explain via description, I'm going to take a more mathematical approach.

Given the code below, which should approximate what LINQ is doing internally, the relative costs are as follows (where N is the total number of items, M the number of valid items, d the cost of a delegate invocation, and a the cost of an addition):

Select only: Nd + Na
Where+Select: Nd + Md + Ma

To figure out the point where they will cross, we need to do a little algebra:
Nd + Md + Ma = Nd + Na => M(d + a) = Na => (M/N) = a/(d+a)

What this means is that in order for the inflection point to be at 50%, the cost of a delegate invocation must be roughly the same as the cost of an addition. Since we know that the actual inflection point was at about 60%, we can work backwards and determine that the cost of a delegate invocation for @It'sNotALie was actually about 2/3 the cost of an addition which is surprising, but that's what his numbers say.

static void Main(string[] args)
{
    var set = Enumerable.Range(1, 10000000)
                        .Select(i => new MyClass {Value = i, IsValid = i%2 == 0})
                        .ToList();

    Func<MyClass, int> select = i => i.IsValid ? i.Value : 0;
    Console.WriteLine(
        Sum(                        // Cost: N additions
            Select(set, select)));  // Cost: N delegate
    // Total cost: N * (delegate + addition) = Nd + Na

    Func<MyClass, bool> where = i => i.IsValid;
    Func<MyClass, int> wSelect = i => i.Value;
    Console.WriteLine(
        Sum(                        // Cost: M additions
            Select(                 // Cost: M delegate
                Where(set, where),  // Cost: N delegate
                wSelect)));
    // Total cost: N * delegate + M * (delegate + addition) = Nd + Md + Ma
}

// Cost: N delegate calls
static IEnumerable<T> Where<T>(IEnumerable<T> set, Func<T, bool> predicate)
{
    foreach (var mc in set)
    {
        if (predicate(mc))
        {
            yield return mc;
        }
    }
}

// Cost: N delegate calls
static IEnumerable<int> Select<T>(IEnumerable<T> set, Func<T, int> selector)
{
    foreach (var mc in set)
    {
        yield return selector(mc);
    }
}

// Cost: N additions
static int Sum(IEnumerable<int> set)
{
    unchecked
    {
        var sum = 0;
        foreach (var i in set)
        {
            sum += i;
        }

        return sum;
    }
}
Jon Norton
0

I think it's interesting that MarcinJuraszek's result is different from It'sNotALie's. In particular, MarcinJuraszek's results start out with all four implementations at the same place, while It'sNotALie's results cross over around the middle. I will explain how this works from the source.

Let us assume that there are n total elements, and m valid elements.

The Sum function is pretty simple. It just loops through the enumerator: http://typedescriptor.net/browse/members/367300-System.Linq.Enumerable.Sum(IEnumerable%601)

For simplicity, let's assume the collection is a list. Both Select and WhereSelect will create a WhereSelectListIterator. This means that the actual iterators generated are the same. In both cases, there is a Sum that loops over an iterator, the WhereSelectListIterator. The most interesting part of the iterator is the MoveNext method.

Since the iterators are the same, the loops are the same. The only difference is in the body of the loops.
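Paraphrasing the reference source (simplified; state handling and disposal omitted), MoveNext looks roughly like this:

// Simplified paraphrase of WhereSelectListIterator<TSource, TResult>.MoveNext().
// (enumerator, predicate, selector and current are fields of the iterator object.)
// For a plain Select the predicate is null, so every item takes the selector path;
// for Where+Select the predicate is the IsValid check and invalid items are skipped.
public bool MoveNext()
{
    while (enumerator.MoveNext())
    {
        TSource item = enumerator.Current;
        if (predicate == null || predicate(item))   // Where's branch lives here
        {
            current = selector(item);               // Select's projection
            return true;
        }
    }
    return false;
}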

The bodies of these lambdas have very similar costs. The where clause returns a field value, and the ternary's predicate also returns a field value. The select clause returns a field value, and the two branches of the ternary operator return either a field value or a constant. The combined select clause has the branch as a ternary operator, whereas the WhereSelect version has the branch in MoveNext.

However, all of these operations are fairly cheap. The most expensive operation so far is the branch, where a wrong prediction will cost us.

Another expensive operation here is the Invoke. Invoking a function takes quite a bit longer than adding a value, as Branko Dimitrijevic has shown.

Also weighing in is the checked accumulation in Sum. If the processor does not have an arithmetic overflow flag, then this could be costly to check as well.

Hence, the interesting costs are:

  1. (n + m) * Invoke + m * checked+= (the Where + Select version)
  2. n * Invoke + n * checked+= (the ternary Select version)

Thus, if the cost of Invoke is much higher than the cost of checked accumulation, then case 2 is always better. If they're about even, then we will see a balance when about half of the elements are valid.

It looks like on MarcinJuraszek's system, checked+= has negligible cost, but on It'sNotALie's and Branko Dimitrijevic's systems, checked+= has significant costs. It looks like it's the most expensive on It'sNotALie's system since the break even point is much higher. It doesn't look like anyone has posted results from a system where the accumulation costs much more than the Invoke.

John Tseng
  • @It'sNotALie. I don't think anyone has a wrong result. I just couldn't explain some things. I had assumed that the cost of Invoke is much higher than that of +=, but it's conceivable that they could be much closer depending on the hardware optimizations. – John Tseng Aug 21 '13 at 15:21