I am trying to find the fastest way to apply a simple calculation to the values of an IEnumerable<IPoint>.

IPoint is an interface with two properties, decimal Quantity and decimal Price. I also have a Point class that implements the interface:

public class Point : IPoint
{
    public decimal Quantity { get; set; }
    public decimal Price { get; set; }
    public Point(decimal q, decimal p)
    {
        Quantity = q;
        Price = p;
    }
}

I want to transform my IEnumerable<IPoint> so that the result is an IEnumerable<IPoint> where quantity = price * quantity and price = 1 / price.

I compared the results of using IEnumerable.Select and a simple foreach iteration over the elements of the IEnumerable.

I randomly generated an IEnumerable<IPoint> named Initial and transformed it with the following:

IEnumerable<IPoint> WithSelect = new List<Point>();
IEnumerable<IPoint> WithForeach = new List<Point>();

//With Select
var watch = System.Diagnostics.Stopwatch.StartNew();
WithSelect = Initial.Select(p => new Point(p.Price * p.Quantity, 1 / p.Price));
watch.Stop();
var elapsed1Ms = watch.ElapsedMilliseconds;
var ListTmp = new List<Point>();

//With foreach
watch.Restart();
foreach(var p in Initial)
{
    ListTmp.Add(new Point(p.Price * p.Quantity, 1 / p.Price));
}
WithForeach = ListTmp;
watch.Stop();
var elapsed2Ms = watch.ElapsedMilliseconds;

The timings are:

  • foreach: ~460 ms for 1M points and ~3800 ms for 10M points
  • Select: 0 ms for both 1M and 10M points

If I call WithSelect = WithSelect.ToList() to get a more "usable" type (i.e. going from System.Linq.Enumerable.SelectEnumerableIterator<IPoint, Point> to System.Collections.Generic.List<IPoint>), the execution time goes up to ~3900 ms (i.e. slightly more than the foreach).

I would say great, let's go for the Select without transforming the result, but I wonder whether the execution time (0 ms, no matter the size of the enumerable) is hiding something. Does it mean that the actual calculation is only made when I access the values? Would it also mean that if I access the values several times, I'll start stacking up the calculation time? Or is it just a very good LINQ optimization that I should not worry about?

In case the Select is hiding some extra calculation time that shows up later, is there an alternative to the two approaches I tried that I should look at?

Theodor Zoulias
samuel guedon
  • Try adding `.ToList()` after the `.Select` operator, so that both versions are doing equal work. Also change the `WithSelect` and `WithForeach` variables to `IList` instead of `IEnumerable`, to make sure that you get materialized collections, and not deferred enumerables. – Theodor Zoulias Aug 15 '21 at 15:54
  • @TheodorZoulias my aim is not that they do equal work. My aim is to find the fastest usable way of transforming my IEnumerable and returning an IEnumerable. The result of the Select is an IEnumerable, so it's fine for me (for instance I can do a `foreach(var p in WithSelect)` and p will be a `Point`). Hence my question is whether using the Select the way I am doing now hides something. – samuel guedon Aug 15 '21 at 16:06
  • Yeap, it hides the concept of [deferred execution](https://stackoverflow.com/questions/7324033/what-are-the-benefits-of-a-deferred-execution-in-linq). Similar question: [Why can LINQ operations be faster than a normal loop?](https://stackoverflow.com/questions/3756535/why-can-linq-operations-be-faster-than-a-normal-loop) I've seen many questions similar to yours in the past, and the above is probably not the most informative and well presented, but I couldn't find a better one right now. – Theodor Zoulias Aug 15 '21 at 16:10
  • @TheodorZoulias I think I get what you mean. The Select only stores that something should happen but does not actually calculate it. Hence it takes 0 ms. Only when I iterate over my lists or call ToList() will the calculation actually happen. Put like that, it does not seem like a good idea to keep the Select as such. To complete my understanding: if I keep it like that and then iterate over WithSelect, does the calculation happen every time I run an iteration, or once and for all? – samuel guedon Aug 15 '21 at 16:11
  • I think this helps https://stackoverflow.com/questions/1168944/how-to-tell-if-an-ienumerablet-is-subject-to-deferred-execution – samuel guedon Aug 15 '21 at 16:11
  • Samuel yeap, the query is re-evaluated every time the `IEnumerable` is enumerated. – Theodor Zoulias Aug 15 '21 at 16:14
  • If you want to avoid evaluating the enumerable immediately by using the `ToList` or `ToArray` operators, there is a [`Memoize`](https://github.com/dotnet/reactive/blob/main/Ix.NET/Source/System.Interactive/System/Linq/Operators/Memoize.cs) operator available in the System.Interactive package. This is a thread-safe implementation, so you pay the price of thread-synchronization on every iteration. See also this question: [Is there an IEnumerable implementation that only iterates over it's source (e.g. LINQ) once?](https://stackoverflow.com/questions/12427097) – Theodor Zoulias Aug 15 '21 at 22:00
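
To make the deferred execution discussed in these comments concrete, here is a minimal sketch that counts how often the projection actually runs. The `DeferredExecutionDemo` class and the `CountProjectionCalls` method are hypothetical names made up for this illustration, and the three sample points are arbitrary; only `IPoint`, `Point`, `Select` and `ToList` come from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public interface IPoint
{
    decimal Quantity { get; set; }
    decimal Price { get; set; }
}

public class Point : IPoint
{
    public decimal Quantity { get; set; }
    public decimal Price { get; set; }
    public Point(decimal q, decimal p) { Quantity = q; Price = p; }
}

// Hypothetical helper, for illustration only.
public static class DeferredExecutionDemo
{
    // Returns the number of projection calls observed after each step.
    public static int[] CountProjectionCalls()
    {
        // Arbitrary sample data (non-zero prices, so 1 / Price is defined).
        var initial = new List<IPoint>
        {
            new Point(10m, 2m), new Point(5m, 4m), new Point(1m, 8m)
        };
        int calls = 0;

        // Building the query only stores the lambda; nothing is computed yet.
        IEnumerable<IPoint> withSelect = initial.Select(p =>
        {
            calls++;
            return (IPoint)new Point(p.Price * p.Quantity, 1 / p.Price);
        });
        int afterSelect = calls;

        // Each enumeration of the query re-runs the projection for every element.
        foreach (var item in withSelect) { }
        int afterFirstPass = calls;
        foreach (var item in withSelect) { }
        int afterSecondPass = calls;

        // ToList runs the projection once more and stores the results;
        // enumerating the list afterwards does not call the lambda again.
        List<IPoint> materialized = withSelect.ToList();
        foreach (var item in materialized) { }
        foreach (var item in materialized) { }
        int afterListPasses = calls;

        return new[] { afterSelect, afterFirstPass, afterSecondPass, afterListPasses };
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", CountProjectionCalls())); // prints "0, 3, 6, 9"
    }
}
```

So the 0 ms measured around `Select` is only the cost of creating the query object. If the result is enumerated more than once, materializing it once with `ToList()` or `ToArray()` avoids paying for the projection on every pass.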
