1

I have the following code:

bool b = myList
    .All(x => x.MyList
        .Where(y => y.MyBool)
        .All(y => y.MyList
            .All(z => z.MyBool)))

Is this functionally equivalent to:

bool b = myList
    .SelectMany(x => x.MyList)
    .Where(x => x.MyBool)
    .SelectMany(x => x.MyList)
    .All(x => x.MyBool)

I think it is, but my colleague has challenged me that this change may be functionally different in certain circumstances (e.g., if any of the collections are empty for instance).

Although the answer is either yes or no, any opinions on this would also be appreciated as to which is better in terms of readability, cyclomatic complexity, time complexity, and performance.

UPDATE:

So, I profiled the code using the following:

static void Main(string[] args)
{
    var myList = new List<A>();

    for (var j = 0; j < 1000; j++)
    {
        var a = new A();

        for (var k = 0; k < 1000; k++)
        {
            var b = new B {MyBool = true};

            for (var l = 0; l < 1000; l++)
            {
                var c = new C {MyBool = true};
                b.MyList.Add(c);
            }

            a.MyList.Add(b);
        }

        myList.Add(a);
    }

    for (var x = 0; x < 10000; x++)
    {
        bool b1 = Foo(myList);
    }

    for (var x = 0; x < 10000; x++)
    {
        bool b2 = Bar(myList);
    }
}

private static bool Foo(List<A> myList)
{
    return myList
        .All(x => x.MyList
            .Where(y => y.MyBool)
            .All(y => y.MyList
                .All(z => z.MyBool)));
}

private static bool Bar(List<A> myList)
{
    return myList
        .SelectMany(x => x.MyList)
        .Where(x => x.MyBool)
        .SelectMany(x => x.MyList)
        .All(x => x.MyBool);
}

private class A
{
    public List<B> MyList => new List<B>();
}

private class B
{
    public bool MyBool { get; set; }

    public List<C> MyList => new List<C>();
}

private class C
{
    public bool MyBool { get; set; }
}

What I found was that the second method (Bar) using .SelectMany and .Where was almost 80% faster than the first method (Foo) using nested .All calls. But this was only provable on a very large dataset and the actual time taken was very small. This might matter more on smaller datasets if each element invokes a query (e.g., to a database) that takes a longer time, if indeed the difference in performance is due to number of times elements are read. But if the difference is due to overhead in between reading elements, and elements are read the same number of times for either method, then I guess the difference in performance will always be negligible regardless of dataset size or element read-time.

Results below (from Visual Studio Performance Profiler): enter image description here

Neo
  • 4,145
  • 6
  • 53
  • 76
  • 1
    Since `All` returns `true` for empty sets, they are equivalent. What about which is better, it's either opinion based or implementation specific. – Ivan Stoev Nov 30 '16 at 11:15

1 Answers1

4
myList.All // is it true for all elements in myList that…
(x => x.MyList //in their MyList property
.Where(y => y.MyBool) // those elements that have MyBool returning true
.All( // have it true for all elements in that list that…
y => y.MyList //in their MyList property
.All(z => z.MyBool) // all elements have MyBool returning true


myList.SelectMany( // for all the elements in myList
x => x.MyList)  // for all elements in their MyList property…
.Where(x => x.MyBool) // that have MyBool returning true
.SelectMany( // for all those elements
x => x.MyList) // for all elements in their MyList property
.All(x => x.MyBool) // is it true that all elements have MyBool returning true

So yes, they have the same meaning. In particular with either case an empty list at any stage means true from the All() method, whether true coming from emptiness is passed up to a calling All() or the emptiness passed into a final All().

Readability is a more subjective matter, since the very question asked involves a few step of if-this-then-that which is just inherently cumbersome and so will lead to a cumbersome expression. I favour the first, but not heavily or dogmatically.

The time complexity is the same. The performance is likely to be much of a muchness. The internals of the second would at first glance seem to chain enumerators more which could make it slightly slower, but I wouldn't bet a lot of money on that; if I cared heavily about performance here I'd definitely profile both.

Jon Hanna
  • 110,372
  • 10
  • 146
  • 251
  • Thanks. In terms of readability, I've just made a slight change to the code by adding carriage returns and indentation to make it more realistic. Now, what do you think? – Neo Nov 30 '16 at 11:19
  • 1
    It certainly helps break down the steps in each, but does so about evenly in each, so with them both being improved equally I'm still very slightly favouring the first, but only slightly and not expecting that everyone would agree on that. – Jon Hanna Nov 30 '16 at 11:22
  • I just profiled the two methods and I pasted the results in the Question as an update. It seems the second method performs better. But I think it's so negligible regardless of dataset type or size that it's not worth considering in reality. But I could be wrong. – Neo Nov 30 '16 at 13:22
  • 2
    You're probably right, it's slightly faster, but not so much that makes much difference. Did you profile desktop framework or dotnet-core? Since the latter has received more optimisation attention, I'd be interested to learn if they're different, especially since I did some of that work myself, though I don't think I did anything with either All or SelectMany. – Jon Hanna Nov 30 '16 at 20:55
  • Just tried this, but unfortunately, all the Performance Profiler tools except for 'GPU Usage' are disabled in the .NET Core app. Same problem as [here](http://stackoverflow.com/q/36908346/197591). – Neo Dec 01 '16 at 09:59