I have the following code:
    bool b = myList
        .All(x => x.MyList
            .Where(y => y.MyBool)
            .All(y => y.MyList
                .All(z => z.MyBool)));
Is this functionally equivalent to:
    bool b = myList
        .SelectMany(x => x.MyList)
        .Where(x => x.MyBool)
        .SelectMany(x => x.MyList)
        .All(x => x.MyBool);
I think it is, but my colleague has challenged me that the two versions may behave differently in certain circumstances (e.g., if any of the collections are empty).
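To make the empty-collection concern concrete, here is a minimal sketch of how the two versions could be compared on a few edge cases. The types A, B and C are simplified stand-ins for my real classes, and the names EquivalenceCheck, Nested, Flattened and Build are hypothetical helpers introduced only for this check. It assumes none of the collections is null.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Simplified stand-ins for the real types; only the shape matters here.
class C { public bool MyBool { get; set; } }
class B { public bool MyBool { get; set; } public List<C> MyList { get; } = new List<C>(); }
class A { public List<B> MyList { get; } = new List<B>(); }

static class EquivalenceCheck
{
    // The original nested-All version.
    public static bool Nested(List<A> myList) =>
        myList.All(x => x.MyList
            .Where(y => y.MyBool)
            .All(y => y.MyList.All(z => z.MyBool)));

    // The proposed SelectMany version.
    public static bool Flattened(List<A> myList) =>
        myList
            .SelectMany(x => x.MyList)
            .Where(y => y.MyBool)
            .SelectMany(y => y.MyList)
            .All(z => z.MyBool);

    // Hypothetical helper: one A holding one B (with bFlag) holding one C per cFlag.
    public static List<A> Build(bool bFlag, params bool[] cFlags)
    {
        var b = new B { MyBool = bFlag };
        foreach (var flag in cFlags) b.MyList.Add(new C { MyBool = flag });
        var a = new A();
        a.MyList.Add(b);
        return new List<A> { a };
    }

    public static void Main()
    {
        var cases = new Dictionary<string, List<A>>
        {
            ["empty outer list"] = new List<A>(),
            ["A with no Bs"] = new List<A> { new A() },
            ["B true, no Cs"] = Build(true),
            ["B true, all Cs true"] = Build(true, true, true),
            ["B true, one C false"] = Build(true, true, false),
            ["B false, one C false"] = Build(false, false),
        };

        // Print both results side by side so any disagreement is easy to spot.
        foreach (var pair in cases)
        {
            Console.WriteLine(
                $"{pair.Key}: nested={Nested(pair.Value)}, flattened={Flattened(pair.Value)}");
        }
    }
}
```

Since Enumerable.All returns true for an empty sequence, I would expect both versions to agree on each of these cases, but the list of cases is not exhaustive and does not cover null collections or properties with side effects.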
Although the answer is either yes or no, I would also appreciate opinions on which version is better in terms of readability, cyclomatic complexity, time complexity, and performance.
UPDATE:
So, I profiled the code using the following:
    static void Main(string[] args)
    {
        var myList = new List<A>();
        for (var j = 0; j < 1000; j++)
        {
            var a = new A();
            for (var k = 0; k < 1000; k++)
            {
                var b = new B {MyBool = true};
                for (var l = 0; l < 1000; l++)
                {
                    var c = new C {MyBool = true};
                    b.MyList.Add(c);
                }
                a.MyList.Add(b);
            }
            myList.Add(a);
        }
        for (var x = 0; x < 10000; x++)
        {
            bool b1 = Foo(myList);
        }
        for (var x = 0; x < 10000; x++)
        {
            bool b2 = Bar(myList);
        }
    }

    private static bool Foo(List<A> myList)
    {
        return myList
            .All(x => x.MyList
                .Where(y => y.MyBool)
                .All(y => y.MyList
                    .All(z => z.MyBool)));
    }

    private static bool Bar(List<A> myList)
    {
        return myList
            .SelectMany(x => x.MyList)
            .Where(x => x.MyBool)
            .SelectMany(x => x.MyList)
            .All(x => x.MyBool);
    }
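    private class A
    {
        // Initialized once. Writing "=> new List<B>()" would return a fresh
        // empty list on every access, so the Add calls above would be lost.
        public List<B> MyList { get; } = new List<B>();
    }

    private class B
    {
        public bool MyBool { get; set; }
        public List<C> MyList { get; } = new List<C>();
    }

    private class C
    {
        public bool MyBool { get; set; }
    }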
What I found was that the second method (Bar), using .SelectMany and .Where, was almost 80% faster than the first method (Foo), using nested .All calls. However, this was only measurable on a very large dataset, and the actual time taken was very small.

If the difference comes from the number of times elements are read, it might matter more on smaller datasets where each element read is expensive (e.g., each read invokes a database query). But if the difference comes from overhead between element reads, and both methods read each element the same number of times, then I guess the difference in performance will always be negligible regardless of dataset size or element read time.
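One way to tell those two explanations apart would be to count how often each MyBool getter is actually read by each query. The following is a rough sketch of that idea, not what I profiled: the Counted* types, ReadCounter and ReadCountDemo are hypothetical names made up for this experiment, and the dataset is deliberately small.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class ReadCounter
{
    // Number of MyBool getter calls since the last reset (single-threaded use only).
    public static int Reads;
}

class CountedC
{
    private readonly bool _value;
    public CountedC(bool value) { _value = value; }
    public bool MyBool { get { ReadCounter.Reads++; return _value; } }
}

class CountedB
{
    private readonly bool _value;
    public CountedB(bool value) { _value = value; }
    public bool MyBool { get { ReadCounter.Reads++; return _value; } }
    public List<CountedC> MyList { get; } = new List<CountedC>();
}

class CountedA
{
    public List<CountedB> MyList { get; } = new List<CountedB>();
}

static class ReadCountDemo
{
    // Builds size As, each with size Bs, each with size Cs, all flags true.
    public static List<CountedA> BuildData(int size)
    {
        var data = new List<CountedA>();
        for (var i = 0; i < size; i++)
        {
            var a = new CountedA();
            for (var j = 0; j < size; j++)
            {
                var b = new CountedB(true);
                for (var k = 0; k < size; k++) b.MyList.Add(new CountedC(true));
                a.MyList.Add(b);
            }
            data.Add(a);
        }
        return data;
    }

    // Resets the counter, runs the query, and returns how many reads it caused.
    private static int CountReads(Func<bool> query)
    {
        ReadCounter.Reads = 0;
        query();
        return ReadCounter.Reads;
    }

    public static int NestedReads(List<CountedA> data) =>
        CountReads(() => data.All(x => x.MyList
            .Where(y => y.MyBool)
            .All(y => y.MyList.All(z => z.MyBool))));

    public static int FlattenedReads(List<CountedA> data) =>
        CountReads(() => data
            .SelectMany(x => x.MyList)
            .Where(y => y.MyBool)
            .SelectMany(y => y.MyList)
            .All(z => z.MyBool));

    public static void Main()
    {
        var data = BuildData(10);
        Console.WriteLine($"nested reads:    {NestedReads(data)}");
        Console.WriteLine($"flattened reads: {FlattenedReads(data)}");
    }
}
```

If both counts come out the same, that would suggest the timing difference is down to iterator and delegate overhead rather than the number of element reads, in which case I would not expect it to grow when individual reads become more expensive.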