Clearly, you understand why it's better to avoid using AsEnumerable()
the way you do in your question.
Also, some of the other answers have made it very clear why calling AsEnumerable()
changes the way the query is performed and read. In short, it's because you are then invoking IEnumrable<T>
extension methods rather than the IQueryable<T>
extension methods, the latter allowing you to combine predicates before executing the query in the database.
However, I still feel that this doesn't answer your actual question, which is a legitimate question. You said (emphasis mine):
What I mean is that this code:
context.Logs.AsEnumerable().Where(x => x.Id % 2 == 0).Take(100).ToList();
will download all the rows from the table before passing any row to the Where()
method and there could be millions of rows in the table.
My question to you is: what made you conclude that this is true?
I would argue that, because you are using IEnumrable<T>
instead of IQueryable<T>
, it's true that the query being performed in the database will be a simple:
select * from logs
... without any predicates, unlike what would have happened if you had used IQueryable<T>
to invoke Where
and Take
.
However, the AsEnumerable()
method call does not fetch all the rows at that moment, as other answers have implied. In fact, this is the implementation of the AsEnumerable()
call:
public static IEnumerable<TSource> AsEnumerable<TSource>(this IEnumerable<TSource> source)
{
return source;
}
There is no fetching going on there. In fact, even the calls to IEnumerable<T>.Where()
and IEnumerable<T>.Take()
don't actually start fetching any rows at that moment. They simply setup wrapping IEnumerable
s that will filter results as they are iterated on. The fetching and iterating of the results really only begins when ToList()
is called.
So when you say:
Couldn't EF do on demand loading of rows like you can with plain ADO.NET using Read()
method of SqlDataReader
and save time and bandwidth?
... again, my question to you would be: doesn't it do that already?
If your table had 1,000,000
rows, I would still expect your code snippet to only fetch up to 100
rows that satisfy your Where
condition, and then stop fetching rows.
To prove the point, try running the following little program:
static void Main(string[] args)
{
var list = PretendImAOneMillionRecordTable().Where(i => i < 500).Take(10).ToList();
}
private static IEnumerable<int> PretendImAOneMillionRecordTable()
{
for (int i = 0; i < 1000000; i++)
{
Console.WriteLine("fetching {0}", i);
yield return i;
}
}
... when I run it, I only get the following 10 lines of output:
fetching 0
fetching 1
fetching 2
fetching 3
fetching 4
fetching 5
fetching 6
fetching 7
fetching 8
fetching 9
It doesn't iterate through the whole set of 1,000,000
"rows" even though I am chaining Where()
and Take()
calls on IEnumerable<T>
.
Now, you do have to keep in mind that, for your little EF code snippet, if you test it using a very small table, it may actually fetch all the rows at once, if all the rows fit within the value for SqlConnection.PacketSize. This is normal. Every time SqlDataReader.Read()
is called, it never only fetches a single row at a time. To reduce the amount of network call roundtrips, it will always try to fetch a batch of rows at a time. I wonder if this is what you observed, and this mislead you into thinking that AsEnumerable()
was causing all rows to be fetched from the table.
Even though you will find that your example doesn't perform nearly as bad as you thought, this would not be a reason not to use IQueryable
. Using IQueryable
to construct more complex database queries will almost always provide better performance, because you can then benefit from database indexes, etc to fetch results more efficiently.