How does the Entity Framework retrieve rows using LINQ with a predicate function?

Question

I am new to using entity framework (core 2.0.1) and to my understanding LINQ expressions are mapped through the entity framework to vendor specific SQL query like

from actor in db.Actors
where actor.Name == "Madonna" 
|| actor.Name == "Bruce Lee"
select actor

is mapped to

SELECT *
FROM Actors
WHERE Name = 'Madonna' 
OR Name = 'Bruce Lee';

I have read somewhere quote

EF/Core translates Where query method call to WHERE clause in SQL, and translates the predicate expression tree (again, not predicate function) to the condition in WHERE clause.

Lets say I want to be case-insensitive when matching the actor names and allow spaces before and after the actor names. I would need to use regular expressions for this. I attempted to do the following

public static bool Contains1(Actor actor)
{
    bool result;
    ISet<string> RegexStrSet = new HashSet<string>(
           new String[]{
                @"Bruce\s+Lee",
                @"Madonna"
           }
           );

    result = false; //assumed
    for (IEnumerator<String> criteria = RegexStrSet.GetEnumerator(); criteria.MoveNext() && !result;)
    {
        if (Regex.Match(actor.Name, @"\s*" + criteria.Current + @"\s*", RegexOptions.IgnoreCase).Success)
        {
            result = true;
        }
    }
    return result;
}

static void Main(string[] args)
        {
            using (var db = new ActorDbContext())
            {
                Predicate<Actor> functPred1 = new Predicate<Actor>(Contains1);


                if ((from actor in db.Actors
                     where functPred1(actor)
                     select actor).Count() == 0
                   )
                {
                ...

This does work. I assume now the where predicate doesn't get mapped but is run when the rows of the table are returned so the Database SQL query is now

SELECT *
FROM Actors

My question is this, the entity framework has three options on how it will retrieve the rows. Does it

-Retrieve the entire table and throw a run out of memory runtime exception if the table is too big.

-Retrieve as much as it can handle and do it again until the entire table has been evaluated.

-Retrieve and evaluate one row at a time

If it attempts in one go to retrieve the entire table, what should I do so it doesn't crash.

EDIT: I copied and wired Dixin's Application side logging to see what underlining query being sent to the server for the following below. note: Only the most relevant log snippets were included.

foreach (Actor actor in (from actor in db.Actors
                                         where actor.Name == "Madonna"
                                         || actor.Name == "Bruce Lee"
                                         select actor)){ }

log output snippets:

2018-01-20T21:56:17.0846048+00:00 Information 20101 Microsoft.EntityFrameworkCore.Database.Command
Executed DbCommand (57ms) [Parameters=[], CommandType='Text', CommandTimeout='30']
SELECT [actor].[Id], [actor].[AcademyWinner], [actor].[Age], [actor].[Name]
FROM [Actors] AS [actor]
WHERE [actor].[Name] IN (N'Madonna', N'Bruce Lee')

and

foreach (Actor actor in (from actor in db.Actors
                      where functPred1(actor)
                      select actor)){ }

log output snippets:

2018-01-20T21:56:17.1878434+00:00 Warning 20500 Microsoft.EntityFrameworkCore.Query
The LINQ expression 'where Invoke(__functPred1_0, [actor])' could not be translated and will be evaluated locally.

2018-01-20T21:56:17.2269410+00:00 Information 20101 Microsoft.EntityFrameworkCore.Database.Command
Executed DbCommand (8ms) [Parameters=[], CommandType='Text', CommandTimeout='30']
SELECT [actor].[Id], [actor].[AcademyWinner], [actor].[Age], [actor].[Name]
FROM [Actors] AS [actor]

So it seems the entire table is retrieved from the database server. Wouldn't it have been smarter for the Entity Framework to only retrieve only a subset of the table at one time, evaluate the predicate function on the rows and continue until query has tested every row in the table to prevent run out of memory runtime exception and not hog RAM for other applications.

Entity framework uses IQueryable in normal scenarios which doesn't load the entire table in memory https://www.developerhandbook.com/entity-framework/in-the-spotlight-demystifying-iqueryable-entity-framework-6/. However, in your special case you can use the sql profiler to see what is the actual query hitting the server — Aman B, Jan 20 '18 at 19:54
EF will retrieve all rows matching the condition in one go. it doesn't have an internal query result queue and will only execute one query per IQueryable enumeration (except lazy loading) — DevilSuichiro, Jan 20 '18 at 20:26

Bae · Answer 1 · 2018-01-21T05:05:48.943

My question is this, the entity framework has three options on how it will retrieve the rows. Does it

-Retrieve the entire table and throw a run out of memory runtime exception if the table is too big.

-Retrieve as much as it can handle and do it again until the entire table has been evaluated.

-Retrieve and evaluate one row at a time

Based on your code under Main(), you do not retrieve the entire table. Just like how you translated the LINQ expression to SQL, that is what EF context is doing (filtering through the regex).
Well, it will retrieve all the rows you tried to query in one go.
It will not retrieve and evaluate one row at a time. It will grab all the records that match your where clause.

I don't think you will ever call EF context without some kind of filter/restriction, and of course the program will crash if your machine doesn't have the specs to run it. However, I don't think you will reach over 4GB of data in one table if you practice good entity relationship model.

Don't worry so much about fetching all rows in one table (as you will not find yourself doing this in the first place).

Here are some links to supplement the ideas about performance of EF:

Sorry, I don't see much relation between the question and your answers. The only thing that matters here is that ef-core auto-switches to client-side evaluation because the predicate isn't an `Expression` that can be translated into SQL. I.e.: all entities will be pulled from the database and then (when they're all there) they are filtered. — Gert Arnold, Jan 20 '18 at 21:37

How does the Entity Framework retrieve rows using LINQ with a predicate function?

1 Answers1